[biomaRt] Query ERROR: caught BioMart::Exception::Usage: Attributes from multiple attribute pages...

created at 01-01-2022

error

Query ERROR: caught BioMart::Exception::Usage: Attributes from multiple attribute pages are not allowed

As the message says, the query requested attributes that live on more than one attribute page, which BioMart does not allow in a single query.

For example, take the exon with the ID ENSE00001706048 and look up the gene it belongs to:

library(biomaRt)

## Set up the database and data set
human <- useEnsembl(biomart = "genes", dataset = "hsapiens_gene_ensembl", mirror = "asia")

results <- getBM(
     attributes= c("ensembl_gene_id", "external_gene_name", "ensembl_exon_id"),
     filters=c("ensembl_exon_id"),
     values="ENSE00001706048", mart=human)
> results
  ensembl_gene_id external_gene_name ensembl_exon_id
1 ENSG00000188554               NBR1 ENSE00001706048

To also get the start and end positions of the exon, add two more attributes:

results <- getBM(
    attributes= c("ensembl_gene_id", "external_gene_name","ensembl_exon_id", 
                  "exon_chrom_start", "exon_chrom_end"),
    filters=c("ensembl_exon_id"),
    values="ENSE00001706048", mart=human)

This also returns the expected result:

> results
  ensembl_gene_id external_gene_name ensembl_exon_id exon_chrom_start exon_chrom_end
1 ENSG00000188554               NBR1 ENSE00001706048         43200167       43200608

Next, to find out which GO terms are annotated to the gene, try adding the go_id attribute:

results <- getBM(
    attributes= c("ensembl_gene_id", "external_gene_name", "ensembl_exon_id", 
                  "exon_chrom_start", "exon_chrom_end", "go_id"),
    filters=c("ensembl_exon_id"),
    values="ENSE00001706048", mart=human)

Unfortunately, this query fails with an error:

Error in .processResults(postRes, mart = mart, hostURLsep = sep, fullXmlQuery = fullXmlQuery,  : 
  Query ERROR: caught BioMart::Exception::Usage: Attributes from multiple attribute pages are not allowed

Let's check which pages the requested attributes are on:

e_attrs <- c("ensembl_gene_id", "external_gene_name", "ensembl_exon_id",  "exon_chrom_start", "exon_chrom_end", "go_id")

## Fetch the attribute table once, then keep only the rows for our attributes
all_attrs <- listAttributes(human)
all_attrs[all_attrs$name %in% e_attrs, ]


ensembl_gene_id, external_gene_name, ensembl_exon_id, exon_chrom_start and exon_chrom_end are all listed on the structure page. go_id is listed on feature_page, but exon_chrom_start and exon_chrom_end are not.

So, just as the error says, the query mixes attributes from multiple pages: exon_chrom_start and exon_chrom_end (structure) cannot be requested together with go_id (feature_page).
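
The page check above can also be done programmatically. The following is only a minimal sketch: `pages_for` and `common_pages` are hypothetical helpers (not part of biomaRt), and `mock_attrs` is a small hand-made stand-in for the table returned by `listAttributes(human)`. Its page assignments are an assumption inferred from which queries in this post succeed and which fail, so the sketch runs without a network connection.

```r
# Hypothetical helper: for each requested attribute, list the pages it is on.
# `attr_table` needs `name` and `page` columns, like listAttributes() output.
pages_for <- function(attr_table, attrs) {
  hits <- attr_table[attr_table$name %in% attrs, c("name", "page")]
  split(hits$page, factor(hits$name, levels = attrs))
}

# Hypothetical helper: pages on which ALL requested attributes are available.
# An empty result means the attributes cannot go into one query.
common_pages <- function(attr_table, attrs) {
  Reduce(intersect, pages_for(attr_table, attrs))
}

# Hand-made stand-in for listAttributes(human); NOT real query output.
mock_attrs <- data.frame(
  name = c("ensembl_gene_id", "ensembl_gene_id",
           "external_gene_name", "external_gene_name",
           "ensembl_exon_id", "ensembl_exon_id",
           "exon_chrom_start", "exon_chrom_end", "go_id"),
  page = c("feature_page", "structure",
           "feature_page", "structure",
           "feature_page", "structure",
           "structure", "structure", "feature_page"),
  stringsAsFactors = FALSE
)

exon_attrs <- c("ensembl_gene_id", "external_gene_name", "ensembl_exon_id",
                "exon_chrom_start", "exon_chrom_end")

ok_pages  <- common_pages(mock_attrs, exon_attrs)              # one shared page
bad_pages <- common_pages(mock_attrs, c(exon_attrs, "go_id"))  # no shared page
```

With a live connection, the same helpers should work on the real table by passing `listAttributes(human)` instead of `mock_attrs`. biomaRt also provides `attributePages(human)` to list the page names and a `page` argument to `listAttributes()` to show a single page.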

Solution

Split the query into two, one per attribute page, then merge the results.

results1 <- getBM(
    attributes= c("ensembl_gene_id", "external_gene_name", "ensembl_exon_id", 
                  "exon_chrom_start", "exon_chrom_end"),
    filters=c("ensembl_exon_id"),
    values="ENSE00001706048", mart=human)

results2 <- getBM(
  attributes= c("ensembl_gene_id", "external_gene_name", "ensembl_exon_id", "go_id"),
  filters=c("ensembl_exon_id"),
  values="ENSE00001706048", mart=human)

merge(results1, results2)
> merge(results1, results2)
   ensembl_gene_id external_gene_name ensembl_exon_id exon_chrom_start exon_chrom_end      go_id
1  ENSG00000188554               NBR1 ENSE00001706048         43200167       43200608 GO:0008270
2  ENSG00000188554               NBR1 ENSE00001706048         43200167       43200608 GO:0005515
3  ENSG00000188554               NBR1 ENSE00001706048         43200167       43200608 GO:0043130
4  ENSG00000188554               NBR1 ENSE00001706048         43200167       43200608 GO:0000407
5  ENSG00000188554               NBR1 ENSE00001706048         43200167       43200608 GO:0016236
...........
23 ENSG00000188554               NBR1 ENSE00001706048         43200167       43200608 GO:0051019
24 ENSG00000188554               NBR1 ENSE00001706048         43200167       43200608 GO:0032872
25 ENSG00000188554               NBR1 ENSE00001706048         43200167       43200608 GO:0005758
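
Why one exon row turns into 25 rows: called without `by`, `merge()` joins on every column the two data frames share (here the gene ID, gene name and exon ID), and each GO term row picks up the matching exon coordinates. The sketch below reproduces this on hand-made data frames. `exon_part` and `go_part` are illustrative stand-ins with values copied from the output above (only three GO terms), and the join columns are named explicitly so the join does not depend on which columns happen to overlap.

```r
# Hand-made stand-ins for the two query results (values copied from above).
exon_part <- data.frame(
  ensembl_gene_id    = "ENSG00000188554",
  external_gene_name = "NBR1",
  ensembl_exon_id    = "ENSE00001706048",
  exon_chrom_start   = 43200167,
  exon_chrom_end     = 43200608,
  stringsAsFactors   = FALSE
)
go_part <- data.frame(
  ensembl_gene_id    = "ENSG00000188554",
  external_gene_name = "NBR1",
  ensembl_exon_id    = "ENSE00001706048",
  go_id              = c("GO:0008270", "GO:0005515", "GO:0043130"),
  stringsAsFactors   = FALSE
)

# Name the join columns explicitly instead of relying on the default.
key_cols <- c("ensembl_gene_id", "external_gene_name", "ensembl_exon_id")
merged <- merge(exon_part, go_part, by = key_cols)

# One row per GO term, with the exon coordinates repeated on each row.
nrow(merged)
```

By default `merge()` keeps only keys present in both inputs. If some exons might have no GO annotation at all, `all.x = TRUE` would keep them with `NA` in `go_id`.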

other

The listAttributes function lists the attributes a query can return, and listFilters lists the filters that can be used to restrict a query.

> ensembl <- useEnsembl(biomart = "genes", dataset = "hsapiens_gene_ensembl", mirror = "asia")
> listAttributes(ensembl)
                           name                  description         page
1               ensembl_gene_id               Gene stable ID feature_page
2       ensembl_gene_id_version       Gene stable ID version feature_page
3         ensembl_transcript_id         Transcript stable ID feature_page
4 ensembl_transcript_id_version Transcript stable ID version feature_page
5            ensembl_peptide_id            Protein stable ID feature_page
6    ensembl_peptide_id_version    Protein stable ID version feature_page
..........
..........


> listFilters(ensembl)
                name                            description
1    chromosome_name               Chromosome/scaffold name
2              start                                  Start
3                end                                    End
4             strand                                 Strand
5 chromosomal_region e.g. 1:100:10000:-1, 1:100000:200000:1
......
.....
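
Both tables are long, so it helps to narrow them down by keyword. The sketch below uses plain `grepl()` on the `name` and `description` columns. `find_rows` is a hypothetical helper, and `mock_filters` is a hand-made copy of the five filter rows shown above so the example runs offline. biomaRt also provides `searchAttributes()` and `searchFilters()` for the same purpose.

```r
# Hypothetical helper: keep rows whose name or description matches a pattern.
# Works on any data frame with `name` and `description` columns.
find_rows <- function(tbl, pattern) {
  keep <- grepl(pattern, tbl$name, ignore.case = TRUE) |
    grepl(pattern, tbl$description, ignore.case = TRUE)
  tbl[keep, , drop = FALSE]
}

# Hand-made copy of the first filter rows shown above.
mock_filters <- data.frame(
  name = c("chromosome_name", "start", "end", "strand", "chromosomal_region"),
  description = c("Chromosome/scaffold name", "Start", "End", "Strand",
                  "e.g. 1:100:10000:-1, 1:100000:200000:1"),
  stringsAsFactors = FALSE
)

chrom_filters <- find_rows(mock_filters, "chrom")
chrom_filters$name
```

With a live connection, `find_rows(listAttributes(ensembl), "exon")` would be one way to locate the exon attributes used earlier.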

And one nice thing about biomaRt: it tells me when my request has timed out...

Error in curl::curl_fetch_memory(url, handle = handle) : 
  Timeout was reached: [asia.ensembl.org:443] Connection timed out after 10001 milliseconds
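
One way to cope with a mirror that times out is to try the others in turn. This is a rough sketch, not an official biomaRt feature: `try_mirrors` is a hypothetical helper, the mirror order is an arbitrary choice, and the `connect` argument exists only so the retry logic can be exercised without a network connection.

```r
# Hypothetical helper: try each Ensembl mirror until one connects.
# `connect` is called with a mirror name and must return a mart or raise an error.
try_mirrors <- function(mirrors = c("asia", "useast", "www"),
                        connect = function(m) {
                          biomaRt::useEnsembl(biomart = "genes",
                                              dataset = "hsapiens_gene_ensembl",
                                              mirror = m)
                        }) {
  for (m in mirrors) {
    mart <- tryCatch(
      connect(m),
      error = function(e) {
        # Report the failure and fall through to the next mirror.
        message("mirror '", m, "' failed: ", conditionMessage(e))
        NULL
      }
    )
    if (!is.null(mart)) return(mart)
  }
  stop("all mirrors failed: ", paste(mirrors, collapse = ", "))
}

# Offline demonstration with a fake connector that fails for "asia".
fake_connect <- function(m) {
  if (m == "asia") stop("Connection timed out")
  paste0("mart:", m)
}
demo_mart <- try_mirrors(connect = fake_connect)
```

In real use, `human <- try_mirrors()` would replace the `useEnsembl()` call at the top of this post. The same `tryCatch()` pattern could be wrapped around `getBM()` if the timeout happens during the query instead of the connection.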