[BioC] BiomaRt error: ncol(result) == length(attributes) is not TRUE

Wolfgang Huber huber at ebi.ac.uk
Thu Mar 27 19:35:51 CET 2008


Dear Quin,

In general it is sufficient to send the same mail only once to this
list. There is no added benefit in looping over the send button, and
doing so might indeed earn you bad karma from all the people who have
to clean their mailboxes.

The manual page of the "getGene" function says

Usage:
     getGene( id, type, mart)

Arguments:
      id: *vector* of gene identifiers one wants to annotate

I am not sure how this could be more clear. R is a well-developed and
powerful language and many people have benefited from reading
introductions such as this one (on CRAN):
http://www.stats.bris.ac.uk/R/doc/manuals/R-intro.html
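
To make the vector point concrete, here is a minimal sketch of querying
in batches rather than one identifier per loop iteration. The helper
names chunk_ids and query_in_batches are hypothetical (they are not part
of biomaRt), and the batch size of 500 in the usage note is an arbitrary
assumption; this thread does not state a recommended size. The query
function is passed in as an argument, so the helpers themselves do not
depend on the web service.

```r
# Split a vector of identifiers into batches of at most `size` elements.
# (Hypothetical helper, not part of biomaRt.)
chunk_ids <- function(ids, size) {
  split(ids, ceiling(seq_along(ids) / size))
}

# Call `query_fun` once per batch and row-bind the resulting data frames.
# `query_fun` is assumed to take a vector of identifiers and return a
# data frame with the same columns on every call.
query_in_batches <- function(ids, size, query_fun) {
  batches <- chunk_ids(ids, size)
  results <- lapply(batches, query_fun)
  do.call(rbind, unname(results))
}

# Usage with biomaRt (needs network access, so left commented out):
# library(biomaRt)
# ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# genes <- query_in_batches(ID, 500, function(batch) {
#   getGene(id = batch, type = "refseq_dna", mart = ensembl)
# })
```

If the list of identifiers is short, a single call with the whole
vector, getGene(id=ID, type="refseq_dna", mart=ensembl), does the same
job without any helper.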

-- 
Best wishes
 Wolfgang

------------------------------------------------------------------
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber



On 27/03/2008 18:19, Quin Wills wrote:
> Thank you Steffen for the really quick reply.
> 
> Out of interest, I tried using Sys.sleep() but I still get the same 
> problem. www.ensembl.org is also running quite slowly this side for 
> queries, and I wonder if that might be somehow related.
> 
> Sorry if this is in the manual - I didn't spot it. How many queries in a 
> batch do you think one should avoid going over? I've a fairly long list 
> of identifiers.
> 
> Quin
> 
> 
> Steffen wrote:
>> Hi Quin,
>>
>> How long is your list of identifiers?  It is not recommended to run a 
>> query like this in loops as this causes the web service to go out of 
>> sync at some point during the loop.
>> biomaRt is made to perform batch queries.
>> I would recommend to do your query as follows:
>>
>> genes <- getGene(id=ID, type="refseq_dna", mart=ensembl)
>>
>> This will give you a data frame with the info for all the genes. If 
>> needed you can then loop over the result.
>> If you feel you really need to loop, you could add Sys.sleep(1) 
>> inside the loop.
>>
>> Cheers,
>> Steffen
>>
>> Quin Wills wrote:
>>> Hello all
>>>
>>> I'm running the most up to date R and biomaRt.
>>>
>>> I get the following error:
>>>  >Error: ncol(result) == length(attributes) is not TRUE
>>>
>>> for the following loop:
>>> # 'ID' is a character vector of refseq IDs
>>> # 'gene', for the purposes of the argument here, is a list storing
>>> # the output
>>>
>>>  > ensembl <- useMart("ensembl", dataset="hsapiens_gene_ensembl")
>>>  > for (i in 1:length(ID)) {
>>>  >       gene[[i]] <- getGene(id=ID[i], type="refseq_dna", mart=ensembl)
>>>  > }
>>>
>>> The problem is not dependent on the get function used or the id type 
>>> used. I didn't have this problem yesterday on the same script. The 
>>> error also occurs randomly, breaking the loop at any particular 
>>> point, sometimes allowing thousands of loops to run.
>>>
>>> Could this be a problem with the server I'm pulling the information 
>>> from? It just seems too random to be my coding - especially 
>>> considering I didn't have this problem yesterday.
>>>
>>> I've had this before, ages ago, and would like to get to the bottom 
>>> of it. Any wisdom? Thanks.
>>>
>>> Quin Wills
>>> DPhil candidate
>>> Department of Statistics
>>> University of Oxford
>>> 1 South Parks Road
>>> Oxford OX1 3TG
>>> United Kingdom
>>> 01865 285 394
>>>


