Thanks for the code example, Wolfgang.

The stochasticity suggests the problem is on the BioMart server side. I'll
contact the BioMart team to see if they can look into it.
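
In the meantime, since the failure seems intermittent, one possible client-side workaround is simply to retry a failed query. The following is only a sketch under that assumption: 'with_retry' is a hypothetical helper name (not part of biomaRt), and the demonstration uses a stand-in function that fails twice, so it runs without network access.

```r
# Hypothetical helper (not part of biomaRt): call f() up to max_tries times,
# pausing 'wait' seconds between failed attempts.
with_retry <- function(f, max_tries = 3, wait = 1) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(f(), error = function(e) e)
    # Anything that is not a caught error is returned as the result
    if (!inherits(result, "error")) return(result)
    message("Attempt ", attempt, " failed: ", conditionMessage(result))
    if (attempt < max_tries) Sys.sleep(wait)
  }
  # All attempts failed: re-signal the last error
  stop(result)
}

# Stand-in for an intermittently failing query: fails twice, then succeeds
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("Filter upstream_flank NOT FOUND")
  "ok"
}
res <- with_retry(flaky, max_tries = 5, wait = 0)

# With biomaRt loaded, and 'human' and 'bmres' defined as in Wolfgang's
# example below, the vectorised query could be wrapped like this:
# sequence <- with_retry(function() getSequence(
#   id = bmres[, "ensembl_transcript_id"], type = "ensembl_transcript_id",
#   seqType = "transcript_flank", upstream = 3000, mart = human))
```

Retrying only masks the symptom, so it should be combined with a single vectorised query rather than a loop, to avoid putting extra load on the server.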

Regards,
Steffen

On Tue, Aug 7, 2012 at 2:08 AM, Wolfgang Huber <whuber@embl.de> wrote:

> Dear Steffen / List,
> below is a more compact code example that reproduces Tom's problem. I am
> rather confused by the fact that the problem seemed to occur stochastically!
>
> -------------------
> library(biomaRt)
> options(error=recover)
> ensembl = useMart("ensembl")
>
> human = useDataset("hsapiens_gene_ensembl",mart=ensembl)
> attr = c('ensembl_gene_id','ensembl_transcript_id',
>        'external_gene_id','chromosome_name','strand',
>        'transcript_start')
> bmres = getBM(attr, 'biotype', values = 'protein_coding', human)
>
> for(id in bmres[,"ensembl_transcript_id"]){
>  sequence = getSequence(id=id, type='ensembl_transcript_id',
>                        seqType='transcript_flank', upstream = 3000,
>                        mart = human)
>  sl = with(sequence, nchar(as.character(transcript_flank)))
>  cat(id, sl, "\n")
> }
> -------------------
>
> On running this once, I got
> ...(lots of lines)
> ENST00000520540 3000
> ENST00000519310 3000
> ENST00000442920 3000
>
> Error in getBM(c(seqType, type), filters = c(type, "upstream_flank"),  :
>   Query ERROR: caught BioMart::Exception::Usage: Filter upstream_flank NOT
> FOUND
>
> The next time, the same error already occurred in the very first iteration
> of the for-loop, for id="ENST00000539570". The next time, in the third
> iteration for id="ENST00000510508".
>
> Any idea what is going on here?
>
>
> Further comments:
> - for *Steffen*: The documentation and the code of 'getSequence' do not
> seem to match each other (e.g. the description of argument 'seqType').
> MySQL mode is mentioned but, as far as I understand, is not supported any
> more. Perhaps some maintenance would be helpful to users.
> - for *Tom*: Making these queries (such as getSequence) within a for-loop
> is bad practice, since it needlessly clogs the network and the BioMart
> webservers. Please use R's vector-capabilities, e.g.
>
> ------------------------
> sequence = getSequence(id=bmres[,"ensembl_transcript_id"],
>   type='ensembl_transcript_id', seqType='transcript_flank',
>   upstream = 3000, mart = human)
> sl = with(sequence, nchar(as.character(transcript_flank)))
> -------------------------
>
> Best wishes
>         Wolfgang
>
>
> Tom Hait wrote on 08/06/2012 12:37 PM:
>
>  Hello,
>>
>> I'm a student in bioinformatics in Tel Aviv University.
>> I'm working with your biomaRt API in order to automate the download of
>> FASTA sequences.
>> I ran into a problem; here is my code:
>>
>> #load the biomaRt library
>> library(biomaRt)
>> #connect to the Ensembl BioMart
>> ensembl = useMart("ensembl")
>> #open data set of human
>> human = useDataset("hsapiens_gene_ensembl",mart=ensembl)
>> #select the attributes that we want from the data set
>> attr<-c('ensembl_gene_id','ensembl_transcript_id',
>> 'external_gene_id','chromosome_name','strand','transcript_start')
>> #downloading the map between transcript id and transcript name
>> tmpgene<-getBM(attr, 'biotype', values = 'protein_coding', human)
>> #save in a TSV format (the file is saved in txt)
>> write.table(tmpgene,"Z:/tomhait/organisms/human/transcript_names.txt",
>> row.names=FALSE, quote=FALSE)
>> #collect all sequences with upstream flank 3000 bases based on the second
>> #column (ensembl_transcript_id) of tmpgene
>> i<-1
>> for(id1 in tmpgene[,2]){
>>   #retrieve sequence
>>   sequence<-getSequence(id=id1, type='ensembl_transcript_id',
>>     seqType='transcript_flank', upstream = 3000, mart = human)
>>   #check if sequence was retrieved
>>   sLengths <- with(sequence, nchar(as.character(transcript_flank)))
>>
>> #writing to a new file in "Z:/tomhait/organisms/human/mart_export_new.txt"
>> #you can change it to "mart_export_new.txt" and it will create a new file
>> #in R directory
>>   if(length(sLengths) > 0){
>>    x<-sequence[,1]
>>    y<-strsplit(gsub("([[:alnum:]]{60})", "\\1 ", x), " ")[[1]]
>>    title<-paste(paste(">",tmpgene[i,1],sep=""),tmpgene[i,2],tmpgene[i,3],
>>      tmpgene[i,4],tmpgene[i,5],tmpgene[i,6],sep="|")
>>    write(title,file="Z:/tomhait/organisms/human/mart_export_new.txt",
>>      ncolumns = 1, append=TRUE,sep="")
>>    write(y,file="Z:/tomhait/organisms/human/mart_export_new.txt",
>>      ncolumns = 1, append=TRUE,sep="\n")
>>    write("\n",file="Z:/tomhait/organisms/human/mart_export_new.txt",
>>      ncolumns = 1, append=TRUE,sep="\n")
>>   }
>>   i<-i+1
>> }
>>
>> I got the message:
>> Error in getBM(c(seqType, type), filters = c(type, "upstream_flank"),  :
>>    Query ERROR: caught BioMart::Exception::Usage: Filter upstream_flank
>> NOT
>> FOUND
>>
>> Could you please help me to solve this problem?
>>
>> Best Regards,
>>
>> Tom Hait.
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>
> --
> Best wishes
>         Wolfgang
>
> Wolfgang Huber
> EMBL
> http://www.embl.de/research/units/genome_biology/huber
>
>


