[BioC] grep GOID, R problem or my problem?

fhong@salk.edu fhong at salk.edu
Thu Nov 10 19:01:06 CET 2005


Hi Seth,

Thank you very much for your reply. I changed my program a little to
avoid grep, and that seems to have solved the problem; the computation time
is also much shorter. However, I still don't know what was going on
with grep. Maybe "value=TRUE" is not appropriate here?
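
To make the "value=TRUE" question concrete, here is a small sketch on toy
IDs (made up for illustration, not taken from ath1121501 or GO). It only
shows how grep treats its pattern; it does not by itself explain the
changing counts.

```r
# Toy GO IDs, invented for this illustration
ids <- c("GO:0006350", "GO:0006351", "GO:0045449")

# Default (value=FALSE): grep returns the positions of the matches
pos <- grep("GO:0006350", ids)

# value=TRUE: grep returns the matching strings themselves
vals <- grep("GO:0006350", ids, value = TRUE)

# The pattern is a regular expression, so a shorter pattern
# matches every ID that contains it as a substring
partial <- grep("GO:000635", ids, value = TRUE)

# fixed=TRUE switches off regex interpretation, but it is
# still a substring search, not an exact comparison
partial.fixed <- grep("GO:000635", ids, value = TRUE, fixed = TRUE)

# %in% compares whole strings, which is an exact match
exact <- ids %in% "GO:0006350"
```

So value=TRUE only changes what is returned (strings instead of positions),
not which elements match.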

I have attached the sample script (the version that still uses "grep"). I ran
the last block of the script several times, and the last command gives me
different numbers: sometimes 2490, sometimes 2492. Any idea why?

I appreciate your help, and I am sorry if my script is still too long.

Fangxin



########################################################
library(ath1121501)
library(GO)
library(annotate)

GOTerm2Tag=function(term)
{
  GTL=eapply(GOTERM,function(x) {
     grep(term,x@Term,value=TRUE)
  })
  Glen=sapply(GTL,length)
  names(GTL[Glen>0])
}
GO.trans=GOTerm2Tag("transcription")
##all GOID with "transcription" in its GO term

##AffyID.Ath1 is the affyID of all genes on ATH1 array
GO.Ath1=mget(AffyID.Ath1,ath1121501GO)
GOID.Ath1=sapply(GO.Ath1, function(x) {
    onts=sapply(x, function(z) z$GOID)
    })

GOID2Gene=function(GOID,GOID.Ath1)
{
  GOID.finder=sapply(GOID.Ath1,function(x) {
    grep(GOID,unlist(x),value=TRUE)
  })
  GOID.finder.len=sapply(GOID.finder,length)
  #names(which(GOID.finder.len>0))
  temp=which(GOID.finder.len>0)  ##output index
  names(temp)=names(which(GOID.finder.len>0))
  temp
}

###################################################
index.out=c()
for ( i in GO.trans)
{
  index.out=c(index.out,GOID2Gene(i,GOID.Ath1))
}
index.out.unique=unique(index.out)

length(index.out.unique)
#########################################################
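
For comparison, a minimal sketch of a grep-free lookup using %in%, along the
lines Seth suggested. This is not necessarily the exact change I made; the
list GOID.toy, its probe names, and the vector targets are hypothetical
stand-ins for GOID.Ath1 and GO.trans.

```r
# Hypothetical stand-in for GOID.Ath1: one vector of GO IDs per probe set
GOID.toy <- list(
  probe_a = c("GO:0006350", "GO:0003700"),
  probe_b = c("GO:0005634"),
  probe_c = c("GO:0045449", "GO:0006350"),
  probe_d = NA                      # probe set without GO annotation
)

# Hypothetical stand-in for GO.trans: the GO IDs of interest
targets <- c("GO:0006350", "GO:0045449")

# TRUE for every probe set annotated with at least one target ID;
# %in% compares whole strings, so no regular expression is involved
hit <- sapply(GOID.toy, function(x) any(unlist(x) %in% targets))

# Named index vector of the matching probe sets
index.toy <- which(hit)
```

All target IDs are tested in a single pass over the list, so the loop over
GO.trans is not needed, and each probe set appears at most once in the
result.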






> Hi Fangxin,
>
> On  9 Nov 2005, fhong at salk.edu wrote:
>
>> Hi list, Anyone tried to "grep" a target GOID from a list of GOIDs?
>> I found that it made random mistakes, meaning it might grep a GOID
>> which is not the target one by random.
>
> The random mistakes part seems... unlikely.  I ran the code you
> pasted, but it didn't demonstrate the issue very clearly.  Do you
> think you can simplify a bit and send a runnable example that will
> show us what you are seeing?
>
> Also, if you have a list of GOIDs, I wonder if match() or %in% might be a
> better choice than grep.  Or perhaps grep's fixed=TRUE option.  It
> seems you don't really want a regular expression, but an exact match.
>
>
> Best,
>
> + seth


--------------------
Fangxin Hong  Ph.D.
Plant Biology Laboratory
The Salk Institute
10010 N. Torrey Pines Rd.
La Jolla, CA 92037
E-mail: fhong at salk.edu
(Phone): 858-453-4100 ext 1105


