[BioC] [biocpkgs] suggestions on package matchprobes
lgautier at altern.org
Fri Sep 22 12:36:52 CEST 2006
> Thanks a lot!
>
> Maybe you can remind users that the matching is case-sensitive. For DNA
> sequences, people might tend to treat the lowercase and the uppercase
> the same.
> I was looking at the library "altcdfenvs" to create an alternative CDF
> environment. I do not see a straightforward connection between
> "altcdfenvs" and "Biostrings". Any suggestions?
You raise a good point here. I would have liked to use something generic
for DNA/RNA sequences, but "Biostrings" was in its infancy when "altcdfenvs"
was written... and now, because of my current occupation, further work on
the package is unlikely to happen soon.
Low-energy approaches would be either to write a function that transforms
a list of 'BStringViews' into a list such as the one returned by
'matchprobes' and feed it to 'buildCdfEnv.matchprobes' (as the vignette
and documentation for 'buildCdfEnv.matchprobes' indicate), or to modify
'buildCdfEnv.matchprobes' to accept a list of 'BStringViews' as input.
In the latter case you will mostly only have to work on the following
code:

  xy <- getxy.probeseq(probeseq=probe.tab, i.row=matches$match[[i]],
                       x.colname = x.colname, y.colname = y.colname)

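As a rough sketch of the first option, the wrapper below builds a list with
the same shape as the 'matchprobes' return value shown further down in this
thread (a '$match' element holding one vector of probe indices per target
sequence). The function name 'as_matchprobes_list' and its input
'hits_per_target' are assumptions made for illustration; extracting those
indices from the 'BStringViews' objects is not shown, since it depends on
the Biostrings version in use.

```r
# Sketch only, base R. 'hits_per_target' is a hypothetical intermediate:
# a list with one vector of probe-table row indices per target sequence,
# obtained by some other means from the 'BStringViews' objects.
as_matchprobes_list <- function(hits_per_target) {
  stopifnot(is.list(hits_per_target))
  match <- lapply(hits_per_target, function(idx) {
    idx <- as.numeric(idx)   # the output quoted in this thread is numeric
    idx[!is.na(idx)]         # drop missing indices rather than pass them on
  })
  # Same top-level layout as the quoted output: a list with a 'match' element
  list(match = unname(match))
}

# Two targets: the first hits probe rows 3 and 1, the second hits nothing
matches <- as_matchprobes_list(list(c(3L, 1L), integer(0)))
str(matches)
```

Each 'matches$match[[i]]' can then be used as 'i.row' in the
'getxy.probeseq' call quoted above. Whether 'buildCdfEnv.matchprobes' needs
anything beyond the '$match' element is not checked here.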
Hoping this helps,
Laurent
> BTW, my previous reply was held because this email address was not
> subscribed to this list. Now it should work.
>
> Best,
> Xinxia
>
> -----Original Message-----
> From: Wolfgang Huber [mailto:huber at ebi.ac.uk]
> Sent: Thursday, September 14, 2006 2:31 AM
> To: Robert Gentleman
> Cc: Xinxia Peng; Bioconductor
> Subject: Re: [BioC] [biocpkgs] suggestions on package matchprobes
>
> Hi Xinxia,
>
> thanks!
> 1. The problem with the cases was simple: the function 'matchprobes'
> calls C code to do the actual work, and it was:
>
> matchprobes <- function(query, records, probepos=FALSE)
>   .Call("MP_matchprobes", toupper(query), records, probepos,
>         PACKAGE="matchprobes")
>
> I removed the "toupper" in matchprobes_1.5.1, this should make you
> happier. There is no good reason why it should have been there, and that
> it was not documented was a bug. So now it is gone.
>
> 2. As Robert said, for generic sequence matching please use
> "Biostrings", that is much better. "matchprobes" only still exists for
> backward compatibility.
>
> Best wishes
> Wolfgang
>
>
> Robert Gentleman wrote:
>> Please ask these sorts of questions on the Bioconductor mailing list -
>> redirected there
>> and for generic sequence matching Biostrings is a better tool - we will
>> look into this, thanks Robert
>> Xinxia Peng wrote:
>>> =+=+=+=+=+=+=+=+=+ biocpkgs mailing list +=+=+=+=+=+=+=+=+=
>>> Dear Bioc Team,
>>> It appears that the function 'matchprobes' will not work with
>>> sequences in lower case. Also it might be nice not to match empty
>>> string. See the following example:
>>>> test.seq
>>> [1] "atggcggcgcaaagtagtggtgggggtggaggttgtggtgaggaagataaagatgccaaatatatgtttgataggatagggaaagaagtgcacgacgaag"
>>> [2] "atgaaaagggtaatgcaacaatttgtggatcgtacaacacaacgatttcacgaatatgatgaaaggatgaaaactacacgccaaaaatgtaaagaacgat"
>>> [3] "atgaaacttcactgctctaaaatattattatttttacttccattaaatatattagtaacatcattatcaaatgtgcataataataataaactatacaaca"
>>> [4] "atgaaagtccattatattaatatattattgtttgctcttccattaaatatattggaacataataaaaatgaaccacacaccacaccaaatcatacacaaa"
>>> [5] "atgtttacaacaaaaaaaaaaattaaatatattataattatatgtggcatctttcgaaaatatttcaaattcggaagaattattgaggttccaatgatgc"
>>> [6] "atgaaactgcactactctaatatattattatttttctttccattaaatatattagtaacatcatatcatgtatataataaaaataaaatatacatcacac"
>>> [7] "atgtgtgctattggagaattactatcatctacagataaggaatatactcttaatttctttggtttagttaaagatggagcatcgattgatgaaatgaaag"
>>> [8] "atgattaagatgaaattccattatgtaggatattattctgaagaagaaaatatgaaaaatacactgaaaatttgttccgttagacaaatatttttaaatt"
>>> [9] "atgttattatttgctttattatttaatgcacttttattatcacaaaatgtaaattgccgaaacaacaattataatataagattcactcaaacgataacac"
>>> [10] "atgatataccacagaaggattatagcttatctcataaatcatctaccattaggtatatcccttacagaagtggtcgatataaatgaagaacatatattta"
>>>> test.p
>>> [1] "atggcggcgcaaagtagtggtgggg"
>>>> matchprobes(test.seq, test.p)
>>> $match
>>> $match[[1]]
>>> numeric(0)
>>> $match[[2]]
>>> numeric(0)
>>> $match[[3]]
>>> numeric(0)
>>> $match[[4]]
>>> numeric(0)
>>> $match[[5]]
>>> numeric(0)
>>> $match[[6]]
>>> numeric(0)
>>> $match[[7]]
>>> numeric(0)
>>> $match[[8]]
>>> numeric(0)
>>> $match[[9]]
>>> numeric(0)
>>> $match[[10]]
>>> numeric(0)
>>>> matchprobes(toupper(test.seq), toupper(c(test.p, "")))
>>> $match
>>> $match[[1]]
>>> [1] 1 2
>>> $match[[2]]
>>> [1] 2
>>> $match[[3]]
>>> [1] 2
>>> $match[[4]]
>>> [1] 2
>>> $match[[5]]
>>> [1] 2
>>> $match[[6]]
>>> [1] 2
>>> $match[[7]]
>>> [1] 2
>>> $match[[8]]
>>> [1] 2
>>> $match[[9]]
>>> [1] 2
>>> $match[[10]]
>>> [1] 2
>>> Thanks,
>>> Xinxia Peng
>>> Seattle Biomedical Research Institute
>>> __________________________________________________________________
>>> biocpkgs mailing list
>>> To unsubscribe from this mailing list send a blank email to
>>> biocpkgs-leave at lists.fhcrc.org You can also unsubscribe or change your
>>> personal options at
>>> http://lists.fhcrc.org/mailman/listinfo/biocpkgs
>
>
> --
> ------------------------------------------------------------------
> Wolfgang Huber EBI/EMBL Cambridge UK http://www.ebi.ac.uk/huber
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>