[BioC] Is a number within a set of ranges?
James W. MacDonald
jmacdon at med.umich.edu
Mon Oct 29 21:44:55 CET 2007
In this case you don't gain much if anything by using apply(), which is
just a nice wrapper to a for() loop (and the bad rap that for loops have
in R isn't really applicable these days).
The real gain to be had is from vectorizing the comparison.
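A minimal sketch of what vectorizing the comparison could look like, assuming the `place` and `position` data frames from Artur's script quoted below. Using outer() and rowSums() is just one possible way to do it, and the intermediate names `above.start` and `below.stop` are made up for illustration:

```r
# Example data from the thread (names follow Artur's script).
place <- data.frame(start = c(1, 5, 13), stop = c(3, 9, 15))
position <- data.frame(gene = c(1, 2, 3, 4), position = c(14, 4, 10, 6))

# outer() compares every position against every range bound in one call,
# giving a genes-by-ranges logical matrix without an explicit loop.
above.start <- outer(position$position, place$start, ">=")
below.stop  <- outer(position$position, place$stop,  "<=")

# Keep a gene if it lies inside at least one range (any TRUE in its row).
final.presence <- rowSums(above.start & below.stop) > 0

position[final.presence, ]
```

For the example data this keeps genes 1 and 4. Note that outer() allocates a full genes-by-ranges matrix, so with very large tables memory use may become the limiting factor.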
Best,
Jim
Oleg Sklyar wrote:
> You would like to avoid loops here, especially nested loops: this is
> what apply, sapply etc are for. Using your syntax:
>
> final.presence = apply(position, 1, function(x) any(x[2]>=place$start &
> x[2]<=place$stop))
>
> --
> Dr Oleg Sklyar * EMBL-EBI, Cambridge CB10 1SD, UK * +441223494466
>
>
> On Mon, 2007-10-29 at 12:42 -0500, Artur Veloso wrote:
>> Hi Daniel,
>>
>> I'm very new to R and I'm far from a good programmer, but I think that this
>> small script should solve your problem. Well, at least for the example you
>> provided it worked. I hope it helps.
>>
>> Cheers,
>>
>> Artur
>>
>>> start <- c(1,5,13)
>>> stop <- c(3,9,15)
>>> place <- data.frame(start,stop)
>>>
>>> gene <- c(1,2,3,4)
>>> position <- c(14,4,10,6)
>>> position <- data.frame(gene,position)
>>>
>>> range <- list()
>>> for(a in 1:dim(place)[1])
>> + range[[a]] <- seq(place$start[a],place$stop[a])
>>> presence <- NULL
>>> final.presence <- NULL
>>> for(b in position$position)
>> + {
>> + for(c in 1:length(range))
>> + {
>> + presence <- c(presence,b%in%range[[c]])
>> + }
>> + final.presence <- c(final.presence,as.logical(sum(presence)))
>> + presence <- NULL
>> + }
>>> position[final.presence,]
>> gene position
>> 1 1 14
>> 4 4 6
>>
>>
>> On 10/29/07, Daniel Brewer <daniel.brewer at icr.ac.uk> wrote:
>>> I have a table with a start and stop column which defines a set of
>>> ranges. I have another table with a list of genes with associated
>>> position. What I would like to do is subset the gene table so it only
>>> contains genes whose position is within any of the ranges. What is the
>>> best way to do this? The only way I can think of is to construct a long
>>> list of conditions linked by ORs but I am sure there must be a better way.
>>>
>>> Simple example:
>>>
>>> Start Stop
>>> 1 3
>>> 5 9
>>> 13 15
>>>
>>> Gene Position
>>> 1 14
>>> 2 4
>>> 3 10
>>> 4 6
>>>
>>> I would like to get out:
>>> Gene Position
>>> 1 14
>>> 4 6
>>>
>>> Any ideas?
>>>
>>> Thanks
>>>
>>> Dan
>>>
>>> --
>>> **************************************************************
>>> Daniel Brewer, Ph.D.
>>> Institute of Cancer Research
>>> Email: daniel.brewer at icr.ac.uk
>>> **************************************************************
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623