[BioC] Is a number within a set of ranges?
Oleg Sklyar
osklyar at ebi.ac.uk
Mon Oct 29 22:01:05 CET 2007
It's about both, and in fact after scrolling down I noticed that we came
up with exactly the same solution :)
-
Dr Oleg Sklyar * EMBL-EBI, Cambridge CB10 1SD, UK * +441223494466
On Mon, 2007-10-29 at 16:44 -0400, James W. MacDonald wrote:
> In this case you don't gain much if anything by using apply(), which is
> just a nice wrapper to a for() loop (and the bad rap that for loops have
> in R isn't really applicable these days).
>
> The real gain to be had is from vectorizing the comparison.
>
> Best,
>
> Jim
>
>
>
> Oleg Sklyar wrote:
> > You would like to avoid loops here, especially nested loops: this is
> > what apply, sapply etc are for. Using your syntax:
> >
> > final.presence = apply(position, 1, function(x) any(x[2]>=place$start &
> > x[2]<=place$stop))
> >
> > -
> > Dr Oleg Sklyar * EMBL-EBI, Cambridge CB10 1SD, UK * +441223494466
> >
> >
> > On Mon, 2007-10-29 at 12:42 -0500, Artur Veloso wrote:
> >> Hi Daniel,
> >>
> >> I'm very new to R and I'm far from a good programmer, but I think that this
> >> small script should solve your problem. Well, at least for the example you
> >> provided it worked. I hope it helps.
> >>
> >> Cheers,
> >>
> >> Artur
> >>
> >>> start <- c(1,5,13)
> >>> stop <- c(3,9,15)
> >>> place <- data.frame(start,stop)
> >>>
> >>> gene <- c(1,2,3,4)
> >>> position <- c(14,4,10,6)
> >>> position <- data.frame(gene,position)
> >>>
> >>> range <- list()
> >>> for(a in 1:dim(place)[1])
> >> + range[[a]] <- seq(place$start[a],place$stop[a])
> >>> presence <- NULL
> >>> final.presence <- NULL
> >>> for(b in position$position)
> >> + {
> >> + for(c in 1:length(range))
> >> + {
> >> + presence <- c(presence,b%in%range[[c]])
> >> + }
> >> + final.presence <- c(final.presence,as.logical(sum(presence)))
> >> + presence <- NULL
> >> + }
> >>> position[final.presence,]
> >>   gene position
> >> 1    1       14
> >> 4    4        6
> >>
> >>
> >> On 10/29/07, Daniel Brewer <daniel.brewer at icr.ac.uk> wrote:
> >>> I have a table with a start and stop column which defines a set of
> >>> ranges. I have another table with a list of genes with associated
> >>> position. What I would like to do is subset the gene table so it only
> >>> contains genes whose position is within any of the ranges. What is the
> >>> best way to do this? The only way I can think of is to construct a long
> >>> list of conditions linked by ORs but I am sure there must be a better way.
> >>>
> >>> Simple example:
> >>>
> >>> Start Stop
> >>> 1 3
> >>> 5 9
> >>> 13 15
> >>>
> >>> Gene Position
> >>> 1 14
> >>> 2 4
> >>> 3 10
> >>> 4 6
> >>>
> >>> I would like to get out:
> >>> Gene Position
> >>> 1 14
> >>> 4 6
> >>>
> >>> Any ideas?
> >>>
> >>> Thanks
> >>>
> >>> Dan
> >>>
> >>> --
> >>> **************************************************************
> >>> Daniel Brewer, Ph.D.
> >>> Institute of Cancer Research
> >>> Email: daniel.brewer at icr.ac.uk
> >>> **************************************************************
> >>>
> >> _______________________________________________
> >> Bioconductor mailing list
> >> Bioconductor at stat.math.ethz.ch
> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> >
>
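For readers of the archive: the vectorized comparison Jim mentions could look roughly like the sketch below. It assumes the example data and column names from Artur's script (start, stop, position); the name `inside` is made up for illustration, and this is only one way to do it, not necessarily what any of the posters had in mind.

```r
# Example data from the thread (column names as in Artur's script).
place <- data.frame(start = c(1, 5, 13), stop = c(3, 9, 15))
position <- data.frame(gene = c(1, 2, 3, 4), position = c(14, 4, 10, 6))

# Logical matrix with one row per gene and one column per range:
# TRUE where start <= position <= stop for that gene/range pair.
inside <- outer(position$position, place$start, ">=") &
          outer(position$position, place$stop, "<=")

# Keep a gene if its position falls inside at least one range.
final.presence <- rowSums(inside) > 0

position[final.presence, ]
```

Note that outer() builds a genes-by-ranges matrix, so memory use grows with the product of the two table sizes; for very large tables a different approach may be preferable.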