[R] Help with Loops

Steve Lianoglou mailinglist.honeypot at gmail.com
Thu May 13 17:12:30 CEST 2010


Hi,

On Thu, May 13, 2010 at 10:49 AM, Amit Patel <amitrhelp at yahoo.co.uk> wrote:
> Hi
>
> I have made many attempts but can't get the loop right, as I am not a strong programmer. What I am basically trying to do is compare 2 spreadsheets. The problem is that one of them (TESTSAMP) contains only a portion of the overall data, whereas the other (FULLSAMP) has the full dataset. From the complete set I would like to remove the rows of data which are not in TESTSAMP. Column 1 contains the sample numbers, which can be used to identify samples. Does anyone have any suggestions?
>
> I have tried various things like double loops and so on, but I am sure there is an easier way or function to do this.
>
> I tried this method, but I'm not sure how to keep looping only until a match is found. I don't understand how repeat loops work in R.
>
> for (i in 1:length(FULLSAMP[,1])) {
>
> if (FULLSAMP[i,1] != TESTSAMP[i,1]) {
> FULLSAMP <- FULLSAMP[-i,]
> }

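You want to avoid for loops as much as possible. (Your loop also only
compares row i of FULLSAMP with row i of TESTSAMP, and deleting rows
inside the loop shifts the row indices as you go.)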

Imagine your samples are identified by letters, so FULLSAMP[,1] will
be the letters A..Z, and TESTSAMP[,1] will be some random 15 letters. Now
the job is to match the rows in TESTSAMP to the rows in FULLSAMP, and
remove any "extra" rows in FULLSAMP that don't appear in TESTSAMP.

## Making some data
R> fullsamp <- data.frame(id=LETTERS,
       something=sample(1:100, length(LETTERS)),
       stringsAsFactors=FALSE)
R> testsamp <- data.frame(id=sample(LETTERS, 15),
       something=sample(1:100, 15),
       stringsAsFactors=FALSE)

## Let's find where the "testsamp" rows appear in "fullsamp"
R> xref <- match(testsamp[,1], fullsamp[,1])

## Now reduce fullsamp to have only the data corresponding to testsamp
## (and in the same order as testsamp)
R> fullsamp.sub <- fullsamp[xref,]

Notice that fullsamp.sub now has only rows with IDs appearing in
testsamp and they are also in the same order as testsamp.
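
If you would rather keep the surviving rows in fullsamp's original
order, a logical filter with %in% is a common alternative to match.
This is only a minimal sketch: the ids and values below are invented
for illustration, and fullsamp.kept is a name made up here.

```r
# Made-up data for illustration only
fullsamp <- data.frame(id = c("A", "B", "C", "D", "E"),
                       something = c(10, 20, 30, 40, 50),
                       stringsAsFactors = FALSE)
testsamp <- data.frame(id = c("D", "B"),
                       something = c(1, 2),
                       stringsAsFactors = FALSE)

# TRUE for each row of fullsamp whose id also appears in testsamp
keep <- fullsamp$id %in% testsamp$id

# Rows come back in fullsamp's order (B, then D), not testsamp's
fullsamp.kept <- fullsamp[keep, ]
print(fullsamp.kept)
```

Use match when the result has to line up row for row with testsamp,
and %in% when you only need to filter.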

Now go ahead and read the help you'll find in ?match
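
One detail from that help page worth seeing in action: match returns
NA for any value it cannot find, so if testsamp might contain IDs that
are missing from fullsamp, you would probably want to drop the NAs
before indexing. A small sketch with made-up vectors (full_ids,
test_ids and xref.ok are hypothetical names):

```r
full_ids <- c("A", "B", "C", "D")
test_ids <- c("C", "Z", "A")   # "Z" is deliberately absent from full_ids

# Position of each test id within full_ids, or NA where there is no match
xref <- match(test_ids, full_ids)
print(xref)

# Drop the NAs before using xref as an index
xref.ok <- xref[!is.na(xref)]
print(full_ids[xref.ok])
```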

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


