[R] problem subsetting data frame with variable instead of constant

Sarah Goslee sarah.goslee at gmail.com
Fri Feb 10 17:28:09 CET 2012


This is likely a floating-point representation issue, as in R FAQ 7.31:
the result of maf.adj - 0.01 need not be exactly equal to the constant 0.2.

?"==" suggests that using identical and all.equal is a better strategy:

     x1 <- 0.5 - 0.3
     x2 <- 0.3 - 0.1
     x1 == x2                           # FALSE on most machines
     identical(all.equal(x1, x2), TRUE) # TRUE everywhere
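
A minimal sketch of how the same idea could be applied to the subset()
call in the question. The data frame below is only a stand-in built from
the three rows quoted in the original message, `near` is a hypothetical
helper (not a base R function), and the 1e-8 tolerance is an arbitrary
assumption that should be chosen to suit the data:

```r
# Stand-in for the poster's table: only the three rows shown in the question.
ea.cad.pwr <- data.frame(MAF   = c(0.2, 0.2, 0.2),
                         OR    = c(0.58, 1.20, 1.22),
                         POWER = c(0.9996, 0.3092, 0.3914))

maf.adj <- 0.21
maf1 <- maf.adj - 0.01   # close to 0.2, but not the same double as 0.2
odds <- 1.2

# Hypothetical helper: vectorised "equal within a tolerance".
# The default tolerance is an assumption, not a recommended value.
near <- function(a, b, tol = 1e-8) abs(a - b) < tol

# Exact comparison: finds no rows, as in the question.
exact <- subset(ea.cad.pwr, MAF == maf1 & OR == odds)

# Tolerant comparison: finds the row with MAF 0.2 and OR 1.20.
fuzzy <- subset(ea.cad.pwr, near(MAF, maf1) & near(OR, odds))

nrow(exact)
nrow(fuzzy)
```

all.equal() itself compares whole objects rather than element by element,
which is why the sketch uses an explicit absolute difference inside
subset().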

Sarah

On Fri, Feb 10, 2012 at 11:15 AM, vaneet <vaneet.lotay at mountsinai.org> wrote:
> Hello,
>
> I've encountered a very weird issue with subset(), or maybe there is
> something I don't know about it: when subsetting on the columns of a data
> frame, can you only use constants (0.1, 2.3, 2.2) instead of variables?
>
> Here's a look at my data frame called 'ea.cad.pwr':
> > ea.cad.pwr[1:5,]
>    MAF   OR  POWER
> 1 0.02 0.01 0.9999
> 2 0.02 0.02 0.9998
> 3 0.02 0.03 0.9997
> 4 0.02 0.04 0.9995
> 5 0.02 0.05 0.9993
>
> Here are my subset lines, which find no rows:
>
> power1 = subset(ea.cad.pwr, MAF == maf1 & OR == odds)
> power2 = subset(ea.cad.pwr, MAF == maf2 & OR == odds)
>
> Now when maf1 = 0.2 and odds = 1.2 it finds nothing.  I know for a fact that
> there's a row with these values:
> > ea.cad.pwr[1430:1432,]
>      MAF   OR  POWER
> 1430 0.2 0.58 0.9996
> 1431 0.2 1.20 0.3092
> 1432 0.2 1.22 0.3914
>
> My code runs in a loop, and in each previous iteration subset() worked
> fine. In this iteration, however, some different lines that set these
> variables are executed. Here they are:
>
> maf1 = maf.adj - 0.01
> maf2 = maf.adj + 0.01
>
> Basically, maf.adj is always a two-decimal number (in this case 0.21), and
> I'm computing the values 0.01 on either side of it (0.2, 0.22) in case
> maf.adj isn't in the table.  maf.adj is read from another data frame. When
> I use it directly to subset, it always works fine, but after this innocent
> subtraction it doesn't work for some reason.  If I rewrite the statements
> like this, it works:
>
> power1 = subset(ea.cad.pwr, MAF == 0.2 & OR == odds)
> power2 = subset(ea.cad.pwr, MAF == 0.22 & OR == odds)
>
> Even if I write this first:
>
> maf1 = 0.2
>
> Then:
>
> power1 = subset(ea.cad.pwr, MAF == maf1 & OR == odds)
>
> It works as well! That's what's really confusing: how can this subtraction
> mess everything up?  Please help if you can. Thank you!
>
> Vaneet
>


-- 
Sarah Goslee
http://www.functionaldiversity.org


