[R] help sub setting data frame

Sean MacEachern sean.maceach at gmail.com
Thu Oct 22 23:39:58 CEST 2009


Hi,

I'm running into a problem subsetting a data frame that I have never
encountered before:

> dim(chkPd)
[1] 3213    6

> df = head(chkPd)
> df
         PN     WB   Sire    Dam  MG SEX
601    1001 715349  61710  61702  67   F
969  1001_1 511092 616253 615037 168   F
986  1002_1 511082 616253 623905 168   F
667    1003 715617  61817  61441  67   F
1361 1003_1 510711 635246 627321 168   F
754    1004 715272  62356  61380  67   F


> dfb = chkPd[df$PN,]
> dfb
            PN     WB   Sire    Dam  MG  SEX
1001    2114_1 510944 616294 614865 168    M
NA        <NA>     NA   <NA>   <NA>  NA <NA>
NA.1      <NA>     NA   <NA>   <NA>  NA <NA>
1003    1130_1 510950 616294 619694 168    F
NA.2      <NA>     NA   <NA>   <NA>  NA <NA>
1004 2221-SHR2 510952 616294 619694 168    M


I'm not sure why I'm getting this behaviour. By subsetting the
original data frame by PN, I seem to be pulling out row numbers.
As a result, I only get results where PN is less than the number of
rows in the original data frame, and of course nothing where PN
has _ in the id. I have also tried using subset, but haven't had any
luck with that either.
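
For reference, here is a minimal sketch of how a character index in the row position behaves. The data frame `toy`, its values, and its row names are made up for illustration, and it assumes PN is stored as character (not factor), which may not match chkPd. A character row index is looked up in the row names, not in any column, so it can return an unrelated row or a row of NAs.

```r
# Toy stand-in for chkPd; the values and row names below are made up.
toy <- data.frame(
  PN = c("2114_1", "1001", "1001_1", "1003"),
  MG = c(168, 67, 168, 67),
  row.names = c("1001", "601", "969", "667"),
  stringsAsFactors = FALSE
)

ids <- c("1001", "1001_1")

# A character index in the row position is matched against the row names,
# not against the PN column: "1001" finds the row *named* 1001 (PN 2114_1),
# and "1001_1" matches no row name, so it yields a row of NAs.
by_rowname <- toy[ids, ]

# Matching on the PN column itself needs a logical index.
by_pn <- toy[toy$PN %in% ids, ]
```

In this toy case `by_rowname` holds one unrelated row plus one NA row, while `by_pn` holds the two rows whose PN is actually in `ids`.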


> dfb = subset(chkPd, PN==df$PN)
Warning message:
In PN == df$PN :
  longer object length is not a multiple of shorter object length

I wasn't aware that the length of the larger data frame had to be a
multiple of the length of the object you are subsetting by. In any
case, I would appreciate any insight into what I may be doing wrong.
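
For reference, the warning can be reproduced on plain vectors. This is a small sketch with made-up values (`big` and `small` are hypothetical names): `==` compares element by element and recycles the shorter vector, whereas `%in%` tests membership.

```r
big   <- c("1001", "1001_1", "1002_1", "1003", "1003_1")
small <- c("1003", "1001")

# == compares position by position and recycles the shorter vector;
# R warns here because 5 is not a multiple of 2 (warning suppressed).
elementwise <- suppressWarnings(big == small)

# %in% asks, for each element of big, whether it occurs anywhere in small.
membership <- big %in% small
```

Here `elementwise` is all FALSE even though both ids occur in `big`, because they never line up position by position. By analogy, something like `subset(chkPd, PN %in% df$PN)` may be closer to what was intended, though that has not been run against the actual data.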

Cheers,

Sean


> sessionInfo()
R version 2.9.1 (2009-06-26)
i386-apple-darwin8.11.1

locale:
en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods   base



