[R] Referencing columns and pulling selected data
Brian Diggs
diggsb at ohsu.edu
Wed Aug 5 22:16:11 CEST 2009
PDXRugger wrote:
> Please consider the following inputs:
> PrsnSerialno<-c(735,1147,2019,4131,4131,4217,4629,4822,4822,5979,5979,6128,6128,7004,7004,
> 7004,7004,7004,7438,7438,9402,9402,9402,10115,10115,11605,12693,12693,12693)
>
> PrsnAge<-c(59,48,42,24,24,89,60,43,47,57,56,76,76,66,70,14,7,3,62,62,30,10,7,20,21,50,53,44,29)
>
> IsHead<-c(TRUE,TRUE,TRUE,TRUE,FALSE,TRUE,TRUE,FALSE,TRUE,TRUE,FALSE,TRUE,FALSE,FALSE,TRUE,FALSE,FALSE,
> FALSE,TRUE,FALSE,TRUE,FALSE,FALSE,FALSE,TRUE,FALSE,TRUE,FALSE,FALSE)
>
> PrsnData<-cbind(PrsnSerialno,PrsnAge,IsHead)
This is easier to deal with using data.frames than matrices. cbind gives you a matrix, which holds a single type and so promotes your logical IsHead to numeric; a data.frame keeps each column's own type:
PrsnData<-data.frame(PrsnSerialno,PrsnAge,IsHead)
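To make the difference concrete, here is a minimal sketch using only the first few people from the post (the object names m and df are just for illustration). It shows that cbind returns a numeric matrix in which TRUE/FALSE have become 1/0, while data.frame leaves IsHead as a logical column.

```r
# First four people from the original vectors (shortened for illustration)
PrsnSerialno <- c(735, 1147, 4131, 4131)
PrsnAge      <- c(59, 48, 24, 24)
IsHead       <- c(TRUE, TRUE, TRUE, FALSE)

m  <- cbind(PrsnSerialno, PrsnAge, IsHead)       # a matrix: one type for everything
df <- data.frame(PrsnSerialno, PrsnAge, IsHead)  # a data.frame: one type per column

is.matrix(m)       # TRUE
typeof(m)          # "double", so the logicals were coerced to 1 and 0
m[, "IsHead"]      # 1 1 1 0
class(df$IsHead)   # "logical", the column keeps its type
```

With the data.frame, a subset such as df[df$IsHead, ] works directly on the logical column.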
> HhSerialno<-c(735,1147,2019,4131,4217,4629,4822,5979,6128,7004,7438,9402,10115,11605,12693)
> HhData<-cbind(HhSerialno)
Same for HhData:
HhData<-data.frame(HhSerialno)
> What I would like to do is to add an age column to HhData that would
> correspond to the serial number and which is also the oldest person in the
> house, or what corresponds to "TRUE" (designates oldest person). The TRUE/FALSE
> doesn't have to be considered but is preferable.
>
> The result would then be:
> HhSerialno HhAge
> 735 59
> 1147 48
> 2019 42
> 4131 24
> 4217 89
> 4629 60
> 4822 47
> 5979 57
> 6128 76
> 7004 70
> 7438 62
> 9402 30
> 10115 21
> 11605 50
> 12693 53
>
> I tried
> PumsHh..$Age<-PumsPrsn[PumsPrsn$SERIALNO==PumsHh..$Serialno,PumsPrsn$AGE]
> but because the data frames are of different lengths it doesn't work, so I'm
> unsure of another way of doing this. Thanks in advance.
merge will pull together two data.frames based on matching key columns, regardless of whether they have the same number of rows.
HhData <- merge(HhData,
                PrsnData[PrsnData$IsHead==TRUE,
                         c("PrsnSerialno","PrsnAge")],
                by.x = "HhSerialno",
                by.y = "PrsnSerialno")
That is, merge the data.frame HhData with a selected subset of PrsnData (those rows with IsHead == TRUE, and only the serial number and age columns). Since the variables to be matched have different names in the two data.frames, by.x and by.y must be specified.
names(HhData)[2] <- "HhAge"
This will change the variable name from PrsnAge (which it inherited from PrsnData) to HhAge.
HhData
   HhSerialno HhAge
1         735    59
2        1147    48
3        2019    42
4        4131    24
5        4217    89
6        4629    60
7        4822    47
8        5979    57
9        6128    76
10       7004    70
11       7438    62
12       9402    30
13      10115    21
14      12693    53
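Note that this result has 14 rows while the requested one has 15: household 11605 has no person with IsHead == TRUE, and merge by default keeps only keys present in both data.frames. The following is a sketch of two possible ways around that, not part of the reply above. The first keeps every household with all.x = TRUE; the second ignores IsHead and takes the oldest person per household with aggregate. The names heads, oldest, HhHead and HhOldest are made up for this example.

```r
# Data as given in the original post
PrsnSerialno <- c(735,1147,2019,4131,4131,4217,4629,4822,4822,5979,5979,6128,
                  6128,7004,7004,7004,7004,7004,7438,7438,9402,9402,9402,
                  10115,10115,11605,12693,12693,12693)
PrsnAge <- c(59,48,42,24,24,89,60,43,47,57,56,76,76,66,70,14,7,3,62,62,30,
             10,7,20,21,50,53,44,29)
IsHead <- c(TRUE,TRUE,TRUE,TRUE,FALSE,TRUE,TRUE,FALSE,TRUE,TRUE,FALSE,TRUE,
            FALSE,FALSE,TRUE,FALSE,FALSE,FALSE,TRUE,FALSE,TRUE,FALSE,FALSE,
            FALSE,TRUE,FALSE,TRUE,FALSE,FALSE)
PrsnData <- data.frame(PrsnSerialno, PrsnAge, IsHead)
HhData <- data.frame(HhSerialno = c(735,1147,2019,4131,4217,4629,4822,5979,
                                    6128,7004,7438,9402,10115,11605,12693))

# Variant 1: keep all households; those without a head get NA for the age
heads <- PrsnData[PrsnData$IsHead, c("PrsnSerialno", "PrsnAge")]
HhHead <- merge(HhData, heads,
                by.x = "HhSerialno", by.y = "PrsnSerialno",
                all.x = TRUE)
names(HhHead)[2] <- "HhAge"
HhHead[is.na(HhHead$HhAge), ]   # the row for household 11605

# Variant 2: ignore IsHead and use the maximum age in each household
oldest <- aggregate(PrsnAge ~ PrsnSerialno, data = PrsnData, FUN = max)
HhOldest <- merge(HhData, oldest,
                  by.x = "HhSerialno", by.y = "PrsnSerialno")
names(HhOldest)[2] <- "HhAge"
HhOldest
```

For this particular data the head is also the oldest person in every household that has a head, so the second variant reproduces the 15-row table from the question, including 11605 with age 50. Which variant is appropriate depends on whether a missing head should show up as NA or be filled in from the oldest member.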
> JR
>
--
Brian Diggs, Ph.D.
Senior Research Associate, Department of Surgery, Oregon Health & Science University