[R] A smart way to use "$" in data frame

Duncan Murdoch murdoch.duncan at gmail.com
Fri Jan 18 21:04:37 CET 2013


On 18/01/2013 2:40 PM, Yuan, Rebecca wrote:
> Hello all,
>
> I have a data frame dataa:
>
>     newdate newstate newid newbalance newaccounts
> 1 31DEC2001       AR     1       1170          61
> 2 31DEC2001       VA     2       4565          54
> 3 31DEC2001       WA     3       2726          35
> 4 31DEC2001       AR     3       2700          35
>
> The following gives me the balance of state AR:
>
> dataa$newbalance[dataa$newstate == 'AR']
> 1170
> 2700
>
> Now I have another data frame, datab. It is very similar to dataa, except that the column names and the column order are different:
>
>   oldstate   olddate oldbalance oldid oldaccounts
> 1       AR 31DEC2012       1234     7          40
> 2       WA 31DEC2012       2222     3          30
> 3       VA 31DEC2012       2345     5          23
> 4       AR 31DEC2012       5673     5          23
>
> datab$oldbalance[datab$oldstate== 'AR' ]
> 1234
> 5673
>
> Is there a way to write
>
> data$balance[data$state == 'AR']
>
> in general, where balance = newbalance and state = newstate when data = dataa, and balance = oldbalance and state = oldstate when data = datab?

Yes, you could use the name of the dataframe to get the column names, e.g.

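# Map each data frame's name to its own column names
# (dataa has the new* columns, datab has the old* columns).
state <- c(dataa="newstate", datab="oldstate")
balance <- c(dataa="newbalance", datab="oldbalance")
dfname <- "dataa"
df <- get(dfname)   # fetch the data frame by its name
# Rows whose state column is 'AR', balance column only
df[ df[,state[dfname]] == 'AR', balance[dfname]]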

but that is really, really unreadable code.  You would be much better 
off to name the columns consistently in the first place.
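
As a rough sketch of what consistent naming could look like, assuming the
only difference between the two data frames is the "new"/"old" prefix on
each column name: the data below are retyped from the question, and
strip_prefix is a hypothetical helper written for this example, not an
existing R function.

```r
# Rebuild the two data frames from the question (values copied from the post).
dataa <- data.frame(
  newdate     = rep("31DEC2001", 4),
  newstate    = c("AR", "VA", "WA", "AR"),
  newid       = c(1, 2, 3, 3),
  newbalance  = c(1170, 4565, 2726, 2700),
  newaccounts = c(61, 54, 35, 35),
  stringsAsFactors = FALSE
)
datab <- data.frame(
  oldstate    = c("AR", "WA", "VA", "AR"),
  olddate     = rep("31DEC2012", 4),
  oldbalance  = c(1234, 2222, 2345, 5673),
  oldid       = c(7, 3, 5, 5),
  oldaccounts = c(40, 30, 23, 23),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop a leading "new" or "old" from every column name.
strip_prefix <- function(d) {
  names(d) <- sub("^(new|old)", "", names(d))
  d
}

dataa <- strip_prefix(dataa)
datab <- strip_prefix(datab)

# The same expression now works for either data frame.
dataa$balance[dataa$state == "AR"]   # 1170 2700
datab$balance[datab$state == "AR"]   # 1234 5673
```

Because columns are selected by name, the differing column order in the
two data frames does not matter once the names agree.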

Duncan Murdoch


