[R] Label rows of table by factor level for groups of factors

Sarah Goslee sarah.goslee at gmail.com
Tue Mar 6 19:27:23 CET 2012


Well, if you can get this to run, your version of R is markedly
different than mine.

> #Start of code
>
> x1=c(rep(0:1,6))
> x2=c(rep(c(1,1,0,0)6))
Error: unexpected numeric constant in "x2=c(rep(c(1,1,0,0)6"
> x3=c(rep(1,6),rep(0,6))
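
For reference, the unique() / cbind() / merge() approach suggested
earlier in this thread might look something like the sketch below.
It assumes the missing comma is restored and, so that all three columns
have 12 rows, uses rep(c(1,1,0,0), 3) for x2. That value, and the names
combos and labelled, are illustrative and not taken from the original post.

```r
# Sample data. times = 3 for x2 is an assumption made here so that
# every column has 12 rows; the original post used 6.
x1 <- rep(0:1, 6)
x2 <- rep(c(1, 1, 0, 0), 3)
x3 <- c(rep(1, 6), rep(0, 6))
df <- data.frame(x1, x2, x3)

# One row per distinct combination that actually occurs in df.
combos <- unique(df)

# Attach an identifying label to each combination
# (numbered in order of first appearance).
combos <- cbind(combos, res = seq_len(nrow(combos)))

# Join the labels back onto the original rows.
labelled <- merge(df, combos, by = c("x1", "x2", "x3"))
labelled
```

Note that merge() sorts the result by the key columns by default, so the
row order differs from df, and the labels follow order of first
appearance, not the numbering used in 'desired'.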



On Tue, Mar 6, 2012 at 1:23 PM, O'Hanlon, Simon J
<simon.ohanlon at imperial.ac.uk> wrote:
> Hi Sarah,
> Thanks a lot for your suggestion. I'll give it a go if I can (I just spent the last 3 hours using unique record filtering and vlookups in Excel to achieve what I'm sure can be accomplished in 3 or 4 lines of R code!).
>
> I think you might want to run the sample code again though. I just tried it (and there was no missing comma) and I get:
>
>   x1 x2 x3
> 1   0  1  1
> 2   1  1  1
> 3   0  1  1
> 4   1  1  1
> 5   0  1  1
> 6   1  1  1
> 7   0  1  0
> 8   1  1  0
> 9   0  1  0
> 10  1  1  0
> 11  0  1  0
> 12  1  1  0
>> tabledf
>   x1 x2 x3 Freq
> 1  0  1  0    3
> 2  1  1  0    3
> 3  0  1  1    3
> 4  1  1  1    3
>> desired
>   x1 x2 x3 res
> 1   0  1  1   3
> 2   1  1  1   4
> 3   0  1  1   3
> 4   1  1  1   4
> 5   0  1  1   3
> 6   1  1  1   4
> 7   0  1  0   1
> 8   1  1  0   2
> 9   0  1  0   1
> 10  1  1  0   2
> 11  0  1  0   1
> 12  1  1  0   2
>> nrow(tabledf)
> [1] 4
>> dim(tabledf)
> [1] 4 4
>
> #Start of code
>
> x1=c(rep(0:1,6))
> x2=c(rep(c(1,1,0,0)6))
> x3=c(rep(1,6),rep(0,6))
> df=data.frame(x1,x2,x3)
> tabledf=as.data.frame(with(df, table(x1,x2,x3)))
> res=c(3,4,3,4,3,4,1,2,1,2,1,2)
> desired=data.frame(x1,x2,x3,res)
> df
> tabledf
> desired
>
> #End of code
>
> Cheers!
>
> Simon
>
> --------------------------------
> Simon O'Hanlon, BSc MSc
> Department of Infectious Disease Epidemiology
> Imperial College London
> St. Mary's Hospital
> London
> W2 1PG
> ________________________________________
> From: Sarah Goslee [sarah.goslee at gmail.com]
> Sent: 06 March 2012 18:16
> To: O'Hanlon, Simon J
> Cc: r-help at R-project.org
> Subject: Re: [R] Label rows of table by factor level for groups of factors
>
> One possible approach is to use unique() to get the list of distinct
> combinations, cbind() an identifying variable to that list, then use
> merge() to join it to your existing data frame.
>
> But I'm not seeing how you are getting four unique combinations.
> Given your sample data (with the missing comma replaced):
>> dim(tabledf)
> [1] 8 4
>> head(desired)
>   x1 x2 x3 res
> 1  0  1  1   3
> 2  1  1  1   4
> 3  0  0  1   3
> 4  1  0  1   4
> 5  0  1  1   3
> 6  1  1  1   4
>
> tabledf has 8 rows, not 4, and I don't see how rows 1 and 3
> or rows 2 and 4 of your desired df should get the same
> classification.
>
> Regardless, if you can make a data frame like tabledf with
> an additional column for your desired res variable, you can
> merge() it with your original data frame.
>
> Sarah
>
> On Tue, Mar 6, 2012 at 11:06 AM, O'Hanlon, Simon J
> <simon.ohanlon at imperial.ac.uk> wrote:
>> Dear useRs,
>> I am sure this is a fairly simple problem, but I just cannot get my head around it.
>>
>>
>> I have a dataframe which contains several factor variables. I can use table() to tell me how many different combinations there are of these variables. What I should like to do is to add a column to my original dataframe which labels each row according to the unique combination of factors.
>>
>>
>> E.g. in the simple example below I create a dataframe 'df' with 3 columns, the values of which take 0 or 1. I can then classify each row in the table and I find that I have 4 unique combinations of factors. I would now like to add a fourth column to df which labels each row according to whether it was unique combination 1,2,3 or 4:
>>
>> x1=c(rep(0:1,6))
>> x2=c(rep(c(1,1,0,0)6))
>> x3=c(rep(1,6),rep(0,6))
>> df=data.frame(x1,x2,x3)
>> tabledf=as.data.frame(with(df, table(x1,x2,x3)))
>> res=c(3,4,3,4,3,4,1,2,1,2,1,2)
>> desired=data.frame(x1,x2,x3,res)
>> df
>> tabledf
>> desired
>>
>>
>> I realise that this is probably quite simple to do, I am just struggling to get my head around it! Help much appreciated in advance.
>>


