[R] Re: how to subset unique factor combinations from a data frame.

Petr PIKAL petr.pikal at precheza.cz
Tue Jan 4 09:06:22 CET 2011


r-help-bounces at r-project.org wrote on 04.01.2011 05:21:25:

> Hi All
> I have these questions and request members expert view on this. 
> a) I have a dataframe (df) with five factors (identity variables) and 
> (measured value). The id variables are Year, Country, Commodity, Attribute, 
> Unit. Value is a value for each combination of these.
> I would like to get just the unique combination of Commodity, Attribute 
> Unit. I just need the unique factor combination into a dataframe or a 
> I know aggregate and subset but don't know how to use them in this context. 

aggregate(df$Value, list(df$Commodity, df$Attribute, df$Unit), FUN)

where FUN is whatever summary function you want, e.g. length.
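
A minimal sketch of how this could look, assuming a small made-up data frame with the column names from the question (the values are invented for illustration only). It shows aggregate() with length as the summary function and, as a more direct route, unique() applied to the three columns.

```r
# Hypothetical example data; column names follow the question, values are invented.
df <- data.frame(
  Year      = c(2009, 2010, 2009, 2010, 2010),
  Country   = c("IN", "IN", "US", "US", "BR"),
  Commodity = c("Wheat", "Wheat", "Wheat", "Rice", "Rice"),
  Attribute = c("Production", "Production", "Production", "Exports", "Exports"),
  Unit      = c("MT", "MT", "MT", "MT", "MT"),
  Value     = c(10, 12, 55, 3, 4)
)

# One row per Commodity/Attribute/Unit combination; the Value column holds
# the number of rows that fall into each combination.
counts <- aggregate(Value ~ Commodity + Attribute + Unit, data = df, FUN = length)

# Only the distinct combinations, without touching Value at all.
combos <- unique(df[, c("Commodity", "Attribute", "Unit")])

print(counts)
print(combos)
```

If only the combinations are needed, unique() avoids the detour through a summary function.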

> b) Is it possible to include non-aggregate columns with aggregate 
> say in the above case > aggregate(Value ~ Commodity + Attribute, data = 
> FUN = count). The use of count(Value) is just a round about to return 
> combinations of Commodity & Attribute, and I would like to include 
> column in the returned data frame?

Hm. Maybe xtabs? But without any example it is only a guess.
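
To make the xtabs guess concrete, here is a small sketch on invented data (the data frame and its values are assumptions, not the poster's data). It cross-tabulates Commodity by Attribute and then converts the table back to a data frame, keeping only the combinations that actually occur.

```r
# Hypothetical example data; values are invented for illustration.
df <- data.frame(
  Commodity = c("Wheat", "Wheat", "Rice", "Rice", "Rice"),
  Attribute = c("Production", "Production", "Exports", "Exports", "Production"),
  Value     = c(10, 12, 3, 4, 7)
)

# Number of rows for every Commodity x Attribute pair
# (zero where a pair does not occur in the data).
tab <- xtabs(~ Commodity + Attribute, data = df)

# Long format: one row per pair with its count in column Freq.
tab_df <- as.data.frame(tab)

# Keep only the pairs that are present in the data.
present <- tab_df[tab_df$Freq > 0, ]

print(tab)
print(present)
```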

> c) Is it possible to subset based on unique combination, something like
> this.
> > subset(df, unique(Commodity), select = c(Commodity, Attribute, Unit)). 
> know this is not correct as it returns an error 'subset needs a logical
> evaluation'. Trying various ways to accomplish the task. 

The sqldf package probably has tools for this, but I do not use it, so you 
will have to try it yourself.

df[df$Commodity == something, c("Commodity", "Attribute", "Unit")]

is another way.

Anyway, your explanation is ambiguous. Let's say you have three rows with the 
same Commodity. Which row do you want to select?
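
One way to make that choice explicit is duplicated(). The sketch below uses invented data (an assumption, not the poster's data) and keeps either the first or the last row of each Commodity/Attribute/Unit combination; which one is appropriate depends on what you actually want.

```r
# Hypothetical example data; three rows share the same combination.
df <- data.frame(
  Year      = c(2009, 2010, 2011, 2009),
  Commodity = c("Wheat", "Wheat", "Wheat", "Rice"),
  Attribute = c("Production", "Production", "Production", "Exports"),
  Unit      = c("MT", "MT", "MT", "MT"),
  Value     = c(10, 12, 15, 3)
)

keys <- c("Commodity", "Attribute", "Unit")

# Keep the first row seen for each combination of the key columns.
first_rows <- df[!duplicated(df[, keys]), ]

# Keep the last row for each combination instead.
last_rows <- df[!duplicated(df[, keys], fromLast = TRUE), ]

print(first_rows)
print(last_rows)
```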


> will be grateful for any ideas and help 
> Regards,