[R] separation depending on equal contents in more than one field
Florian Jansen
jansen at uni-greifswald.de
Mon Oct 2 17:30:32 CEST 2006
Hi,
I have a dataframe:
(obs <- data.frame(a=c(1,2,2,3,3,3), b=c(1,2,3,4,4,5), c=1:2))
attach(obs)
In reality it has about 1 million rows.
Some of the rows have the same contents in both col a and col b, like rows 4 and 5.
I want to do some calculations on col c within the duplicated rows and
merge them afterwards:
layer <- function(x) round((1-prod(1-x/100))*100,0)
(covnew <- aggregate(c, list(a=a, b=b), layer))
This works fine on the small example, but not with 1 million rows because of
memory limitations.
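To make the calculation concrete, here is a small worked example of layer() on its own. The formula reads as combining percentage values as if they overlapped independently; that interpretation is an assumption, only the arithmetic is taken from the function above.

```r
# Same function as in the post
layer <- function(x) round((1 - prod(1 - x/100)) * 100, 0)

# Rows 4 and 5 of obs share a = 3, b = 4 and carry c = 2 and c = 1:
# (1 - 0.98 * 0.99) * 100 = 2.98, which rounds to 3
layer(c(2, 1))

# A single whole-number value comes back unchanged
layer(5)

# Two values of 50 combine to 75, not 100
layer(c(50, 50))
```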
So I thought to split the dataframe into the majority of unique rows on
one hand and all duplicated rows on the other:
With
subset(obs, a %in% a[duplicated(a)])
and its negation, !(a %in% a[duplicated(a)]), this works fine for a single-column comparison.
This must also be possible for a two-column comparison, but I can't get it to work.
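One possible sketch for the two-column case, run only on the toy data and not at the 1-million-row scale: duplicated() also accepts a data frame and then compares whole rows, and its fromLast argument flags the first member of each duplicate group as well. The names key, dup, uniq, multi and agg are made up for illustration.

```r
obs <- data.frame(a = c(1, 2, 2, 3, 3, 3), b = c(1, 2, 3, 4, 4, 5), c = 1:2)
layer <- function(x) round((1 - prod(1 - x/100)) * 100, 0)

# Compare rows on columns a and b together
key <- obs[c("a", "b")]

# TRUE for every row whose (a, b) pair occurs more than once,
# including the first occurrence (hence the fromLast pass)
dup <- duplicated(key) | duplicated(key, fromLast = TRUE)

uniq  <- obs[!dup, ]   # rows with a unique (a, b) pair, kept as they are
multi <- obs[dup, ]    # rows that need aggregating

# Aggregate only the duplicated part, then put both parts back together
agg <- aggregate(multi["c"], multi[c("a", "b")], layer)
covnew <- rbind(uniq, agg)
rownames(covnew) <- NULL
covnew
```

This avoids attach() by indexing obs directly. If duplicated() on a data frame turns out to be too slow or memory-hungry for the full data, building a single key such as paste(obs$a, obs$b) and calling duplicated() on that vector might be cheaper, though that is untested here.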
Thanks
Florian
--
Dr. Florian Jansen
Geobotany & Nature Conservation
Institute for Botany and Landscape Ecology
Ernst-Moritz-Arndt-University
Grimmer Str. 88
17487 Greifswald - Germany
+49 (0)3834 86 4147