[Rd] subscripts on data frames (PR#9885)
Tony Plate
tplate at acm.org
Tue Aug 28 17:44:26 CEST 2007
The line
worms[rev(order(Worm.density)),] [!duplicated(Vegetation),]
looks suspect to me: you first create a sorted version of the data
frame 'worms', and then subset it using a logical vector computed from
'Vegetation' in the original, unsorted order (the attached copy), so the
TRUE/FALSE values no longer line up with the rows they are applied to.
When reordering data frames I would avoid attaching them, and I would
break the expression into two separate expressions, to be sure the
subsetting refers to the appropriate values:
> worms <- read.table(
+   "http://www.bio.ic.ac.uk/research/mjcraw/therbook/data/worms.txt",
+   header = TRUE)
> worms2 <- worms[rev(order(worms$Worm.density)), ]
> worms2[!duplicated(worms2$Vegetation), ]
       Field.Name Area Slope Vegetation Soil.pH  Damp Worm.density
9     The.Orchard  1.9     0    Orchard     5.7 FALSE            9
16   Water.Meadow  3.9     0     Meadow     4.9  TRUE            8
11    Garden.Wood  2.9    10      Scrub     5.2 FALSE            8
10  Rookery.Slope  1.5     4  Grassland     5.0  TRUE            7
2  Silwood.Bottom  5.1     2     Arable     5.2 FALSE            7
>
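To see the misalignment in isolation, here is a minimal sketch on a made-up
data frame (the names 'toy', 'site', 'veg' and 'density' are hypothetical
and are not part of the worms data). It only illustrates the point above:
a logical index built from the unsorted column is applied to the sorted
rows, so it picks the wrong ones.

```r
# Hypothetical toy data, NOT the worms data set
toy <- data.frame(
  site    = c("a", "b", "c", "d", "e"),
  veg     = c("Meadow", "Meadow", "Scrub", "Orchard", "Scrub"),
  density = c(2, 8, 5, 9, 7),
  stringsAsFactors = FALSE
)

# Sort rows from highest to lowest density
sorted <- toy[rev(order(toy$density)), ]

# Misaligned: the logical index comes from 'toy' in its original row order,
# but it is applied to the rows of 'sorted'
wrong <- sorted[!duplicated(toy$veg), ]

# Aligned: the logical index comes from the sorted data frame itself
right <- sorted[!duplicated(sorted$veg), ]

print(wrong)   # Scrub appears twice and Meadow is missing
print(right)   # one row per vegetation type, highest density first
```

With this toy data the misaligned version shows the same kind of symptom
as in the report: one vegetation type is repeated and another is dropped.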
Here's a one-liner using 'subset':
> subset(worms[rev(order(worms$Worm.density)), ], !duplicated(Vegetation))
       Field.Name Area Slope Vegetation Soil.pH  Damp Worm.density
9     The.Orchard  1.9     0    Orchard     5.7 FALSE            9
16   Water.Meadow  3.9     0     Meadow     4.9  TRUE            8
11    Garden.Wood  2.9    10      Scrub     5.2 FALSE            8
10  Rookery.Slope  1.5     4  Grassland     5.0  TRUE            7
2  Silwood.Bottom  5.1     2     Arable     5.2 FALSE            7
>
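One way to check that a result like this really holds the maximum density
for each vegetation type is to compare it with the per-group maxima from
'tapply'. The sketch below does that on a made-up data frame (the names
'toy', 'site', 'veg' and 'density' are hypothetical, chosen only for
illustration); it is a sanity check under those assumptions, not a
replacement for the code above.

```r
# Hypothetical toy data, NOT the worms data set
toy <- data.frame(
  site    = c("a", "b", "c", "d", "e"),
  veg     = c("Meadow", "Meadow", "Scrub", "Orchard", "Scrub"),
  density = c(2, 8, 5, 9, 7),
  stringsAsFactors = FALSE
)

# Same pattern as the one-liner: sort first, then let subset() evaluate
# !duplicated(veg) inside the sorted data frame
top <- subset(toy[rev(order(toy$density)), ], !duplicated(veg))

# Independent computation of the maximum density per vegetation type
group_max <- tapply(toy$density, toy$veg, max)

# Each retained row should carry the maximum for its vegetation type
ok <- all(top$density == group_max[top$veg])
print(ok)
```

Because 'subset' evaluates its condition inside the data frame it is given,
'veg' here refers to the sorted rows, which is what keeps the index aligned.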
-- Tony Plate
m.crawley at imperial.ac.uk wrote:
> I'm not sure if this is a bug, or if I'm doing something wrong.
>
> From the worms dataframe, which is in a file called worms.txt at
>
> http://www.imperial.ac.uk/bio/research/crawley/therbook
> <http://www.imperial.ac.uk/bio/research/mjcraw/therbook/index.htm>
>
> the idea is to extract a subset of the rows, sorted in declining order
> of worm density, with only the maximum worm density from each vegetation
> type:
>
> worms<-read.table("c:\\temp\\worms.txt",header=T)
> attach(worms)
> names(worms)
>
> [1] "Field.Name"   "Area"         "Slope"        "Vegetation"   "Soil.pH"
> [6] "Damp"         "Worm.density"
>
>
> Using "not duplicated" I get two rows for Meadow and none for Scrub
>
> worms[rev(order(Worm.density)),] [!duplicated(Vegetation),]
>
>        Field.Name Area Slope Vegetation Soil.pH  Damp Worm.density
> 9     The.Orchard  1.9     0    Orchard     5.7 FALSE            9
> 16   Water.Meadow  3.9     0     Meadow     4.9  TRUE            8
> 10  Rookery.Slope  1.5     4  Grassland     5.0  TRUE            7
> 2  Silwood.Bottom  5.1     2     Arable     5.2 FALSE            7
> 4     Rush.Meadow  2.4     5     Meadow     4.9  TRUE            5
>
> and here is the correct set of rows, but in the wrong order, using
> unique
>
> worms[rev(order(Worm.density)),] [unique(Vegetation),]
>
>        Field.Name Area Slope Vegetation Soil.pH  Damp Worm.density
> 16   Water.Meadow  3.9     0     Meadow     4.9  TRUE            8
> 9     The.Orchard  1.9     0    Orchard     5.7 FALSE            9
> 11    Garden.Wood  2.9    10      Scrub     5.2 FALSE            8
> 2  Silwood.Bottom  5.1     2     Arable     5.2 FALSE            7
> 10  Rookery.Slope  1.5     4  Grassland     5.0  TRUE            7
>
>
> Best wishes,
>
> Mick
>
> Prof M.J. Crawley FRS
>
> Imperial College London
> Silwood Park
> Ascot
> Berks
> SL5 7PY
> UK
>
> Phone (0) 207 5942 216
> Fax (0) 207 5942 339
>