[R] variables - data-structure
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Sat Dec 18 15:25:06 CET 2004
Helmut Kudrnovsky <hellik at web.de> writes:
> dear R-friends,
>
> i`ve got a large dataset of vegetation-samples with about 500
> variables(=species) in the following format:
>
> 1 spec1
> 1 spec23
> 1 spec54
> 1 spec63
> 2 spec1
> 2 spec2
> 2 spec253
> 2 spec300
> 2 spec423
> 3 spec20
> 3 spec88
> 3 spec121
> 3 spec200
> 3 spec450
> .
> .
>
> this means: sample 1 (grassland) with the species (=spec) 1, 23, 54, 63
>
> is it possible to get a following data-structure for further analysis?
>
> 1 2 3 ......
> spec1 1 1 0
> spec2 0 1 0
> spec3
> ...
> spec253 0 1 0
> ...
> spec450 0 0 1
>
> with thanks from the snowy tirol
> helli
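Should be fairly easy. You could, for instance, generate a
table(species,area), with a few complications if the same combination
can occur more than once (the cell then holds a count rather than 1).
Or use matrix indexing, where nspec and narea are the numbers of
distinct species and areas: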
M <- matrix(0,nspec,narea)
M[cbind(species,area)] <- 1
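As a minimal sketch of the two approaches above, assuming a toy data frame in the poster's long format (the duplicated last row and the names sp, ar, tab and pa are made up here for illustration):

```r
# Toy data in the poster's long format. The duplicated last row is an
# assumption, added to show how the two approaches differ.
dd <- data.frame(area    = c(1, 1, 2, 2, 2),
                 species = c("spec1", "spec23", "spec1", "spec2", "spec2"),
                 stringsAsFactors = FALSE)

sp <- factor(dd$species)   # levels are sorted as text
ar <- factor(dd$area)

# Matrix indexing: each row of the two-column index picks one (row, col) cell,
# so a repeated pair just sets the same cell to 1 again.
M <- matrix(0, nlevels(sp), nlevels(ar),
            dimnames = list(species = levels(sp), area = levels(ar)))
M[cbind(as.integer(sp), as.integer(ar))] <- 1

# table() counts occurrences, so the repeated pair gives a cell value of 2.
tab <- table(species = sp, area = ar)

# Collapse the counts to presence/absence if that is what is wanted.
pa <- (tab > 0) + 0
```

With matrix indexing the result is 0/1 regardless of duplicates, which is probably what is wanted for presence/absence data; table() is shorter to write but needs the extra step when pairs can repeat.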
Upon reading, the sort order of the species may be a little
problematic:
dd <- read.table(stdin())
0: 1 spec1
1: 1 spec23
2: 1 spec54
3: 1 spec63
4: 2 spec1
6: 2 spec2
7: 2 spec253
8: 2 spec300
9: 2 spec423
10: 3 spec20
11: 3 spec88
12: 3 spec121
13: 3 spec200
14: 3 spec450
15:
# ctrl-D terminates input
names(dd) <- c("area","species")
with(dd, table(species,area))
         area
species   1 2 3
  spec1   1 1 0
  spec121 0 0 1
  spec2   0 1 0
  spec20  0 0 1
  spec200 0 0 1
  spec23  1 0 0
  spec253 0 1 0
  spec300 0 1 0
  spec423 0 1 0
  spec450 0 0 1
  spec54  1 0 0
  spec63  1 0 0
  spec88  0 0 1
To fix up, use something like
specn <- paste("spec",
               sort(as.numeric(substring(levels(dd$species),5))),
               sep="")
dd <- transform(dd, species=factor(species,levels=specn))
with(dd, table(species,area))
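The fix above relies on dd$species being a factor, which depends on how read.table was told to treat strings. A hedged variant that does not depend on that, assuming every label has the form "spec" followed by a number (the toy data and the names sp and num are made up here):

```r
# Toy data; species is deliberately stored as character, not factor.
dd <- data.frame(area    = c(1, 1, 2, 2, 3),
                 species = c("spec1", "spec23", "spec1", "spec2", "spec121"),
                 stringsAsFactors = FALSE)

# Distinct labels, whatever the column type is.
sp <- unique(as.character(dd$species))

# Numeric part after the 4-character prefix "spec".
num <- as.numeric(substring(sp, 5))

# Levels in numeric rather than alphabetical order.
specn <- sp[order(num)]

dd$species <- factor(dd$species, levels = specn)
tab <- with(dd, table(species, area))
```

If rows for species that never occur are wanted too, as in the layout asked for in the question, the levels could instead be given in full, e.g. levels = paste0("spec", seq_len(n)) with n the total number of species (n is a placeholder here); table() then prints all-zero rows for the unused levels.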
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907