[R-sig-eco] problem using reshape package
Maria Dulce Subida
mdsubida at icman.csic.es
Thu Jul 9 17:39:25 CEST 2009
Thank you everyone.
The problem with using cast() on my original data frame came from duplicate
species within each replicate. That explains the "weird" aggregation it was
doing.
Problem solved.
Thanks again and apologies to those who also received my post from
manipulatr.
Best regards,
Dulce
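
[Editor's sketch, not part of the original message.] The duplicate-species
cause described above can be checked before casting. As far as I know, when a
SITE/SPECIES combination occurs more than once, cast() cannot place a single
value in the cell and falls back to counting rows (fun.aggregate = length),
which would produce the "weird" aggregation. The data frame below is a
hypothetical toy example, and summing the duplicates is only one possible fix,
assumed here for illustration:

```r
## Hypothetical toy data: species "A" is entered twice for site I
test <- data.frame(
  SITE       = c("I", "I", "I", "II"),
  SPECIES    = c("A", "A", "B", "A"),
  Replicate1 = c(1, 2, 5, 4),
  Replicate2 = c(3, 0, 1, 2),
  stringsAsFactors = FALSE
)

## Flag every row whose SITE/SPECIES key occurs more than once
key <- test[, c("SITE", "SPECIES")]
dup <- duplicated(key) | duplicated(key, fromLast = TRUE)
test[dup, ]

## One possible fix (an assumption about what is wanted):
## sum the counts of duplicated SITE/SPECIES rows
clean <- aggregate(cbind(Replicate1, Replicate2) ~ SITE + SPECIES,
                   data = test, FUN = sum)
clean
```

If the duplicates are data-entry errors rather than genuine repeated records,
correcting them at the source is probably safer than summing.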
-----Original Message-----
From: psolymos at gmail.com [mailto:psolymos at gmail.com] On behalf of Peter
Solymos
Sent: Thursday, July 9, 2009 17:25
To: Maria Dulce Subida
CC: r-sig-ecology at r-project.org
Subject: Re: [R-sig-eco] problem using reshape package
Hi,
something like this might help:
## your toy data set
d <- data.frame(SITE=rep(c("I","II"), each=3),
    SPECIES=c("A","B","C","A","C","D"),
    Replicate1=c(1,2,1,4,1,6),
    Replicate2=c(3,5,0,2,0,3),
    Replicate3=c(0,1,2,0,0,3))
## required for functions inflate, stcs and mefa
library(mefa)
## this repeats SITE and SPECIES tags,
## and puts counts and replicate in a data frame
x <- data.frame(inflate(d[,1:2], rep(3, 6)),
    Counts=array(t(d[,3:5])),
    Replicate=rep(1:3, 6))
## something you wanted, except for NA's with xtabs (stats)
y1 <- xtabs(Counts ~ interaction(x$SITE, x$Replicate) + SPECIES, x)
## replicates cross tabulated separately
y2 <- xtabs(Counts ~ SITE + SPECIES + Replicate, x)
## same with mefa (will give you warnings
## due to some 'empty sample' misspecifications)
m <- mefa(stcs(x))
m$segm
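
[Editor's sketch, not part of the original message.] The same reshaping can be
done without the mefa package. The version below is a minimal base-R sketch
using the toy data as given in the original question; the object names (reps,
long, sr) are made up for illustration. It uses tapply() instead of xtabs()
because tapply() leaves NA in cells that have no data, so a species absent
from a site stays distinguishable from a true zero count:

```r
## Toy data as given in the original question (site II has A, C and D)
d <- data.frame(
  SITE       = rep(c("I", "II"), each = 3),
  SPECIES    = c("A", "B", "C", "A", "C", "D"),
  Replicate1 = c(1, 2, 1, 4, 1, 6),
  Replicate2 = c(3, 5, 0, 2, 0, 3),
  Replicate3 = c(0, 1, 2, 0, 0, 3)
)

## Wide to long: one row per SITE x SPECIES x replicate.
## unlist() stacks the replicate columns one after another,
## so SITE and SPECIES are repeated once per replicate column.
reps <- c("Replicate1", "Replicate2", "Replicate3")
long <- data.frame(
  SITE      = rep(d$SITE, times = length(reps)),
  SPECIES   = rep(d$SPECIES, times = length(reps)),
  Replicate = rep(seq_along(reps), each = nrow(d)),
  Counts    = unlist(d[reps], use.names = FALSE)
)

## Rows labelled SITE.Replicate, ordered by site and then by replicate
sr <- interaction(long$SITE, long$Replicate, lex.order = TRUE)

## Cells with no SITE.Replicate x SPECIES record are left as NA
m <- tapply(long$Counts, list(sr, long$SPECIES), sum)
m
```

Note that sum is used only to pull out the single value per cell; if the data
contained duplicated SITE/SPECIES rows, it would silently add them together.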
Yours,
Peter
Peter Solymos, PhD
Postdoctoral Fellow
Department of Mathematical and Statistical Sciences
University of Alberta
Edmonton, Alberta, T6G 2G1
Canada
email <- paste("solymos", "ualberta.ca", sep = "@")
On Thu, Jul 9, 2009 at 5:37 AM, Maria Dulce Subida
<mdsubida at icman.csic.es> wrote:
> Hello everyone!
>
> I'm having a problem in casting a data frame with the reshape package. I
> have an original data set of species abundances in replicate samples at
> certain sites, with the following form:
>
> SITE  SPECIES  Replicate1  Replicate2  Replicate3
> I     A        1           3           0
> I     B        2           5           1
> I     C        1           0           2
> II    A        4           2           0
> II    C        1           0           0
> II    D        6           3           3
>
> Please notice that site II does not have species B and has a new species D;
> the remaining two are shared with site I.
>
> I need to get these data in the form of a matrix like:
>
> SITE.REPLICATE  A  B   C  D
> I.1             1  2   1  NA
> I.2             3  5   0  NA
> I.3             0  1   2  NA
> II.1            4  NA  1  6
> II.2            2  NA  0  3
> II.3            0  NA  0  3
>
> Using the above "toy data" in R, everything works fine using melt and cast
> as follows (let's call my initial matrix test):
>
> > testm <- melt(test, id.vars=c("SITE","SPECIES"))
> > testc <- cast(testm, ... ~ SPECIES)
> > testc
>   SITE   variable A  B C  D
> 1    I Replicate1 1  2 1 NA
> 2    I Replicate2 3  5 0 NA
> 3    I Replicate3 0  1 2 NA
> 4   II Replicate1 4 NA 1  6
> 5   II Replicate2 2 NA 0  3
> 6   II Replicate3 0 NA 0  3
>
> However, when I use the same code on my real data set, which is considerably
> larger (73 sites and 7 replicates per site, resulting in a molten matrix of
> 14469 x 4), the cast function does some kind of aggregation (in fact, it
> reports an aggregation using the default fun.aggregate) that I was not
> able to understand. I also tried splitting my original data frame in order to
> get molten matrices smaller than 5500 x 4, but I got the same problem.
>
> Could anyone help me with this?
>
>
>
> (I use R 2.8.1 for Windows)
>
>
>
> Thank you very very much in advance!
>
>
>
>
>
> Cheers,
>
>
>
> Dulce
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>