[R] transpose dataset to PC-ORD?
Dave Roberts
droberts at montana.edu
Tue May 23 23:46:13 CEST 2006
Daniel,
I can help somewhat I think. PC-ORD also allows data input in what
it calls "database" format, where each row is
sample, taxon, abundance
There are as many rows per sample as there are non-zero species, and only
three columns. To get your taxon data.frame (currently samples as rows,
species as columns, called data in your example) into that format, try
dematrify(data,filename='whatever.csv')
with the function pasted below (watch out for email-altered line
breaks). That will create a CSV file you can import into PC-ORD.
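To make the layout concrete, here is a minimal base-R sketch on a made-up three-site table. The object names veg, idx, and long are purely illustrative, and the values mirror the small example in the quoted message. It builds the same sample, taxon, abundance triples in memory without writing a file:

```r
# Toy sample-by-species table (illustrative values only)
veg <- data.frame(sp1 = c(1, 0, 0), sp2 = c(0, 1, 3), sp3 = c(0, 2, 0),
                  row.names = c("site1", "site2", "site3"))

# Row/column positions of every non-zero cell
idx <- which(veg > 0, arr.ind = TRUE)

# One row per non-zero cell: sample, taxon, abundance
long <- data.frame(sample = rownames(veg)[idx[, "row"]],
                   taxon = colnames(veg)[idx[, "col"]],
                   abundance = as.matrix(veg)[idx],
                   stringsAsFactors = FALSE)
long <- long[order(long$sample, long$taxon), ]
rownames(long) <- NULL
print(long)
```

Only four rows come out of the nine cells here, which is why this format stays manageable with 5000+ mostly absent species.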
Just to encourage you a little, you really should try the Ecology
packages in R. See packages vegan, ade4, and labdsv, for example, and
take a look at
http://ecology.msu.montana.edu/labdsv/R
Dave R.
*********************************************************************
dematrify <- function (df,filename=NULL,sep=",")
{
    # row/column positions of all non-zero (positive) abundances
    tmp <- which(df>0,arr.ind=TRUE)
    samples <- row.names(df)[tmp[,1]]
    taxon <- names(df)[tmp[,2]]
    # matrix indexing extracts all abundances at once (no loop needed)
    abund <- as.matrix(df)[tmp]
    if (is.null(filename)) {
        # no file requested: return the triples sorted by sample
        tmp2 <- cbind(samples,taxon,abund)
        tmp2 <- data.frame(tmp2[order(tmp2[,1]),,drop=FALSE])
        return(tmp2)
    }
    else {
        # one "sample,taxon,abundance" line per record, sorted
        stack <- sort(paste(samples,taxon,abund,sep=sep))
        # writeLines avoids the leading blank that cat() puts between elements
        writeLines(stack,filename)
    }
}
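If the full transposed matrix is still wanted rather than the database format, one possible route is to read the spreadsheet export without a header and write it back without row or column names. This is only a sketch, not tested against PC-ORD itself. The temporary file names and the toy contents (copied from the small example in the quoted message) are invented for illustration:

```r
# Toy species-as-rows export in the spreadsheet layout from the question
infile <- tempfile(fileext = ".csv")
writeLines(c(",,,,site1,site2,site3",
             "3,3,Q,sp1,1,0,0",
             "sites,species,Q,sp2,0,1,3",
             ",,Q,sp3,0,2,0"), infile)

# header = FALSE stops R from inventing names such as X, X.1, X.2;
# colClasses = "character" keeps every cell exactly as typed
raw <- read.csv(infile, header = FALSE, colClasses = "character")

# Transpose, then write with neither row names nor column names
outfile <- tempfile(fileext = ".csv")
write.table(t(as.matrix(raw)), file = outfile, sep = ",", quote = FALSE,
            row.names = FALSE, col.names = FALSE, na = "")

pcord_lines <- readLines(outfile)
print(pcord_lines)
```

The two write.table() arguments row.names = FALSE and col.names = FALSE are what suppress the extra first row and first column seen in the write.csv() output.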
Daniel Gruner wrote:
> Hello:
>
> I need to take a species-sample matrix and transpose it to the format
> used by PC-ORD for analysis. Unfortunately, the number of species is
> very large (>5000), and so this operation cannot be performed simply
> in an application like Excel, which has a 255 column limit. So, I
> wrote relatively simple code in R that I hoped would do this
> (appended below). But there are glitches.
>
> The format needed for PC-ORD (where "NA" shows an empty cell):
>
> NA,3,sites,NA
> NA,3,species,NA
> NA,Q,Q,Q
> NA,sp1,sp2,sp3
> site1,1,0,0
> site2,0,1,2
> site3,0,3,0
>
> 2 cells in first row indicate number of samples (rows), the second
> row indicates number of species (columns), the third row indicates
> variable type (Q = quantitative), and the fourth row shows column
> headers (species names). So, one can create a transposable matrix in
> a spreadsheet where 5000+ species are the rows:
>
> NA,NA,NA,NA,site1,site2,site3
> 3,3,Q,sp1,1,0,0
> sites,species,Q,sp2,0,1,3
> NA,NA,Q,sp3,0,2,0
>
>
> It is important that the data file written out is totally clean and
> ready to go for PC-ORD, because I cannot open and edit it in a
> spreadsheet. However, the code performs the transpose operation and
> writes the file, but now the former row IDs are the first row in the
> new file (NA,1,2,3), and the 4 leading spaces are "X, X.1, X.2,
> X.3". I'd like to delete the first row and delete the first 4 values
> of column1, without deleting the column.
>
> NA,1,2,3
> X,3,islands,NA
> X.1,3,species,NA
> X.2,Q,Q,Q
> X.3,sp1,sp2,sp3
> site1,1,0,0
> site2,0,1,2
> site3,0,3,0
>
> I have tried various tricks that I will not list/belabor here
> (various col.names, row.names, header, Extract, etc commands). Any
> further hints on code that will either stop R from adding these, or
> strip them at the end?
>
> (PS, yes, I can learn how to do my multivariate analyses in R and skip
> PC-ORD, but I am time limited on this one, and it seems that this
> code could be very useful in numerous ways)
>
> Many thanks for the help,
> Dan Gruner
> (Windows XP, R vers2.2)
>
>
>
> ##transpose datasets to convert to PC-ORD format
>
> data<-read.csv("data.csv", header=TRUE, as.is=T,
> strip.white=T, na.strings="NA")
> data<-as.matrix(data)
> data.trans <- t(data)
> write.csv(data.trans, file = "datatransp.csv",
> quote = F, na = "")
>
>
>
> *******************************
>
> Daniel S. Gruner, Postdoctoral Scholar
> Bodega Marine Lab, University of California -- Davis
> PO Box 247, 2099 Westside Rd
> Bodega Bay, CA 94923-0247
> (o) 707.875.2022 (f) 707.875.2009 (m) 707.338.5722
> email: dsgruner_at_ucdavis.edu
> http://www.bml.ucdavis.edu/facresearch/gruner.html
> http://www.hawaii.edu/ant/
>
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
David W. Roberts office 406-994-4548
Professor and Head FAX 406-994-3190
Department of Ecology email droberts at montana.edu
Montana State University
Bozeman, MT 59717-3460