[R] grouping data by a portion of the row name
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Fri Sep 14 00:47:46 CEST 2007
Bricklemyer, Ross S wrote:
> I am attempting to write a routine where I can run PAM (partitioning around medoids) on a dataset containing multiple soil cores and PCA spectral data from several depths per core. I want to run PAM on each individual core, so I need to group the data by core to run the analysis. Below is an example of my data structure:
>
> Lab.id PC1 PC2 PC3
> MAT057.2.5 2.438454966 -1.011182986 -3.040881377
> MAT057.7.5 10.69120648 4.767694892 -1.719466898
> MAT057.12.5 8.215852171 4.645793327 0.974020242
> MAT057.17.5 10.00422215 3.516213164 2.586742695
> MAT057.22.5 18.49165113 5.143031557 0.472636009
> MAT057.27.5 18.31255522 4.255319595 0.802902692
> MAT057.35 11.75818601 -0.325388031 3.445673092
> MAT057.45 6.043984786 -3.297325975 3.075221644
>
> The MAT057 is the core code and the values following the period refer to the sampling depths. There are many cores in the dataset and I want to automate the analysis so that it will grab data with the same core code and run PAM. Any ideas on what the R code would look like for that?
>
> Ross
>
Looks like these aren't really row names but a variable called Lab.id.
Look into things like
sub("\\..*$", "", Lab.id)
or maybe
sapply(strsplit(Lab.id, "\\."), "[[", 1)
(if Lab.id is a factor, you first need to convert it with as.character
in the 2nd version)
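
To show how the extracted core code could drive the per-core analysis, here is a minimal sketch. It assumes the data sit in a data frame (called dat here, a name not in the original post), that pam() comes from the cluster package, and that k = 2 clusters are wanted per core; the post does not say how many clusters to use, so k is purely a placeholder. The rows are the ones quoted above, rounded to two decimals.

```r
library(cluster)  # provides pam(); a recommended package shipped with R

# Rows from the original post, rounded to two decimals for brevity
dat <- data.frame(
  Lab.id = c("MAT057.2.5", "MAT057.7.5", "MAT057.12.5", "MAT057.17.5",
             "MAT057.22.5", "MAT057.27.5", "MAT057.35", "MAT057.45"),
  PC1 = c(2.44, 10.69, 8.22, 10.00, 18.49, 18.31, 11.76, 6.04),
  PC2 = c(-1.01, 4.77, 4.65, 3.52, 5.14, 4.26, -0.33, -3.30),
  PC3 = c(-3.04, -1.72, 0.97, 2.59, 0.47, 0.80, 3.45, 3.08),
  stringsAsFactors = FALSE
)

# Core code = everything before the first period in Lab.id
# (as.character() guards against Lab.id being a factor)
dat$core <- sub("\\..*$", "", as.character(dat$Lab.id))

# One data frame of PC scores per core
by_core <- split(dat[, c("PC1", "PC2", "PC3")], dat$core)

# Assumption: k = 2 clusters per core; choose k to suit your data
k <- 2
fits <- lapply(by_core, function(d) pam(d, k = k))

# Cluster membership for one core
fits[["MAT057"]]$clustering
```

Since the posted sample contains a single core, fits has only one element here; with the full dataset there would be one pam fit per core code. Note that pam() needs more observations than clusters, so cores with very few depths would have to be skipped or given a smaller k.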
> *******************************************************************
> Ross Bricklemyer
> Dept. of Crop and Soil Sciences
> Washington State University
> 291D Johnson Hall
> PO Box 646420
> Pullman, WA 99164-6420
> Work: 509.335.3661
> Cell/Home: 406.570.8576
> Fax: 509.335.8674
> Email: rsb at wsu.edu
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907