[R-sig-eco] Community distance matrix deconstruction

Sat Dec 13 17:24:39 CET 2014

Thanks Jari. As usual, you made this happen in a few short lines of code.

As for your second bit of advice (which made me laugh out loud), I will seriously consider whether this is a case of mauvaise foi. You may be right.

Cheers,
Kate

Kate S. Boersma, Ph.D.
kateboersma at gmail.com
http://people.oregonstate.edu/~boersmak/

Department of Biology
University of San Diego
5998 Alcala Park
San Diego, CA 92110

> On Dec 12, 2014, at 10:10 PM, Jari Oksanen <jari.oksanen at oulu.fi> wrote:
> 
> Kate,
> 
> Your question may need some clarification, but as I read it you want the row and column indices for each of your dissimilarities, plus a flag telling whether each dissimilarity is within or between groups. If this is what you want, it is an easy task.
> 
> In the following I use a real data set from vegan to make this task a bit more general:
> 
> library(vegan)
> data(mite, mite.env)
> ## dissimilarities
> d <- dist(mite)
> ## row and column indices
> row <- as.dist(row(as.matrix(d)))
> col <- as.dist(col(as.matrix(d)))
> ## within same class: 1 = within, 0 = between
> within <- with(mite.env, as.dist(outer(Shrub, Shrub, "==")))
> ## data frame -- the pedestrian way: snappier alternatives possible
> df <- data.frame(row = as.vector(row), col = as.vector(col),
>                  within = as.vector(within), dist = as.vector(d))
> ## see it
> tail(df)
> #     row col within      dist
> #2410  68  67      1 691.69502
> #2411  69  67      0 716.93863
> #2412  70  67      0 700.60973
> #2413  69  68      0  78.08329
> #2414  70  68      0  24.24871
> #2415  70  69      1  67.86015
> 
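Jari's comment above notes that snappier alternatives to the pedestrian data frame are possible. One possible sketch, in base R only, wraps the same idea in a small function. The name dist_to_long and its arguments are hypothetical (they are not part of vegan), and the toy data are the 14-community example from the original question rather than the mite data. As in Jari's code, 1 = within and 0 = between.

```r
## Hypothetical helper (not a vegan function): one row per pair of sites.
## 'd' is a "dist" object, 'group' a vector with one group label per site.
dist_to_long <- function(d, group) {
  m   <- as.matrix(d)
  ## positions below the diagonal, in the same order as as.vector(d)
  idx <- which(lower.tri(m), arr.ind = TRUE)
  i   <- unname(idx[, "row"])
  j   <- unname(idx[, "col"])
  data.frame(
    row    = i,
    col    = j,
    within = as.integer(group[i] == group[j]),  # 1 = within, 0 = between
    dist   = m[idx],
    row.names = NULL
  )
}

## Toy data from the original question: 7 communities per treatment
x    <- data.frame(cbind(1:14, 18:5, 3:16, 14:1, 16:3))
trt  <- rep(c("T1", "T2"), each = 7)
long <- dist_to_long(dist(x), trt)
tail(long)
```

Because the indices come from lower.tri(), the rows line up with as.vector() of the dist object, so the same function should work unchanged for 100 communities.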
> I don't think you really want to have this: you only believe that you want to have this (mauvaise foi, like they used to say).
> 
> If you only want summaries, see the function meandist() in vegan.
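If summaries are the goal, the long table can be skipped altogether. In vegan, meandist() takes the dissimilarities and a grouping factor and reports mean within-group and between-group dissimilarities. The base R sketch below computes simple pairwise means of the same kind by hand, so that it runs without vegan; it is not a reimplementation of meandist(). The toy data and the labels T1/T2 are assumptions taken from the example in the original question.

```r
## Toy data from the original question: 7 communities per treatment
x   <- data.frame(cbind(1:14, 18:5, 3:16, 14:1, 16:3))
trt <- factor(rep(c("T1", "T2"), each = 7))

m  <- as.matrix(dist(x))
lt <- lower.tri(m)            # each pair once, diagonal excluded

## group membership of the two sites in every pair
g1 <- trt[row(m)[lt]]
g2 <- trt[col(m)[lt]]

## label each pair, then average the distances per label
pair_type <- ifelse(g1 == g2, paste("within", g1), "between")
res <- tapply(m[lt], pair_type, mean)
res
```

The result has one mean for the between-treatment pairs and one for each treatment's within-treatment pairs.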
> 
> Cheers, Jari Oksanen
> 
> 
> 
>> On 13/12/2014, at 02:17 AM, Kate Boersma wrote:
>> 
>> Hi all.
>> 
>> I have a community analysis data manipulation puzzle for you... hopefully someone can help. Please let me know if this question needs clarification, has previously been answered, or would be better sent to a different list.
>> 
>> Details follow.
>> 
>> Thank you,
>> 
>> Kate
>> 
>> ---
>> 
>> Here is a simplified version of my problem:
>> 
>> I ran a community manipulation experiment with 7 reps of 2 treatments, for a total of 14 communities. Communities 1-7 are in Treatment 1 and 8-14 are in Treatment 2. I identified 5 taxa in the 14 communities and calculated a community dissimilarity matrix (14*14). Now I would like to decompose the distance matrix into a dataframe with the following column headings: community1s, community2s, withinORbetweenTRT, and distance. “Within or between treatment” indicates whether the distance is between two communities within the same treatment (0) or between the two treatments (1).
>> 
>> I did it by hand below to demonstrate, but my actual dataset has 100 communities so I need to figure out how to automate it...
>> 
>> df<-data.frame(cbind(1:14, 18:5, 3:16, 14:1, 16:3)) #random values
>> 
>> dist<-dist(df)
>> 
>> distance<-as.vector(dist)
>> 
>> community1s<-c(1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3,
>> 
>> 4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,7,7,7,7,7,7,7,
>> 
>> 8,8,8,8,8,8,9,9,9,9,9,10,10,10,10,11,11,11,12,12,13)
>> 
>> community2s<-c(2,3,4,5,6,7,8,9,10,11,12,13,14,3,4,5,6,7,8,9,10,11,12,13,14,
>> 
>> 4,5,6,7,8,9,10,11,12,13,14,5,6,7,8,9,10,11,12,13,14,
>> 
>> 6,7,8,9,10,11,12,13,14,7,8,9,10,11,12,13,14,
>> 
>> 8,9,10,11,12,13,14,9,10,11,12,13,14,10,11,12,13,14,
>> 
>> 11,12,13,14,12,13,14,13,14,14)
>> 
>> #now I need a column for whether or not the comparison is within treatment or
>> 
>> #between treatments. I ordered the sites by treatment so sites 1-7 are in treatment1
>> 
>> #and 8-14 are in treatment2. 0 is within and 1 is between.
>> 
>> withinORbetweenTRT<-c(0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,
>> 
>> 1,1,1,1,1,1,1,0,0,0,1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,0,
>> 
>> 1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,
>> 
>> 0,0)
>> 
>> #now I can assemble the dataframe (data.frame() rather than cbind(), which returns a matrix):
>> 
>> final.df<-data.frame(community1s, community2s, withinORbetweenTRT, distance)
>> 
>> final.df
>> 
>> I would appreciate any ideas!
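The hand-typed vectors above can also be generated rather than typed, which matters at 100 communities because long runs of 0s and 1s are easy to miscount. The following is a minimal base R sketch under the assumptions stated in the question (communities 1-7 in treatment 1, 8-14 in treatment 2; 0 = within, 1 = between). The objects n, trt, pairs and x are names introduced here for illustration.

```r
## Toy setup from the question: 14 communities, 7 per treatment
n   <- 14
trt <- rep(1:2, each = 7)            # treatment of communities 1..14

## All unordered pairs, in the same order as as.vector(dist(x))
pairs <- combn(n, 2)                 # 2 x 91 matrix
community1s <- pairs[1, ]
community2s <- pairs[2, ]

## 0 = same treatment (within), 1 = different treatments (between)
withinORbetweenTRT <- as.integer(trt[community1s] != trt[community2s])

## Distances from the toy data in the question
x <- data.frame(cbind(1:14, 18:5, 3:16, 14:1, 16:3))
distance <- as.vector(dist(x))

final.df <- data.frame(community1s, community2s, withinORbetweenTRT, distance)
head(final.df)
```

Comparing a hand-typed vector against the generated one, for example with all(hand_typed == withinORbetweenTRT) where hand_typed stands for the typed vector, is a cheap consistency check.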
>> 
>> -- 
>> Kate Boersma, PhD
>> Department of Biology
>> University of San Diego
>> 5998 Alcala Park
>> San Diego CA 92110
>> kateboersma at gmail.com
>> http://www.oregonstate.edu/~boersmak/
>> 
>> _______________________________________________
>> R-sig-ecology mailing list
>> R-sig-ecology at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
> 
