[R] indicator value in labdsv

Mon Sep 19 18:38:35 CEST 2005

Wow!  That was fast!

     Unfortunately, Agnieszka, I don't think you will find an objective 
criterion for this.  Clearly, species which do not have a statistically 
significant value are probably less useful, but of the many that are 
significant, many may be marginal.

     Without knowing fully what you are hoping to achieve, I think I 
would rank the species by indicator value, and establish the highest 
threshold for indicator value that gives you a suitable number of 
species for each type.  That way, if you are looking to write a field 
key, for example, you would, I suspect, have enough indicator species 
to identify every type.
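
The ranking idea above can be sketched in a few lines. This is only an
illustration in Python, not labdsv code: the record layout (species, type,
indval, p-value), the function names, the alpha cutoff, and the example
species are all assumptions made for the sketch.

```python
def highest_common_threshold(records, min_per_type, alpha=0.05):
    """Return the highest indval cutoff that still leaves every type with
    at least `min_per_type` significant species, or None if some type
    cannot reach that number.

    records: iterable of (species, type_label, indval, p_value) tuples
             (a hypothetical layout chosen for this sketch).
    """
    by_type = {}
    for _species, type_label, indval, p_value in records:
        values = by_type.setdefault(type_label, [])
        if p_value <= alpha:
            values.append(indval)

    cutoffs = []
    for values in by_type.values():
        if len(values) < min_per_type:
            return None
        values.sort(reverse=True)
        # The k-th largest value is the highest cutoff keeping k species.
        cutoffs.append(values[min_per_type - 1])
    return min(cutoffs) if cutoffs else None


def select_indicators(records, threshold, alpha=0.05):
    """Keep significant species at or above the cutoff, ranked within type."""
    kept = [r for r in records if r[3] <= alpha and r[2] >= threshold]
    return sorted(kept, key=lambda r: (r[1], -r[2]))


# Made-up example values, purely for illustration.
records = [
    ("sp_a", "A", 0.80, 0.01),
    ("sp_b", "A", 0.55, 0.02),
    ("sp_c", "A", 0.30, 0.20),
    ("sp_d", "B", 0.60, 0.01),
    ("sp_e", "B", 0.42, 0.03),
    ("sp_f", "B", 0.35, 0.04),
]
threshold = highest_common_threshold(records, min_per_type=2)
indicators = select_indicators(records, threshold)
```

The cutoff comes out of the data and the number of species wanted per type,
so it stays a practical choice rather than an objective criterion.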

Good luck, Dave

astrzelczak at ps.pl wrote:
> Hello,
> 
> I was unclear before; I'm sorry about that. I forgot to add that I'm using duleg...
> 
> I used mvpart for multivariate regression trees. My input variables are
> environmental parameters, output variables are macrophyte species
> (presence=1, absence=0 in consecutive cases=lakes). For the resulting classes I used
> duleg to find indicator species for every class. I checked the article Dufrene,
> M. and Legendre, P. 1997. Species assemblages and indicator species: the need
> for a flexible asymmetrical approach. Ecol. Monogr. 67(3):345-366. The authors
> used the threshold of indval=0.25 (25%) and that's the only hint I've found in
> the literature. This threshold seems reasonable, but I still have the impression
> that it's too low...
> 
> best regards
> Agnieszka
> 
> 
> 
>>Agnieszka,
> 
> 
>>     As Jari indicated, it depends on which function you meant in your
>>inquiry.  The duleg() function implements the Dufrene-Legendre
>>algorithm, where "indicator" species are indicative of a priori
>>communities.  It thus requires a classification, and is biased to find
>>species which occur in the dataset approximately as often as the mean
>>cluster size.
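
For readers unfamiliar with the Dufrene-Legendre measure, here is a minimal
sketch of the published formula: indicator value = specificity (the
cluster's share of the species' summed mean abundances) times fidelity (the
fraction of the cluster's sites where the species occurs). It is written in
Python for illustration and is not the duleg() source; the function name and
data layout are assumptions, and the permutation test for P values is left
out.

```python
def indval_table(site_by_species, clusters):
    """Dufrene-Legendre indicator values on a 0..1 scale.

    site_by_species: list of rows, one per site, each a list of
                     abundances (or 0/1 presences), one per species.
    clusters:        list of cluster labels, one per site.
    Returns {cluster_label: [indval for each species]}.
    """
    labels = sorted(set(clusters))
    n_species = len(site_by_species[0])
    sizes = {c: clusters.count(c) for c in labels}

    mean_abund = {c: [0.0] * n_species for c in labels}
    fidelity = {c: [0.0] * n_species for c in labels}
    for row, c in zip(site_by_species, clusters):
        for i, x in enumerate(row):
            mean_abund[c][i] += x / sizes[c]
            if x > 0:
                fidelity[c][i] += 1 / sizes[c]

    result = {c: [] for c in labels}
    for i in range(n_species):
        total = sum(mean_abund[c][i] for c in labels)
        for c in labels:
            specificity = mean_abund[c][i] / total if total > 0 else 0.0
            result[c].append(specificity * fidelity[c][i])
    return result


# Made-up presence/absence data: 4 sites, 3 species, 2 clusters.
sites = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
groups = ["A", "A", "B", "B"]
values = indval_table(sites, groups)
```

In this toy data the species found only in cluster A, at every A site,
scores 1.0 for A, while the species present at every site scores only 0.5
in each cluster. That illustrates how very widespread species are penalized.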
> 
> 
>>     The indspc() function calculates the mean similarity of all samples
>>a species occurs in.  This is slightly biased because we know that the
>>samples being used to calculate the mean share at least the species that
>>defines them, but it is still possible to compare those values to the
>>mean similarity of the whole matrix, or to an expectation of maximum
>>similarity.  Obviously, as species occur more frequently, the harder it
>>is to have a really high similarity (indicator value), with the extreme
>>case that a species that occurs in every sample must have the same value
>>as the mean of the whole matrix.
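
The mean-similarity idea can be illustrated as follows. This Python sketch
is not the indspc() implementation: it assumes presence/absence data and
Jaccard similarity, and the function names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two presence/absence rows."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return shared / either if either else 0.0


def mean_pairwise_similarity(rows):
    """Mean similarity over all pairs of rows; None with fewer than 2 rows."""
    pairs = list(combinations(rows, 2))
    if not pairs:
        return None
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def species_mean_similarity(site_by_species, species_index):
    """Mean similarity among the samples in which one species occurs."""
    occupied = [row for row in site_by_species if row[species_index]]
    return mean_pairwise_similarity(occupied)


# Made-up data: 3 samples, 4 species.
samples = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
]
overall = mean_pairwise_similarity(samples)
everywhere = species_mean_similarity(samples, 0)   # occurs in every sample
restricted = species_mean_similarity(samples, 1)   # occurs in two samples
```

Here the species present in every sample gets exactly the whole-matrix mean,
and the more restricted species gets a higher value, which matches the
frequency dependence described above.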
> 
> 
>>     To tell the truth, I forgot that indspc() was included in the
>>current version of labdsv.  In the new version (due to be released any
>>day), I have included a permutation test that estimates quantiles of
>>expected values for different numbers of occurrences.  It works, but is
>>pretty slow.  Jari has created a version that uses parametric statistics
>>to estimate the same envelope, but I haven't had a chance to try it yet.
> 
> 
>>     What research are you doing, and what are you really trying to
>>determine?  Perhaps something altogether different will work better.
> 
> 
>>Thanks, Dave Roberts
> 
> 
>>>On Mon, 2005-09-19 at 09:41 +0200, astrzelczak at ps.pl wrote:
>>>
>>>
>>>>Hi,
>>>>
>>>>I'm trying to find out what threshold of indicator value in labdsv should be
>>>>used to accept a species as an indicator. So far I have assumed that indval=0.5
>>>>is high enough to avoid any mistakes, but that was based only on my intuition.
>>>>
>>>>I'd be grateful for any advice.
>>>>
>>>>best regards
>>>>
>>>
>>>
>>>Agnieszka,
>>>
>>>R mailing list software appends the following to your message:
>>>
>>>
>>>
>>>>PLEASE do read the posting guide!
>>>>http://www.R-project.org/posting-guide.html
>>>
>>>
>>>Then about indicator value analysis. You should be more specific: there
>>>seem to be three alternative functions for "indicator species" in
>>>labdsv. Which did you mean? At least two of these return an item called
>>>"indval", and these two alternative "indvals" are very different. For
>>>the Dufrêne-Legendre indvals, you should check the original paper (see
>>>references in the help page), and there you even have an associated "P
>>>value". In indspc, the variance of the indval clearly is dependent on
>>>species frequency. Moreover, in indspc the expected indval (and its
>>>variance) are dependent on the whole set of sites you have: these
>>>reflect the general "homogeneity" of your data set. Therefore you cannot
>>>say there that any certain value would mean that a species is a good
>>>indicator. However, it would be easy to work out standard errors for
>>>indspc indvals.
>>>
>>>I think it would be more useful to post to some other mailing group
>>>where people are more concerned about indicator species, or to contact
>>>the package author directly (I CC this message to him).
>>>
>>>cheers, jari oksanen
> 
> 
> 
> 
> 
> 
> --
> Best regards,
> 
> Agnieszka Strzelczak, Research Assistant
> mailto:astrzelczak at ps.pl
> 
> Institute of Chemistry and Environmental Protection
> Faculty of Chemical Engineering
> Szczecin University of Technology
> Aleja Piastow 42
> 71-065 Szczecin
> Poland
> 
> 

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
David W. Roberts                                     office 406-994-4548
Professor and Head                                      FAX 406-994-3190
Department of Ecology                         email droberts at montana.edu
Montana State University
Bozeman, MT 59717-3460