[R] understanding patterns in categorical vs. continuous data

Thu Jan 26 20:48:55 CET 2006

You might prefer boxplot(insolation ~ veg_type) as a graphic.  That will 
show the median and quartiles of insolation for each vegetation type.  To 
get the actual numeric values you could use

for (i in levels(veg_type)) {
    print(i)
    # results are not auto-printed inside a for loop, so print() explicitly
    print(quantile(insolation[veg_type == i]))
}

See ?quantile for more help.
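
If a single table is easier to read than printed output per group, the 
same quantiles can be collected with split() and sapply().  This is only a 
sketch: the data below are simulated, and the three vegetation type names 
are made up for illustration.  Only the variable names veg_type and 
insolation come from the original post.

```r
# Simulated stand-in data (hypothetical values and vegetation type names)
set.seed(1)
veg_type <- factor(rep(c("chaparral", "grassland", "oak_woodland"), each = 50))
insolation <- rnorm(150, mean = 100, sd = 15)

# split() breaks insolation into one vector per vegetation type;
# sapply() applies quantile() to each and binds the results into a matrix
# with one column per type.  t() flips it to one row per type.
q_table <- t(sapply(split(insolation, veg_type), quantile))

print(q_table)
```

Each row of q_table is a vegetation type and the columns are the default 
quantile() probabilities (0%, 25%, 50%, 75%, 100%).  Other probabilities 
can be requested by passing probs = ... through sapply().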

Dylan Beaudette wrote:
> Greetings,
> 
> I have a set of bivariate data: one variable (vegetation type) which is 
> categorical, and one (computed annual insolation) which is continuous. 
> Plotting veg_type ~ insolation produces a nice overview of the patterns that 
> I can see in the source data. However, due to the large number of samples 
> (1,000), and the apparent "spread" in the distribution of a single vegetation 
> type over a range of insolation values, I am having a hard time quantitatively 
> describing the relationship between the two variables. 
> 
> Here is a link to a sample graph:
> http://casoilresource.lawr.ucdavis.edu/drupal/node/162
> 
> Since the data along each vegetation type "line" is not a distribution in the 
> traditional sense, I am having problems applying descriptive statistical 
> methods. Conceptually, I would like to somehow describe the variation with 
> insolation, along each vegetation type "line".
> 
> Any guidance, or suggested reading material would be greatly appreciated.
> 
> 

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
David W. Roberts                                     office 406-994-4548
Professor and Head                                      FAX 406-994-3190
Department of Ecology                         email droberts at montana.edu
Montana State University
Bozeman, MT 59717-3460