[R] Descriptive Stats from Data Frame

Rich Shepard rshepard at appl-ecosys.com
Tue Aug 30 23:00:13 CEST 2011


   I can't find how to do what I need in Dalgaard or the 'R Cookbook', so
I'm asking here.

   I have a data frame with water chemistry data and I want to start
exploring these data. There are three factors (site, date, chemical)
associated with each measurement. The data frame looks like this:

> summary(chemdata)
                              site_id.sample_date.param.quant
  BC-0.5|1996-04-19|Arsenic|0.01              :    1
  BC-0.5|1996-04-19|Calcium|76.56             :    1
  BC-0.5|1996-04-19|Chloride|12               :    1
  BC-0.5|1996-04-19|Magnesium|43.23           :    1
  BC-0.5|1996-04-19|Sulfate|175               :    1
  BC-0.5|1996-04-19|Total Dissolved Solids|460:    1
  (Other)                                     :14880
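
   A note on the output above: the single column name
site_id.sample_date.param.quant and the pipe-joined levels suggest the file
was read as one factor column rather than four separate ones. A minimal
sketch of re-reading it, assuming the source file is pipe-delimited with a
header row. The inline text is a hypothetical stand-in for the real file,
built from three of the rows shown above; the column names are taken from
the header in the summary.

```r
# Hypothetical stand-in for the real file: pipe-delimited, with a header row.
txt <- "site_id|sample_date|param|quant
BC-0.5|1996-04-19|Arsenic|0.01
BC-0.5|1996-04-19|Calcium|76.56
BC-0.5|1996-04-19|Chloride|12"

# sep = "|" splits each line into four columns instead of one.
chemdata <- read.table(text = txt, header = TRUE, sep = "|",
                       stringsAsFactors = FALSE)

# Convert the ISO-formatted date strings to Date objects.
chemdata$sample_date <- as.Date(chemdata$sample_date)

# Inspect the resulting column types.
str(chemdata)
```

   With a real file, the text = txt argument would be replaced by the file
name. If str() then reports four columns with quant as numeric, the columns
can be addressed individually as chemdata$param and chemdata$quant.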

   First I want to calculate (and plot) descriptive stats by chemical,
ignoring site and date and telling R to skip missing values. (Incorporating
those factors will come later.) What I have not been able to figure out is
how to write the command to, for example, calculate the mean and sd for
Arsenic. My floundering and thrashing includes attempts like these:

> mean(chemdata.param="Arsenic")
Error in is.numeric(x) : 'x' is missing
> mean(chemdata.quant, param="Arsenic")
Error in mean(chemdata.quant, param = "Arsenic") :
   object 'chemdata.quant' not found
> mean(chemdata$quant, param="Arsenic")
[1] NA
Warning message:
In mean.default(chemdata$quant, param = "Arsenic") :
   argument is not numeric or logical: returning NA
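
   For reference, mean() has no argument for choosing rows; the usual
approach is to subset the vector with a logical condition before passing it
in. The last attempt most likely returns NA because chemdata has no column
named quant, so chemdata$quant is NULL. A minimal sketch, assuming param and
quant exist as separate columns (which the summary above suggests is not yet
the case). The small data frame and its values are made up for illustration.

```r
# Made-up example data; assumes 'param' and 'quant' are separate columns.
chemdata <- data.frame(
  param = c("Arsenic", "Arsenic", "Arsenic", "Calcium", "Calcium"),
  quant = c(0.01, NA, 0.03, 76.56, 80.00)
)

# Select the Arsenic measurements with a logical index.
arsenic <- chemdata$quant[chemdata$param == "Arsenic"]

# na.rm = TRUE tells mean() and sd() to drop missing values.
arsenic_mean <- mean(arsenic, na.rm = TRUE)
arsenic_sd   <- sd(arsenic, na.rm = TRUE)
print(c(mean = arsenic_mean, sd = arsenic_sd))

# The same statistics for every chemical at once, ignoring site and date.
chem_mean <- tapply(chemdata$quant, chemdata$param, mean, na.rm = TRUE)
chem_sd   <- tapply(chemdata$quant, chemdata$param, sd, na.rm = TRUE)
print(chem_mean)
print(chem_sd)
```

   For a first plot, boxplot(quant ~ param, data = chemdata) draws one box
per chemical and drops missing values on its own.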

   As a newcomer to R I've done a lot of reading, yet all the examples use
nicely structured data to illustrate the point being made. I need to work
with my data and learn how to specify columns and write correct commands for
the analyses I need. Please point me in the right direction.

Rich


