[R] Descriptive Stats from Data Frame

Simon Zehnder simon.zehnder at googlemail.com
Tue Aug 30 23:22:35 CEST 2011


Hi Rich,

I am not sure what you really want; it seems to me that you want to calculate the mean over all rows where the chemical is Arsenic?

But try this to get a little more insight:

mean(chemdata$quant[chemdata$param=="Arsenic"])

The expression chemdata$param=="Arsenic" is a logical vector, returning TRUE for every row in which the variable param takes the value "Arsenic". Used as a row index, as in chemdata[chemdata$param=="Arsenic",], it selects exactly those rows of the data frame. Try it in your R console to see it and understand the R concept!

If you now want to get all values of a certain column for the rows that have "Arsenic" as param, you just write:
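
Here is a small sketch of the difference between the logical vector and the subset it selects. The data frame is made up for illustration; only the column names param and quant are taken from your post, and is_arsenic and arsenic_rows are just names I picked.

```r
# Toy data; the values are invented for illustration only
chemdata <- data.frame(
  param = c("Arsenic", "Calcium", "Arsenic", "Sulfate"),
  quant = c(0.01, 76.56, 0.03, 175)
)

# Logical vector: one TRUE/FALSE per row of the data frame
is_arsenic <- chemdata$param == "Arsenic"
print(is_arsenic)

# Used as a row index, it keeps only the rows where it is TRUE
arsenic_rows <- chemdata[is_arsenic, ]
print(arsenic_rows)

# Mean of quant over those rows only
arsenic_mean <- mean(chemdata$quant[is_arsenic])
print(arsenic_mean)
```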

chemdata$COLUMNNAME[chemdata$param=="Arsenic"]
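
Since you asked for mean and sd while ignoring missing data, the following sketch may be closer to what you need. It assumes quant is a numeric column and param holds the chemical name; the data are invented. The na.rm = TRUE argument tells mean() and sd() to drop NA values, and tapply() is one of several ways to get the statistic for every chemical at once.

```r
# Toy data with one missing value; the values are invented for illustration
chemdata <- data.frame(
  param = c("Arsenic", "Arsenic", "Arsenic", "Calcium", "Calcium"),
  quant = c(0.01, NA, 0.03, 76.56, 80.00)
)

# All quant values for one chemical
arsenic <- chemdata$quant[chemdata$param == "Arsenic"]

# na.rm = TRUE drops the NA before computing
arsenic_mean <- mean(arsenic, na.rm = TRUE)
arsenic_sd <- sd(arsenic, na.rm = TRUE)

# The same statistics for every chemical in one call each
means_by_param <- tapply(chemdata$quant, chemdata$param, mean, na.rm = TRUE)
sds_by_param <- tapply(chemdata$quant, chemdata$param, sd, na.rm = TRUE)

print(means_by_param)
print(sds_by_param)
```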

I do not know if this helps, as it seems to me that Arsenic only occurs once in your frame...
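
This is only a guess from the summary() output quoted below: the frame seems to have a single column named site_id.sample_date.param.quant, with the four fields still joined by "|". If that is the case, the file was probably read with the wrong field separator, and chemdata$param and chemdata$quant do not exist yet. A sketch of reading such data with sep = "|" follows; the text is pasted inline here so the example runs on its own, whereas you would pass your own file name instead of text = raw.

```r
# A few lines copied from the summary() output, with an assumed header line
raw <- "site_id|sample_date|param|quant
BC-0.5|1996-04-19|Arsenic|0.01
BC-0.5|1996-04-19|Calcium|76.56
BC-0.5|1996-04-19|Chloride|12
BC-0.5|1996-04-19|Total Dissolved Solids|460"

# sep = "|" splits each line into four columns instead of one
chemdata <- read.table(text = raw, sep = "|", header = TRUE,
                       stringsAsFactors = FALSE)

# Check that the columns exist and quant is numeric
str(chemdata)

# Now the subsetting from above works
arsenic_mean <- mean(chemdata$quant[chemdata$param == "Arsenic"], na.rm = TRUE)
print(arsenic_mean)
```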

Good luck,
Simon



On Aug 30, 2011, at 11:00 PM, Rich Shepard wrote:

>  I don't find how to do what I need to do in Dalgaard or 'R Cookbook', so
> I'm asking here.
> 
>  I have a data frame with water chemistry data and I want to start
> exploring these data. There are three factors (site, date, chemical)
> associated with each measurement. The data frame looks like this:
> 
>> summary(chemdata)
>                             site_id.sample_date.param.quant
> BC-0.5|1996-04-19|Arsenic|0.01              :    1
> BC-0.5|1996-04-19|Calcium|76.56             :    1
> BC-0.5|1996-04-19|Chloride|12               :    1
> BC-0.5|1996-04-19|Magnesium|43.23           :    1
> BC-0.5|1996-04-19|Sulfate|175               :    1
> BC-0.5|1996-04-19|Total Dissolved Solids|460:    1
> (Other)                                     :14880
> 
>  I want first to calculate (and plot) descriptive stats by chemical,
> ignoring site and date and telling R to ignore missing data. (Incorporating
> those factors will occur later.) What I have not been able to figure out is
> how to specify the command to, for example, calculate mean and sd for
> Arsenic. My floundering and thrashing includes attempts like these:
> 
>> mean(chemdata.param="Arsenic")
> Error in is.numeric(x) : 'x' is missing
>> mean(chemdata.quant, param="Arsenic")
> Error in mean(chemdata.quant, param = "Arsenic") :
>  object 'chemdata.quant' not found
>> mean(chemdata$quant, param="Arsenic")
> [1] NA
> Warning message:
> In mean.default(chemdata$quant, param = "Arsenic") :
>  argument is not numeric or logical: returning NA
> 
>  As a newcomer to R I've done a lot of reading, yet all the examples use
> nicely structured data to illustrate the point being made. I need to work
> with my data and learn how to specify columns and write correct commands for
> the analyses I need. Please point me in the right direction.
> 
> Rich
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


