[R] How to let R repeat computations over a number of variables
David Winsemius
dwinsemius at comcast.net
Sat Mar 15 01:02:26 CET 2008
Uli Kleinwechter <ulikleinwechter at yahoo.com.mx> wrote in
news:47DADC55.2070803 at yahoo.com.mx:
> Hello,
>
> I have written a small script to read a dataset, compute some basic
> descriptives and write them to a file (see below). The variable
> "maizeseedcash" of which the statistics are calculated is contained
> in the data frame agr_inputs. My question is whether there is a way
> to make R compute the statistics not only for maizeseedcash but also
> for other variables in the dataset. I thought about a thing like a
> loop which repeats the computations according to a set of variables
> which I would like to be able to specify before. Is something like
> that possible and, if so, what would it look like?
>
> All hints are appreciated.
My first hint would be to look at ?summary or the describe function in
the Hmisc package. My second hint would be to refer to your R objects
by their correct names; in this case, say "data frame" rather than
"dataset".
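As a rough sketch of the first hint, summary() can be called on a subset of
columns directly. The data frame below is made up for illustration: only
maizeseedcash is a name from the original post, while fertcash, region and
all of the values are hypothetical. The describe() call is guarded because
Hmisc may not be installed.

```r
# Hypothetical stand-in for agr_inputs; only maizeseedcash comes from the post.
agr_inputs <- data.frame(
  maizeseedcash = c(10, 20, 30, 40, 50),
  fertcash      = c(5, 15, 25, 35, 45),
  region        = c("a", "b", "a", "b", "a")
)

# Variables to summarise, specified up front
vars <- c("maizeseedcash", "fertcash")

# summary() on a data frame returns a table with one column per variable
summ <- summary(agr_inputs[vars])
print(summ)

# Hmisc::describe() gives a more detailed report, if Hmisc is available
if (requireNamespace("Hmisc", quietly = TRUE)) {
  print(Hmisc::describe(agr_inputs[vars]))
}
```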
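If summary and describe do not satisfy, you could wrap your work
into a function, say func.summ, and apply it to each column with:
apply(agr_inputs, 2, func.summ)
Note that apply coerces the data frame to a matrix, so this assumes that
every column is numeric. To restrict the computation to chosen variables,
sapply(agr_inputs[vars], func.summ) works as well, where vars is a
character vector of column names.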
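A minimal sketch of such a function follows. The body of func.summ is an
assumption about what it might contain, based on the measures in the quoted
script. The data values and the fertcash column are hypothetical, and
na.rm = TRUE is an added choice that the original script does not make.
The sketch uses sapply() over a column subset rather than apply(), so that
non-numeric columns can simply be left out of the selection.

```r
# Summarise one numeric vector; returns a named numeric vector.
# na.rm = TRUE is an assumption; drop it if missing values should raise errors.
func.summ <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), na.rm = TRUE)
  m <- mean(x, na.rm = TRUE)
  s <- sd(x, na.rm = TRUE)
  c(Min = min(x, na.rm = TRUE),
    q[1],                        # keeps the name "25%"
    Median = unname(q[2]),
    Mean = m,
    q[3],                        # keeps the name "75%"
    Max = max(x, na.rm = TRUE),
    Var = var(x, na.rm = TRUE),
    SD = s,
    VarCoeff = s / m * 100)
}

# Hypothetical data; only the name maizeseedcash comes from the post.
agr_inputs <- data.frame(
  maizeseedcash = c(10, 20, 30, 40, 50),
  fertcash      = c(5, 15, 25, 35, 45)
)

# The set of variables to process, specified beforehand
vars <- c("maizeseedcash", "fertcash")

# One column of statistics per variable
solution <- sapply(agr_inputs[vars], func.summ)
print(solution)
```

Calling t(solution) would give one row per variable instead, if that layout
is easier to read.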
There are several areas where the code could be more compact. If you
let "probs" be a vector, you can get all of your quantiles at once:
> quantile(runif(100), probs=c(0.25, 0.5, 0.75))
      25%       50%       75%
0.2240003 0.4919313 0.7359661
The names get carried forward when appended in a vector. See:
> test <- c(1,2, quantile(runif(100), probs=c(0.25, 0.5, 0.75)), 4,5)
> test
                          25%       50%       75%
1.0000000 2.0000000 0.2228890 0.4978050 0.8440893 4.0000000 5.0000000
And you can reference named elements by name with named indexing:
> test["25%"]
      25%
0.2228890
Or use summary:
> summary(runif(100))
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
0.003962 0.215400 0.441800 0.474600 0.735100 0.997600
> summary(runif(100))["Mean"]
  Mean
0.4973
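Along the same lines, the named vector that summary() returns can be
extended with further measures. This is only a sketch, assuming a made-up
numeric vector x; unclass() is used here just to drop the summary class
before combining, which leaves a plain named numeric vector.

```r
# Hypothetical data for illustration
x <- c(10, 20, 30, 40, 50)

# Start from summary() and append measures it does not report
out <- c(unclass(summary(x)), Var = var(x), SD = sd(x))
print(out)

# Elements can then be picked out by name
print(out[c("Mean", "SD")])
```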
Best of luck;
David Winsemius
>
> ********
> sink("agr_inputs.txt", append=FALSE, type="output")
>
> agr_inputs<-read.csv2("agric_inputs.csv")
>
> attach(agr_inputs)
>
> min<-min(maizeseedcash)
>
> q25<-quantile(maizeseedcash, probs=.25)
>
> median<-quantile(maizeseedcash, probs=.50)
>
> mean<-mean(maizeseedcash)
>
> q75<-quantile(maizeseedcash, probs=.75)
>
> max<-max(maizeseedcash)
>
> var<-var(maizeseedcash)
>
> sd<-sd(maizeseedcash)
>
> varcoeff<-sd/mean*100
>
> Measure<-c("Min","25%", "Median", "Mean", "75%", "Max", "Var", "SD",
> "VarCoeff")
>
> maizeseedcas<-c( min, q25, median, mean, q75, max, var, sd,
> varcoeff)
>
> solution<-data.frame(Measure, maizeseedcas)
>
> print (solution)
>
> detach(agr_inputs)
>
> sink()
>
> ******
>