[R] Re: Data frame manipulation - newbie question

Petr PIKAL petr.pikal at precheza.cz
Thu Jan 3 12:16:45 CET 2008


Hi

r-help-bounces at r-project.org wrote on 03.01.2008 11:53:38:

> Hi all,
> 
> Could someone please explain how I can efficiently query a data frame
> with several factors, as shown below:
> 
> 
> ---------------------------------------------------------------------
> Data frame: pt.knn
> ---------------------------------------------------------------------
> row | k.idx | step.forwd | pt.num | model | prev | value | abs.error
>   1 |   200 |          0 |      1 | lm    |  9   |  10.5 |       1.5
>   2 |   200 |          0 |      2 | lm    | 11   |  10.5 |       1.5
>   3 |   201 |          1 |      1 | lm    | 10   |  12   |       2.0
>   4 |   201 |          1 |      2 | lm    | 12   |  12   |       2.0
>   5 |   202 |          2 |      1 | lm    | 12   |  12.1 |       0.1
>   6 |   202 |          2 |      2 | lm    | 12   |  12.1 |       0.1
>   7 |   200 |          0 |      1 | rlm   | 10.1 |  10.5 |       0.4
>   8 |   200 |          0 |      2 | rlm   | 10.3 |  10.5 |       0.2
>   9 |   201 |          1 |      1 | rlm   | 11.6 |  12   |       0.4
>  10 |   201 |          1 |      2 | rlm   | 11.4 |  12   |       0.6
>  11 |   202 |          2 |      1 | rlm   | 11.8 |  12.1 |       0.1
>  12 |   202 |          2 |      2 | rlm   | 11.9 |  12.1 |       0.2
> ---------------------------------------------------------------------
> 
> k.idx, step.forwd, pt.num and model columns are FACTORS.
> prev, value, abs.error are numeric
> 
> I need to take the mean of the numeric columns (prev, value and
> abs.error) for each combination of k.idx, step.forwd and model. So
> rows 1 and 2, 3 and 4, 5 and 6, 7 and 8, 9 and 10, 11 and 12 must be
> grouped together.

aggregate(pt.knn[, c("prev", "value", "abs.error")],
          by = pt.knn[, c("k.idx", "step.forwd", "model")], FUN = mean)
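
To make the aggregate() idea concrete, here is a minimal sketch. It assumes the data frame looks like the table in the question; the data.frame() construction and the name pt.mean are only illustrative, with values copied from the post.

```r
# Rebuild the example data from the question (values copied from the post).
pt.knn <- data.frame(
  k.idx      = factor(rep(rep(c(200, 201, 202), each = 2), times = 2)),
  step.forwd = factor(rep(rep(c(0, 1, 2), each = 2), times = 2)),
  pt.num     = factor(rep(1:2, times = 6)),
  model      = factor(rep(c("lm", "rlm"), each = 6)),
  prev       = c(9, 11, 10, 12, 12, 12, 10.1, 10.3, 11.6, 11.4, 11.8, 11.9),
  value      = rep(c(10.5, 10.5, 12, 12, 12.1, 12.1), times = 2),
  abs.error  = c(1.5, 1.5, 2.0, 2.0, 0.1, 0.1, 0.4, 0.2, 0.4, 0.6, 0.1, 0.2)
)

# Mean of every numeric column within each k.idx / step.forwd / model group.
# A data frame is a list, so the factor columns can be passed directly as 'by'.
pt.mean <- aggregate(
  pt.knn[, c("prev", "value", "abs.error")],
  by  = pt.knn[, c("k.idx", "step.forwd", "model")],
  FUN = mean
)
print(pt.mean)
```

The result has one row per group that actually occurs in the data (six here), so the two pt.num rows of each group are collapsed into their mean.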

> 
> Next, I need to plot a boxplot of the mean(abs.error) of each model
> for each k.idx.

Maybe

with(pt.knn, boxplot(split(abs.error, interaction(k.idx, model))))
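
As a rough sketch of the split() and interaction() approach, assuming the data from the question's table (only the columns needed for the plot are rebuilt here, and the names groups and bp are illustrative):

```r
# Only the columns needed for the plot, copied from the question's table.
pt.knn <- data.frame(
  k.idx     = factor(rep(rep(c(200, 201, 202), each = 2), times = 2)),
  model     = factor(rep(c("lm", "rlm"), each = 6)),
  abs.error = c(1.5, 1.5, 2.0, 2.0, 0.1, 0.1, 0.4, 0.2, 0.4, 0.6, 0.1, 0.2)
)

# One group per k.idx/model combination, named like "200.lm" or "201.rlm".
groups <- with(pt.knn, split(abs.error, interaction(k.idx, model)))

# Draw one box per group; las = 2 rotates the axis labels so they do not overlap.
bp <- boxplot(groups, las = 2, ylab = "abs.error")
```

With only two points per group in this small example the boxes are very narrow; with more pt.num values per group the plot becomes more informative.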

Regards
Petr


> I need to compare the abs.error of the two models for each step and
> the mean overall abs.error of each model. And so on.
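
For this kind of comparison, tapply() is one possible route. The sketch below assumes the data from the question's table; the names err.by.step and err.overall are illustrative only.

```r
# Columns needed for the comparison, copied from the question's table.
pt.knn <- data.frame(
  step.forwd = factor(rep(rep(c(0, 1, 2), each = 2), times = 2)),
  model      = factor(rep(c("lm", "rlm"), each = 6)),
  abs.error  = c(1.5, 1.5, 2.0, 2.0, 0.1, 0.1, 0.4, 0.2, 0.4, 0.6, 0.1, 0.2)
)

# Mean abs.error per step (rows) and model (columns), side by side.
err.by.step <- with(pt.knn, tapply(abs.error, list(step.forwd, model), mean))
print(err.by.step)

# Overall mean abs.error of each model.
err.overall <- with(pt.knn, tapply(abs.error, model, mean))
print(err.overall)
```

The first result is a matrix with one row per step and one column per model, which makes the two models easy to compare step by step.
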
> 
> I read the manuals, but the examples there are too simple. I know how
> to do this manipulation in a "brute force" manner, but I wish to learn
> how to work the right way with R.
> 
> Could someone help me?
> Thanks in advance.
> 
> José Augusto
> Undergraduate student
> University of São Paulo
> Business Administration Faculty
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



