# [R] Data frame manipulation - newbie question

Rense Nieuwenhuis rense.nieuwenhuis at gmail.com
Sun Jan 6 16:50:17 CET 2008

```
Hi,

You may want to use the apply / tapply functions. Some find them a bit
hard to grasp at first, but they will help you in many situations once
you get the hang of them.

Maybe you can get some information on my site:
http://www.rensenieuwenhuis.nl/r-project/manual/basics/tables/
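
# A minimal sketch of the tapply / aggregate idea, not a definitive
# solution. It assumes the data frame is called pt.knn, as in the
# question quoted below, and rebuilds it by hand from the posted table.
# The names means, overall and by.step are only illustrative.
pt.knn <- data.frame(
  k.idx      = factor(rep(rep(c(200, 201, 202), each = 2), times = 2)),
  step.forwd = factor(rep(rep(c(0, 1, 2), each = 2), times = 2)),
  pt.num     = factor(rep(1:2, times = 6)),
  model      = factor(rep(c("lm", "rlm"), each = 6)),
  prev       = c(9, 11, 10, 12, 12, 12, 10.1, 10.3, 11.6, 11.4, 11.8, 11.9),
  value      = rep(rep(c(10.5, 12, 12.1), each = 2), times = 2),
  abs.error  = c(1.5, 1.5, 2.0, 2.0, 0.1, 0.1, 0.4, 0.2, 0.4, 0.6, 0.1, 0.2)
)

# Mean of each numeric column per k.idx / step.forwd / model group;
# only combinations that occur in the data are returned.
means <- aggregate(cbind(prev, value, abs.error) ~ k.idx + step.forwd + model,
                   data = pt.knn, FUN = mean)

# Overall mean abs.error per model, and per step within each model.
overall <- tapply(pt.knn$abs.error, pt.knn$model, mean)
by.step <- tapply(pt.knn$abs.error,
                  list(pt.knn$step.forwd, pt.knn$model), mean)

# Boxplot of the group means of abs.error, one box per model.
boxplot(abs.error ~ model, data = means)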

Hope this helps,

Rense Nieuwenhuis

On Jan 3, 2008, at 11:53 , José Augusto M. de Andrade Junior wrote:

> Hi all,
>
> Could someone please explain how I can efficiently query a data frame
> with several factors, as shown below:
>
> ------------------------------------------------------------------------
> Data frame: pt.knn
> ------------------------------------------------------------------------
> row | k.idx | step.forwd | pt.num | model | prev | value | abs.error
>   1 |   200 |          0 |      1 | lm    |   09 |  10.5 |       1.5
>   2 |   200 |          0 |      2 | lm    |   11 |  10.5 |       1.5
>   3 |   201 |          1 |      1 | lm    |   10 |  12   |       2.0
>   4 |   201 |          1 |      2 | lm    |   12 |  12   |       2.0
>   5 |   202 |          2 |      1 | lm    |   12 |  12.1 |       0.1
>   6 |   202 |          2 |      2 | lm    |   12 |  12.1 |       0.1
>   7 |   200 |          0 |      1 | rlm   | 10.1 |  10.5 |       0.4
>   8 |   200 |          0 |      2 | rlm   | 10.3 |  10.5 |       0.2
>   9 |   201 |          1 |      1 | rlm   | 11.6 |  12   |       0.4
>  10 |   201 |          1 |      2 | rlm   | 11.4 |  12   |       0.6
>  11 |   202 |          2 |      1 | rlm   | 11.8 |  12.1 |       0.1
>  12 |   202 |          2 |      2 | rlm   | 11.9 |  12.1 |       0.2
> ------------------------------------------------------------------------
>
> k.idx, step.forwd, pt.num and model columns are FACTORS.
> prev, value, abs.error are numeric
>
> I need to take the mean of the numeric columns (prev, value and
> abs.error) for each combination of k.idx, step.forwd and model. So
> rows 1 and 2, 3 and 4, 5 and 6, 7 and 8, 9 and 10, 11 and 12 must be
> grouped together.
>
> Next, I need to draw a boxplot of the mean(abs.error) of each model
> for each k.idx.
> I need to compare the abs.error of the two models for each step and
> the overall mean abs.error of each model. And so on.
>
> I read the manuals, but the examples there are too simple. I know how
> to do this manipulation in a "brute force" manner, but I wish to learn
> how to work the right way with R.
>
> Could someone help me?
>
> José Augusto
> University of São Paulo
>