[R] how to combine data of several csv-files
8rino-Luca Pantani
ottorino-luca.pantani at unifi.it
Mon Jul 30 17:24:17 CEST 2007
OK, I missed the grouping factor.
Try this.
You can modify "my.factor" to fit your needs.
As for avoiding the list, I cannot help, sorry:
I use lists only when I have to collect objects of different classes.
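# Test data: v1 and v8 are placeholders (NA); v2 to v7 hold six values each.
v1 <- NA
v2 <- rnorm(6)
v3 <- rnorm(6)
v4 <- rnorm(6)
v5 <- rnorm(6)
v6 <- rnorm(6)
v7 <- rnorm(6)
v8 <- NA
df.my <- cbind.data.frame(v1, v2, v3, v4, v5, v6, v7, v8)

# Stack the eight columns into long format: the values end up in "v1",
# the source column index (1 to 8) in "cat", the row index in "sequential".
(df.my2 <- reshape(df.my,
                   varying   = list(c("v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8")),
                   idvar     = "sequential",
                   timevar   = "cat",
                   direction = "long"))

# Grouping factor: NA values are not considered, columns 2 to 4 form
# "cat1" and the remaining ones "cat2".
my.factor <- factor(
  ifelse(is.na(df.my2$v1), "not.considered",
         ifelse(df.my2$cat %in% 2:4, "cat1", "cat2")))
df.my3 <- cbind(df.my2, Correct.Cat = my.factor)

# Mean and standard deviation per category.
aggregate(df.my3$v1, by = list(category = df.my3$Correct.Cat), mean)
aggregate(df.my3$v1, by = list(category = df.my3$Correct.Cat),
          function(x) sd(x, na.rm = TRUE))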
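
Note that the two aggregate() calls above return one value per category, pooled over all six rows. If what is wanted is instead one mean (and sd) per element position and category, as the quoted message below seems to ask, the following is a minimal sketch of one possible way, not a definitive solution. It starts from a list like the one in the quoted test case. The names vals, keep, m, grp, row.mean and row.sd are made up for this example, and set.seed() is there only for reproducibility. Keep in mind that mean() averages only its first argument, so mean(a, b, c) does not give the mean of three numbers; the values must first be combined, e.g. with c() or as columns of a matrix.

```r
# Hypothetical test data mirroring the quoted example: a list of vectors
# plus one category label per list element (NA = not considered).
set.seed(1)  # assumption: fixed seed, only to make the sketch reproducible
vals  <- c(list(NA), replicate(6, rnorm(6), simplify = FALSE), list(NA))
categ <- c(NA, "cat1", "cat1", "cat1", "cat2", "cat2", "cat2", NA)

# Keep only the elements that have a category and bind them as the
# columns of a 6 x 6 matrix (one column per retained vector).
keep <- !is.na(categ)
m    <- do.call(cbind, vals[keep])
grp  <- categ[keep]

# For each category, compute mean and sd across its columns, row by row.
# Each result has one row per element position and one column per category.
cols     <- split(seq_along(grp), grp)
row.mean <- sapply(cols, function(j) rowMeans(m[, j, drop = FALSE]))
row.sd   <- sapply(cols, function(j) apply(m[, j, drop = FALSE], 1, sd))

row.mean
row.sd
```

row.mean[, "cat1"] is then the vector of six means for "cat1", and row.mean[, "cat2"] the one for "cat2".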
Antje wrote:
> Hello,
>
> thank you for your help. But I guess, it's still not what I want...
> printing df.my gives me
>
> df.my
>   v1         v2          v3          v4         v5          v6           v7 v8
> 1 NA -0.6442149  0.02354036 -1.40362589 -1.1829260  1.17099178 -0.046778203 NA
> 2 NA -0.2047012 -1.36186952  0.13045724  2.1411553  0.49248118 -0.233788840 NA
> 3 NA -1.1986041 -0.42197792 -0.84651458 -0.1327081 -0.18690065  0.443908897 NA
> 4 NA -0.2097442  1.50445971  1.57005071 -0.1053442  1.50050976 -1.649740180 NA
> 5 NA -0.7343465 -1.76763996  0.06961015 -0.8179396 -0.65552410  0.003991354 NA
> 6 NA -1.3888750  0.53722404  0.25269771 -1.2342698 -0.01243247 -0.228020092 NA
>
> now, I have to combine like this:
>
> v1   v2   v3   v4   v5   v6   v7   v8
> NA   cat1 cat1 cat1 cat2 cat2 cat2 NA
>
> -->
>
> mean(c(df.my$v2[1], df.my$v3[1], df.my$v4[1]))
> mean(c(df.my$v2[2], df.my$v3[2], df.my$v4[2]))
> mean(c(df.my$v2[3], df.my$v3[3], df.my$v4[3]))
> mean(c(df.my$v2[4], df.my$v3[4], df.my$v4[4]))
> mean(c(df.my$v2[5], df.my$v3[5], df.my$v4[5]))
> mean(c(df.my$v2[6], df.my$v3[6], df.my$v4[6]))
>
> the same for v5, v6 and v7
>
> further, I'm not sure how to avoid the list, because this is the
> result of the processing I did before...
>
> Ciao,
> Antje
>
>
> 8rino-Luca Pantani wrote:
>> I hope I understand.
>>
>> Why not try the following, and avoid lists, which I'm still not able
>> to manage properly ;-)
>> v1 <- NA
>> v2 <- rnorm(6)
>> v3 <- rnorm(6)
>> v4 <- rnorm(6)
>> v5 <- rnorm(6)
>> v6 <- rnorm(6)
>> v7 <- rnorm(6)
>> v8 <- rnorm(6)
>> v8 <- NA
>> (df.my <- cbind.data.frame(v1, v2, v3, v4, v5, v6, v7, v8))
>> (df.my2 <- reshape(df.my,
>>                    varying   = list(c("v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8")),
>>                    idvar     = "sequential",
>>                    timevar   = "cat",
>>                    direction = "long"))
>> aggregate(df.my2$v1, by = list(category = df.my2$cat), mean)
>> aggregate(df.my2$v1, by = list(category = df.my2$cat),
>>           function(x) sd(x, na.rm = TRUE))
>>
>>
>> Antje wrote:
>>> okay, I played around a bit and now I have some kind of test case for
>>> you:
>>>
>>> v1 <- NA
>>> v2 <- rnorm(6)
>>> v3 <- rnorm(6)
>>> v4 <- rnorm(6)
>>> v5 <- rnorm(6)
>>> v6 <- rnorm(6)
>>> v7 <- rnorm(6)
>>> v8 <- rnorm(6)
>>> v8 <- NA
>>>
>>> list <- list(v1,v2,v3,v4,v5,v6,v7,v8)
>>> categ <- c(NA,"cat1","cat1","cat1","cat2","cat2","cat2",NA)
>>>
>>> > list
>>> [[1]]
>>> [1] NA
>>>
>>> [[2]]
>>> [1] -0.6442149 -0.2047012 -1.1986041 -0.2097442 -0.7343465 -1.3888750
>>>
>>> [[3]]
>>> [1]  0.02354036 -1.36186952 -0.42197792  1.50445971 -1.76763996  0.53722404
>>>
>>> [[4]]
>>> [1] -1.40362589  0.13045724 -0.84651458  1.57005071  0.06961015  0.25269771
>>>
>>> [[5]]
>>> [1] -1.1829260 2.1411553 -0.1327081 -0.1053442 -0.8179396 -1.2342698
>>>
>>> [[6]]
>>> [1]  1.17099178  0.49248118 -0.18690065  1.50050976 -0.65552410 -0.01243247
>>>
>>> [[7]]
>>> [1] -0.046778203 -0.233788840  0.443908897 -1.649740180  0.003991354 -0.228020092
>>>
>>> [[8]]
>>> [1] NA
>>>
>>> now, I need the mean (and sd) of element 1 of list[[2]], list[[3]]
>>> and list[[4]] (because they belong to "cat1"):
>>>
>>> mean(c(-0.6442149, 0.02354036, -1.40362589))
>>>
>>> the same for element 2 up to element 6 (--> I would then get a vector
>>> containing the means for "cat1")
>>> the same for the vectors belonging to "cat2".
>>>
>>> does anybody now understand what I mean?
>>>
>>> Antje
>>>
>>>
>>>
>>
>
>
--
Ottorino-Luca Pantani, Università di Firenze
Dip. Scienza del Suolo e Nutrizione della Pianta
P.zle Cascine 28 50144 Firenze Italia
Tel 39 055 3288 202 (348 lab) Fax 39 055 333 273
OLPantani at unifi.it