[R] means in tables

arun smartpink111 at yahoo.com
Thu Apr 11 05:45:14 CEST 2013


Hi Silvano,

Just to add:

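If you need to use the first option (adding the tables element-wise and dividing by their number), it is faster than apply() on this simulated data: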
set.seed(25)
 lst1<-lapply(1:100,function(i) as.data.frame(matrix(sample(1:100,8000*3,replace=TRUE),ncol=3)))
system.time(res1<-eval(parse(text=paste(paste("lst1","[[",seq_along(lst1),"]]",sep=""),collapse="+")))/length(lst1))

# user  system elapsed 
 # 0.060   0.000   0.063 #faster


library(abind)
system.time(res<-apply(abind(lst1,along=3),c(1,2),mean))
#   user  system elapsed 
 # 0.356   0.024   0.380 

  res<- as.data.frame(res)
identical(res,res1)
#[1] TRUE
A.K.
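
The same element-wise mean can probably also be written without eval(parse()), for example with Reduce(). This is only a sketch on small simulated data; the names lst_demo, res_reduce and cell_11 and the table sizes are made up for illustration, and the timings above were not rerun for it.

```r
# Small simulated list of data frames with identical dimensions
# (sizes are illustrative, not the 8000 x 3 tables from the thread)
set.seed(25)
lst_demo <- lapply(1:10, function(i)
  as.data.frame(matrix(sample(1:100, 20 * 3, replace = TRUE), ncol = 3)))

# Add the data frames element by element, then divide by how many there are
res_reduce <- Reduce(`+`, lst_demo) / length(lst_demo)

# Spot check one cell against a direct calculation
cell_11 <- mean(sapply(lst_demo, function(x) x[1, 1]))
all.equal(res_reduce[1, 1], cell_11)
```

Reduce() avoids building a long expression as text, so it should behave the same for any number of tables.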

----- Original Message -----
From: Silvano Cesar da Costa <silvano at uel.br>
To: arun <smartpink111 at yahoo.com>
Cc: 
Sent: Wednesday, April 10, 2013 10:34 PM
Subject: Re: [R] means in tables

Arun,

this code works very well:

setwd('c:/Dados/')
list.files(pattern=".txt")

lst2 = lapply(list.files(pattern=".txt"), function(x) read.table(x,
sep="", header=F))

library(abind)
(medias = apply(abind(lst2, along=3), c(1,2), mean))
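
A self-contained variant of the steps above, as a rough sketch only. The directory, the file names and the object names (demo_dir, lst_files, medias_demo) are hypothetical, and Reduce() is swapped in for abind() so that no extra package is needed. Note that in pattern=".txt" the dot is a regular-expression wildcard, so "\\.txt$" is used here to match only names ending in .txt.

```r
# Hypothetical demo directory with three small whitespace-delimited files
demo_dir <- file.path(tempdir(), "means_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.table(matrix(i * 1:6, ncol = 2),
              file = file.path(demo_dir, paste0("file", i, ".txt")),
              row.names = FALSE, col.names = FALSE)
}

# "\\.txt$" matches only names ending in .txt; full.names avoids setwd()
files <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)
lst_files <- lapply(files, read.table, sep = "", header = FALSE)

# Element-wise mean over the three tables
medias_demo <- Reduce(`+`, lst_files) / length(lst_files)
medias_demo
```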

I want to thank you for your precious help. Without it I couldn't do it.

Thanks a lot,



> Hi Silvano,
>
> Can you send the output of the following?
>  dput(lapply(lst1,function(x) head(x))[1:3])
>
>
> It seems strange that you were able to read but can't perform the task.
> Arun
>
>
>
> ----- Original Message -----
> From: Silvano Cesar da Costa <silvano at uel.br>
> To: arun <smartpink111 at yahoo.com>
> Cc:
> Sent: Wednesday, April 10, 2013 9:38 PM
> Subject: Re: [R] means in tables
>
> I performed all the procedures you described.
>
> I read each file with the command:
> lst2[[1]]; lst2[[2]]; lst2[[3]];
>
> I didn't have any problems with this.
>
> I have 100 files, each with 8000 rows and 3 columns.
>
> I ran the test with only 3 tables.
>
>
>
>
>
>
>> In that case, I would first check with:
>> list.files()
>>
>> I used list.files(pattern=".txt") because some files in the directory were
>> not .txt.
>>
>> First, try with 3 files in a new directory and use:
>> list.files()
>> Then,
>> lapply(list.files(), function(x) read.table(x,sep="",header=TRUE))
>> In the above line, your delimiter may be different.
>> It could be sep="\t", or sep="," etc.
>>
>> Also, try to read the files one by one from the directory and see whether
>> each of the files has the same delimiter etc.
>>
>> Please post what you got from the above steps.
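
One possible way to check the delimiter file by file is count.fields() from the utils package. The sketch below uses two made-up temporary files (f_space and f_comma are hypothetical names): a file that shows only one field per line when split on whitespace is probably using a different separator.

```r
# Two hypothetical files: one space-delimited, one comma-delimited
f_space <- tempfile(fileext = ".txt")
f_comma <- tempfile(fileext = ".txt")
writeLines(c("1 2 3", "4 5 6"), f_space)
writeLines(c("1,2,3", "4,5,6"), f_comma)

# Number of fields per line when splitting on whitespace (sep = "")
fields <- lapply(c(f_space, f_comma), count.fields, sep = "")

# One distinct value per file means every line has the same field count
sapply(fields, unique)
```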
>> BTW, how big are those files?
>>
>>
>>
>>
>>
>>
>> ----- Original Message -----
>> From: Silvano Cesar da Costa <silvano at uel.br>
>> To: arun <smartpink111 at yahoo.com>
>> Cc:
>> Sent: Wednesday, April 10, 2013 9:17 PM
>> Subject: Re: [R] means in tables
>>
>> Sorry Arun,
>>
>> that is correct. The message was:
>>
>> Error in abind(lst2, along = 3) : object 'lst2'  not available
>>
>>
>>
>>
>>
>>> Hi,
>>> I just wonder whether you were able to read the 3 tables correctly.  I
>>> am not able to translate your error correctly.  Does it mean that object
>>> 'lst2' is not available?
>>>
>>>
>>> Could you show str(lst2)
>>> Arun
>>>
>>>
>>>
>>> ----- Original Message -----
>>> From: Silvano Cesar da Costa <silvano at uel.br>
>>> To: arun <smartpink111 at yahoo.com>
>>> Cc:
>>> Sent: Wednesday, April 10, 2013 7:18 PM
>>> Subject: Re: [R] means in tables
>>>
>>> Thanks Arun.
>>>
>>> It was very cool. I did not know these commands.
>>>
>>> I applied it to 3 tables, but it doesn't work:
>>>
>>> setwd('/home/silvano/Dados/')
>>> list.files(pattern=".txt")
>>>
>>> lst2 <- lapply(list.files(pattern=".txt"), function(x) read.table(x,
>>> sep="", header=TRUE))
>>>
>>> library(abind)
>>> apply(abind(lst2, along=3), c(1, 2), mean)
>>>
>>>> apply(abind(lst2, along=3), c(1, 2), mean)
>>> Erro em abind(lst2, along = 3) : objeto 'lst2' não encontrado
>>>
>>> I don't know why.
>>>
>>>
>>>
>>>
>>>>
>>>> Hi,
>>>> For loading a number of datasets, you can use list.files()
>>>>
>>>> Example:
>>>> list.files(pattern=".txt")
>>>> #[1] "file1.txt" "file2.txt" "file3.txt"
>>>>  lst2<-lapply(list.files(pattern=".txt"),function(x)
>>>> read.table(x,sep="",header=TRUE))
>>>> lst2[[1]]
>>>> #  col1 col2
>>>> #1    1  0.5
>>>> #2    2  0.2
>>>> #3    3  0.3
>>>> #4    4  0.3
>>>> #5    5  0.1
>>>> #6    6  0.2
>>>> library(abind)
>>>> apply(abind(lst2,along=3),c(1,2),mean)
>>>> #     col1      col2
>>>> #[1,]    3 0.5000000
>>>> #[2,]    4 0.4666667
>>>> #[3,]    5 0.5666667
>>>> #[4,]    6 0.2666667
>>>> #[5,]    7 0.4000000
>>>> #[6,]    8 0.2666667
>>>> A.K.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> ----- Original Message -----
>>>> From: arun <smartpink111 at yahoo.com>
>>>> To: Silvano Cesar da Costa <silvano at uel.br>
>>>> Cc: R help <r-help at r-project.org>
>>>> Sent: Wednesday, April 10, 2013 6:28 PM
>>>> Subject: Re: [R] means in tables
>>>>
>>>> Hi,
>>>>
>>>>  You can load all the datasets directly from the directory into a list.
>>>>
>>>>
>>>>
>>>> set.seed(25)
>>>>  lst1<-lapply(1:100,function(i)
>>>> as.data.frame(matrix(sample(1:40,25,replace=TRUE),ncol=5)))
>>>>  length(lst1)
>>>> #[1] 100
>>>> library(abind)
>>>>
>>>> apply(abind(lst1,along=3),c(1,2),mean)
>>>> #        V1    V2    V3    V4    V5
>>>> #[1,] 20.37 21.95 19.51 22.77 22.00
>>>> #[2,] 20.43 17.94 18.81 20.02 23.86
>>>> #[3,] 23.00 18.64 21.15 21.61 22.12
>>>> #[4,] 20.10 20.89 22.35 19.62 20.72
>>>> #[5,] 19.36 20.97 19.36 21.02 20.48
>>>>
>>>>   mean(unlist(lapply(lst1,function(x) x[1,1])))
>>>> #[1] 20.37
>>>> mean(unlist(lapply(lst1,function(x) x[4,5])))
>>>> #[1] 20.72
>>>>  mean(unlist(lapply(lst1,function(x) x[5,2])))
>>>> #[1] 20.97
>>>>
>>>>
>>>> A.K.
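
As a possible base-R alternative to apply() over the stacked tables, rowMeans() with dims = 2 averages over the third dimension of an array. This is a sketch under the assumption that all tables are numeric with identical dimensions; simplify2array() is swapped in for abind() here, and the names lst_small, arr, res_rm and res_apply are illustrative.

```r
set.seed(25)
lst_small <- lapply(1:100, function(i)
  as.data.frame(matrix(sample(1:40, 25, replace = TRUE), ncol = 5)))

# Stack the tables into a 5 x 5 x 100 array using base R only
arr <- simplify2array(lapply(lst_small, as.matrix))

# rowMeans() with dims = 2 averages over the third dimension
res_rm <- rowMeans(arr, dims = 2)

# Should agree with the apply() formulation
res_apply <- apply(arr, c(1, 2), mean)
all.equal(res_rm, res_apply, check.attributes = FALSE)
```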
>>>>
>>>> ----- Original Message -----
>>>> From: Silvano Cesar da Costa <silvano at uel.br>
>>>> To: arun <smartpink111 at yahoo.com>
>>>> Cc:
>>>> Sent: Wednesday, April 10, 2013 6:02 PM
>>>> Subject: Re: [R] means in tables
>>>>
>>>> Hi Arun,
>>>>
>>>> I thought that with an example with two tables I could generalize to the
>>>> 100 tables that I have. It did not work.
>>>>
>>>> I actually have 100 tables in the format mentioned. I need to calculate
>>>> the average of the elements that are in the same position in the 100
>>>> tables.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> Hi,
>>>>> This could be done in different ways:
>>>>> tab1<-read.table(text="
>>>>> V1  V2  V3  V4  V5
>>>>> 14.23 1.71 2.43 15.6 127
>>>>> 13.20 1.78 2.14 11.2 100
>>>>> 13.16 2.36 2.67 18.6 101
>>>>> 14.37 1.95 2.50 16.8 113
>>>>> 13.24 2.59 2.87 21.0 118
>>>>> ",sep="",header=TRUE)
>>>>> tab2<-read.table(text="
>>>>> V1  V2  V3  V4  V5
>>>>> 1.23 1.1 2.3 1.6 17
>>>>> 1.20 1.8 2.4 1.2 10
>>>>> 1.16 2.6 2.7 1.6 11
>>>>> 1.37 1.5 2.0 1.8 13
>>>>> 1.24 2.9 2.7 2.0 18
>>>>> ",sep="",header=TRUE)
>>>>>
>>>>>
>>>>> (tab1+tab2)/2
>>>>> #    V1    V2    V3   V4 V5
>>>>> #1 7.73 1.405 2.365  8.6 72
>>>>> #2 7.20 1.790 2.270  6.2 55
>>>>> #3 7.16 2.480 2.685 10.1 56
>>>>> #4 7.87 1.725 2.250  9.3 63
>>>>> #5 7.24 2.745 2.785 11.5 68
>>>>>
>>>>>
>>>>> #or
>>>>> library(abind)
>>>>>  apply(abind(list(tab1,tab2),along=3),c(1,2),mean)
>>>>> #       V1    V2    V3   V4 V5
>>>>> #[1,] 7.73 1.405 2.365  8.6 72
>>>>> #[2,] 7.20 1.790 2.270  6.2 55
>>>>> #[3,] 7.16 2.480 2.685 10.1 56
>>>>> #[4,] 7.87 1.725 2.250  9.3 63
>>>>> #[5,] 7.24 2.745 2.785 11.5 68
>>>>>
>>>>>
>>>>> #or
>>>>>
>>>>> library(plyr)
>>>>> library(reshape2)  # dcast() comes from reshape2, not plyr
>>>>> dcast(adply(abind(list(tab1,tab2),along=3),c(1,2),mean),X1~X2,value.var="V1")[,-1]
>>>>> #    V1    V2    V3   V4 V5
>>>>> #1 7.73 1.405 2.365  8.6 72
>>>>> #2 7.20 1.790 2.270  6.2 55
>>>>> #3 7.16 2.480 2.685 10.1 56
>>>>> #4 7.87 1.725 2.250  9.3 63
>>>>> #5 7.24 2.745 2.785 11.5 68
>>>>>
>>>>> #or
>>>>> aaply(abind(list(tab1,tab2),along=3),c(1,2),mean)
>>>>> #   X2
>>>>> #X1    V1    V2    V3   V4 V5
>>>>>  # 1 7.73 1.405 2.365  8.6 72
>>>>>  # 2 7.20 1.790 2.270  6.2 55
>>>>>  # 3 7.16 2.480 2.685 10.1 56
>>>>>  # 4 7.87 1.725 2.250  9.3 63
>>>>>  # 5 7.24 2.745 2.785 11.5 68
>>>>>
>>>>>
>>>>> A.K.
>>>>>
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>> From: Silvano Cesar da Costa <silvano at uel.br>
>>>>> To: r-help at r-project.org
>>>>> Cc:
>>>>> Sent: Wednesday, April 10, 2013 12:07 PM
>>>>> Subject: [R] means in tables
>>>>>
>>>>> Hi.
>>>>>
>>>>> I have 2 tables with the same dimensions (8000 x 5). Something like:
>>>>>
>>>>> tab1:
>>>>>
>>>>> V1   V2   V3   V4  V5
>>>>> 14.23 1.71 2.43 15.6 127
>>>>> 13.20 1.78 2.14 11.2 100
>>>>> 13.16 2.36 2.67 18.6 101
>>>>> 14.37 1.95 2.50 16.8 113
>>>>> 13.24 2.59 2.87 21.0 118
>>>>>
>>>>> tab2:
>>>>>
>>>>> V1   V2   V3   V4  V5
>>>>> 1.23 1.1 2.3 1.6 17
>>>>> 1.20 1.8 2.4 1.2 10
>>>>> 1.16 2.6 2.7 1.6 11
>>>>> 1.37 1.5 2.0 1.8 13
>>>>> 1.24 2.9 2.7 2.0 18
>>>>>
>>>>> I need to generate a table of averages of the elements in the same
>>>>> position in both tables, like:
>>>>>
>>>>> tab3:
>>>>> (14.23 + 1.23)/2  (1.71+1.1)/2   (127+17)/2
>>>>>
>>>>> and so on
>>>>>
>>>>> I tried the program:
>>>>>
>>>>> Médias = matrix(NA, nrow(tab1), ncol(tab1))
>>>>> for(i in 1:nrow(tab1)){
>>>>>   for(j in 1:ncol(tab1)){
>>>>>     for(k in 1:nrow(tab2)){
>>>>>       for(l in 1:ncol(tab2)){
>>>>>         Médias = tab1$i[j]
>>>>>       }}}}
>>>>>
>>>>> Médias
>>>>>
>>>>> but it doesn't work. I don't know programming.
>>>>>
>>>>> How can I do this?
>>>>>
>>>>> Thanks,
>>>>>
>>>>>
>>>>> ---------------------------------------------
>>>>> Silvano Cesar da Costa
>>>>>
>>>>> Universidade Estadual de Londrina
>>>>> Centro de Ciências Exatas
>>>>> Departamento de Estatística
>>>>>
>>>>> Fone: (43) 3371-4346
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.