[R] A. Mani : Tapply
Petr Pikal
petr.pikal at precheza.cz
Fri Aug 26 10:23:43 CEST 2005
Hi
> PLEASE do read the posting guide!
On 25 Aug 2005 at 20:28, A Mani wrote:
> Hello,
> Is it safe to use tapply when the result will be of dim 20000
> x 10000 or more ? In my PC R crashes.
Did R really crash, or did it give you an error message?
I tried this
> df <- data.frame(A = as.factor(sample(1:10000, 100000, replace = TRUE)),
+                  B = as.factor(sample(100000:110000, 100000, replace = TRUE)),
+                  num = rnorm(100000))
> ttt <- tapply(df$num, list(df$A, df$B), diff)
and got
Error: cannot allocate vector of size 390664 Kb
In addition: Warning messages:
1: Reached total allocation of 1000Mb: see help(memory.size)
2: Reached total allocation of 1000Mb: see help(memory.size)
>
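The size in that error message is about what the bare result array alone would need. A rough check, assuming a 32-bit build of R (4 bytes per list cell; that is my assumption, it is not stated anywhere above) and assuming every possible value of A and B was actually drawn:

```r
# Back-of-envelope size of the tapply() result for the large example.
n_a <- length(1:10000)          # possible levels of A
n_b <- length(100000:110000)    # possible levels of B (10001 values)
cells <- n_a * n_b              # one cell per A x B combination

bytes_per_cell <- 4             # ASSUMPTION: 32-bit build, 4-byte pointers
kb <- cells * bytes_per_cell / 1024
round(kb)                       # 390664

# Only 100000 observations exist, so at most this share of cells holds data:
n_obs <- 100000
n_obs / cells                   # just under 0.001
```

So nearly all of that allocation would go to cells with no data in them.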
and the results with smaller data sets are
> df1 <- data.frame(A = as.factor(sample(1:2, 100000, replace = TRUE)),
+                   B = as.factor(sample(10:11, 100000, replace = TRUE)),
+                   num = rnorm(100000))
> ttt1 <- tapply(df1$num, list(df1$A, df1$B), diff)
> ttt1
  10            11
1 Numeric,24933 Numeric,25141
2 Numeric,24992 Numeric,24930
> df <- data.frame(A = as.factor(sample(1:1000, 100000, replace = TRUE)),
+                  B = as.factor(sample(10000:11000, 100000, replace = TRUE)),
+                  num = rnorm(100000))
> ttt <- tapply(df$num, list(df$A, df$B), diff)
> ttt[1:2, 1:5]
  10000 10001 10002 10003     10004
1 NULL  NULL  NULL  Numeric,0 NULL
2 NULL  NULL  NULL  Numeric,0 NULL
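So you are probably getting a humongous table made up mostly of NULLs
(A/B combinations with no data) and zero-length vectors (combinations
with a single value), with only a few non-empty entries.
You probably need to use a different approach.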
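One possible approach, as a minimal sketch rather than a tested recipe: group on a single key built only from the A/B pairs that actually occur, so the full A x B table is never created. The names key and res and the seed are made up for illustration; the data frame is simulated as in my large example above.

```r
set.seed(1)   # arbitrary seed, only so the sketch is reproducible
n <- 100000
df <- data.frame(A = as.factor(sample(1:10000, n, replace = TRUE)),
                 B = as.factor(sample(100000:110000, n, replace = TRUE)),
                 num = rnorm(n))

# One string per row, so at most n distinct keys instead of ~10^8 cells.
# paste() is used rather than list(df$A, df$B) so that the full set of
# level combinations is never generated.
key <- paste(df$A, df$B, sep = ".")

# diff() within each observed A/B pair; result is a plain named list
res <- lapply(split(df$num, key), diff)

length(res)   # number of A/B pairs that occur, never more than n
```

Pairs with a single observation still give numeric(0), but there are no NULL entries for combinations that never occur. If you need the A and B values back, they can be recovered by splitting names(res) at the ".".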
Cheers
Petr
> The code used was on a 3-col data frame with two factor cols and a
> numeric column. The fn was diff. data form being <A, B, Num>
> tapply(data$A, list(data$A, data$B), diff)
>
> --
> A. Mani
> Member, Cal. Math. Soc
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html