[R] performance problem

Alexandre Fayolle Alexandre.Fayolle at logilab.fr
Thu Dec 14 16:13:43 CET 2000


Hello,

I needed a function like table(), but one that sums the values of a
column for each cell instead of counting occurrences. I could not find
anything like it in the built-in packages (maybe I missed it...), so I
decided to write my own, and I came up with the following:

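table.ponderate <- function(arow, acol, aweight) {
	# arow and acol are factors; aweight holds one weight per observation
	m <- matrix(data = 0, nrow = length(levels(arow)), ncol = length(levels(acol)),
		dimnames = list(levels(arow), levels(acol)))
	aweight[is.na(aweight)] <- 0	# count missing weights as zero
	for (a in seq_along(arow)) {
		i <- as.integer(arow[a])	# row index = level code of arow
		j <- as.integer(acol[a])	# column index = level code of acol
		m[i, j] <- m[i, j] + aweight[a]	# add this observation's weight to its cell
	}
	m
}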

The problem is that the performance is very poor. I have not had time to
benchmark it, but it takes several seconds to process 1000 lines, and I
need to process a few dozen data sets of 10000+ lines each.

Is there a way to write this differently that would speed things up?
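
One direction that might help, given only as a sketch: let tapply() do
the summing per cell instead of looping over the observations. The name
table.ponderate.fast and the small data set below are made up for
illustration, and the sketch assumes that arow and acol are factors and
that aweight is a numeric vector of the same length. No timings are
claimed here.

```r
# Hypothetical vectorised variant of table.ponderate (illustrative name).
# Assumes arow and acol are factors and aweight is numeric, all of equal length.
table.ponderate.fast <- function(arow, acol, aweight) {
  aweight[is.na(aweight)] <- 0                  # missing weights count as zero
  m <- tapply(aweight, list(arow, acol), sum)   # sum the weights within each cell
  m[is.na(m)] <- 0                              # cells with no observations come back as NA
  m
}

# Made-up example data
arow    <- factor(c("a", "b", "a", "b", "a"))
acol    <- factor(c("x", "x", "y", "y", "x"))
aweight <- c(1, 2, NA, 4, 5)

result <- table.ponderate.fast(arow, acol, aweight)
print(result)
```

The result is a matrix with levels(arow) as row names and levels(acol)
as column names, the same layout as the loop version. xtabs() with a
formula such as aweight ~ arow + acol may be another option.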

Alexandre Fayolle
-- 
http://www.logilab.com 
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).
