[R] avoiding too many loops - reshaping data
Dimitri Liakhovitski
dimitri.liakhovitski at gmail.com
Wed Nov 3 23:16:55 CET 2010
Here is a summary of the methods, each timed over 1000 runs on the
sample data frame (mydf) from my original message quoted below.
tapply is the fastest!

library(reshape)
system.time(for (i in 1:1000) cast(melt(mydf, measure.vars = "value"),
                                   city ~ brand, fun.aggregate = sum))
   user  system elapsed
  18.40    0.00   18.44

library(reshape2)
system.time(for (i in 1:1000) dcast(mydf, city ~ brand, sum))
   user  system elapsed
  12.36    0.02   12.37

system.time(for (i in 1:1000) xtabs(value ~ city + brand, mydf))
   user  system elapsed
   2.45    0.00    2.47

system.time(for (i in 1:1000) tapply(mydf$value, mydf[c('city', 'brand')], sum))
   user  system elapsed
   0.78    0.00    0.79
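
For anyone who needs the result as a data frame rather than a matrix or
table, here is a minimal base-R sketch. It assumes the sample data exactly
as posted in the original question; the object names sums, xt and wide are
only illustrative. Note that tapply leaves NA in cells for city/brand
combinations that do not occur, whereas xtabs fills them with 0, so the two
only agree when every combination is present, as it is here.

```r
# Sample data from the original question
mydf <- data.frame(
  city  = rep(c("a", "b"), each = 8),
  brand = c("x", "x", "y", "y", "z", "z", "z", "z",
            "x", "x", "x", "y", "y", "y", "z", "z"),
  value = c(1, 2, 11, 12, 111, 112, 113, 114,
            3, 4, 5, 13, 14, 15, 115, 116)
)

# tapply returns a matrix: cities as rows, brands as columns
sums <- tapply(mydf$value, mydf[c("city", "brand")], sum)

# xtabs gives the same cell sums on this data
xt <- xtabs(value ~ city + brand, mydf)
stopifnot(all(xt == sums))

# Turn the matrix into a data frame with an explicit city column
# ("wide" is just an illustrative name)
wide <- data.frame(city = rownames(sums), sums, stringsAsFactors = FALSE)
rownames(wide) <- NULL
print(wide)
# city a: x = 3,  y = 23, z = 450
# city b: x = 12, y = 42, z = 231
```
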
Dimitri
On Wed, Nov 3, 2010 at 4:32 PM, Henrique Dallazuanna <wwwhsd at gmail.com> wrote:
> Try this:
>
> xtabs(value ~ city + brand, mydf)
>
> On Wed, Nov 3, 2010 at 6:23 PM, Dimitri Liakhovitski
> <dimitri.liakhovitski at gmail.com> wrote:
>>
>> Hello!
>>
>> I have a data frame like this one:
>>
>>
>> mydf<-data.frame(city=c("a","a","a","a","a","a","a","a","b","b","b","b","b","b","b","b"),
>> brand=c("x","x","y","y","z","z","z","z","x","x","x","y","y","y","z","z"),
>> value=c(1,2,11,12,111,112,113,114,3,4,5,13,14,15,115,116))
>> (mydf)
>>
>> What I need to get is a data frame like the one below - cities as
>> rows, brands as columns, and the sums of the "value" within each
>> city/brand combination in the body of the data frame:
>>
>> city   x   y    z
>> a      3  23  450
>> b     12  42  231
>>
>>
>> I have written code that involves multiple loops and subindexing -
>> but it's taking too long.
>> I am sure there must be a more efficient way of doing it.
>>
>> Thanks a lot for your hints!
>>
>>
>> --
>> Dimitri Liakhovitski
>> Ninah Consulting
>> www.ninah.com
>>
>
>
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
>