[R] Replacing for loop with tapply!?

Sander Oom slist at oomvanlieshout.net
Fri Jun 10 12:10:17 CEST 2005


Thanks Dimitris,

Very impressive! Much faster than before.

Thanks to the newly discovered R.basic package, I can simply rotate the
result with rotate270 {R.basic}:

 > mat <- matrix(sample(-15:50, 365 * 15000, TRUE), 365, 15000)
 > temps <- c(37, 39, 41)
 > #################
 > #ind <- matrix(0, length(temps), ncol(mat))
 > ind <- matrix(0, 4, ncol(mat))
 > (startDate <- date())
[1] "Fri Jun 10 12:08:01 2005"
 > for(i in seq(along = temps)) ind[i, ] <- colSums(mat > temps[i])
 > ind[4, ] <- colMeans(max(mat))
Error in colMeans(max(mat)) : 'x' must be an array of at least two dimensions
 > (endDate <- date())
[1] "Fri Jun 10 12:08:02 2005"
 > ind <- rotate270(ind)
 > ind[1:10,]
    V4 V3 V2 V1
1   0 56 75 80
2   0 46 53 60
3   0 50 58 67
4   0 60 72 80
5   0 59 68 76
6   0 55 67 74
7   0 62 77 93
8   0 45 57 67
9   0 57 68 75
10  0 61 66 76
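
As an aside on the timing and rotation in the transcript above, here is a minimal sketch using base R only. It times the loop with system.time() instead of two date() calls, and uses t() to put stations in rows. Note that t() keeps the threshold columns in their original order (37, 39, 41), unlike the V4..V1 order printed above. The smaller matrix and the column names are assumptions made only to keep the example quick and readable.

```r
set.seed(1)                                    # reproducible example data
mat   <- matrix(sample(-15:50, 365 * 1000, TRUE), 365, 1000)  # days x stations
temps <- c(37, 39, 41)

ind <- matrix(0, length(temps), ncol(mat))
elapsed <- system.time(
  for (i in seq_along(temps)) ind[i, ] <- colSums(mat > temps[i])
)
print(elapsed["elapsed"])                      # wall-clock seconds for the loop

res <- t(ind)                                  # stations in rows, thresholds in columns
colnames(res) <- paste0("gt", temps)           # hypothetical column labels
head(res)
```

system.time() reports elapsed seconds with sub-second resolution, which date() cannot show for a loop this fast.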

However, I have not managed to get the row maximum using your method. It
should be 50 for most rows, but my first-guess code gives an error!

Any suggestions?
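
One possible way around the error, offered as a sketch and not as Dimitris's method: max(mat) collapses the whole matrix to a single number, so colMeans() has nothing two-dimensional to work on. Applying max over the columns with apply(mat, 2, max) returns one maximum per station instead. The smaller matrix and the na.rm = TRUE argument (carried over from the original S code) are assumptions for illustration.

```r
set.seed(1)                                    # reproducible example data
mat   <- matrix(sample(-15:50, 365 * 1000, TRUE), 365, 1000)  # days x stations
temps <- c(37, 39, 41)

# One row per threshold, plus a final row for the per-station maximum
ind <- matrix(0, length(temps) + 1, ncol(mat))
for (i in seq_along(temps)) ind[i, ] <- colSums(mat > temps[i])

# apply() with MARGIN = 2 calls max() once per column (station)
ind[length(temps) + 1, ] <- apply(mat, 2, max, na.rm = TRUE)

table(ind[4, ])                                # distribution of the maxima
```

apply() loops over the columns internally, so it is slower than colSums(), but it runs only once here and returns a vector of length ncol(mat) that fits the last row of ind directly.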

Sander



Dimitris Rizopoulos wrote:
> maybe you are looking for something along these lines:
> 
> mat <- matrix(sample(-15:50, 365 * 15000, TRUE), 365, 15000)
> temps <- c(37, 39, 41)
> #################
> ind <- matrix(0, length(temps), ncol(mat))
> for(i in seq(along = temps)) ind[i, ] <- colSums(mat > temps[i])
> ind
> 
> 
> I hope it helps.
> 
> Best,
> Dimitris
> 
> ----
> Dimitris Rizopoulos
> Ph.D. Student
> Biostatistical Centre
> School of Public Health
> Catholic University of Leuven
> 
> Address: Kapucijnenvoer 35, Leuven, Belgium
> Tel: +32/16/336899
> Fax: +32/16/337015
> Web: http://www.med.kuleuven.ac.be/biostat/
>      http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm
> 
> 
> ----- Original Message ----- 
> From: "Sander Oom" <slist at oomvanlieshout.net>
> To: <r-help at stat.math.ethz.ch>
> Sent: Friday, June 10, 2005 10:50 AM
> Subject: [R] Replacing for loop with tapply!?
> 
> 
>>Dear all,
>>
>>We have a large data set with temperature data for weather stations
>>across the globe (15000 stations).
>>
>>For each station, we need to calculate the number of days a certain
>>temperature is exceeded.
>>
>>So far we used the following S code, where mat88 is a matrix containing
>>rows of 365 daily temperatures for each of 15000 weather stations:
>>
>>m <- 37
>>n <- 2
>>outmat88 <- matrix(0, ncol = 4, nrow = nrow(mat88))
>>for(i in 1:nrow(mat88)) {
>># i <- 3
>>row1 <- as.data.frame(df88[i,  ])
>>temprow37 <- select.rows(row1, row1 > m)
>>temprow39 <- select.rows(row1, row1 > m + n)
>>temprow41 <- select.rows(row1, row1 > m + 2 * n)
>>outmat88[i, 1] <- max(row1, na.rm = T)
>>outmat88[i, 2] <- count.rows(temprow37)
>>outmat88[i, 3] <- count.rows(temprow39)
>>outmat88[i, 4] <- count.rows(temprow41)
>>}
>>outmat88
>>
>>We have transferred the data to a more potent Linux box running R, but
>>still hope to speed up the code.
>>
>>I know a for loop should be avoided when looking for speed. I also know
>>the answer is in something like tapply, but my understanding of these
>>commands is still too limited to see the solution. Could someone show me
>>the way!?
>>
>>Thanks in advance,
>>
>>Sander.
>>-- 
>>--------------------------------------------
>>Dr Sander P. Oom
>>Animal, Plant and Environmental Sciences,
>>University of the Witwatersrand
>>Private Bag 3, Wits 2050, South Africa
>>Tel (work)      +27 (0)11 717 64 04
>>Tel (home)      +27 (0)18 297 44 51
>>Fax             +27 (0)18 299 24 64
>>Email   sander at oomvanlieshout.net
>>Web     www.oomvanlieshout.net/sander
>>
>>______________________________________________
>>R-help at stat.math.ethz.ch mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide! 
>>http://www.R-project.org/posting-guide.html
>>
> 
> 

