[R] How can I avoid the for and If loops in my function?
Martin Maechler
maechler at stat.math.ethz.ch
Thu Jun 19 16:50:15 CEST 2014
>>>>> lmramba <lmramba at ufl.edu>
>>>>> on Wed, 18 Jun 2014 20:00:15 +0000 writes:
> Hi Jim. If I avoid the dataframe, how can use the function model.matrix() to
> build the incident matrices X, And Z? I tried saving the design as matrix
> but ghen I got the wrong design matrix.
I think you are entirely right here, Laz.
That indeed you have data frame and a formula --> model.matrix()
to get the matrix.
I have no time currently to delve into your
example, and I see
- it is not reproducible {you use a function
varG() that is undefined}
- you use foreach just so you can use %do% in one place which I
think makes no sense
- you use package 'gmp' which I don't think you'd use, but I
don't know as your code is not reproducible ....
- you use "<<-" in quite a few places in your code, which is
considered really bad programming style and makes it very hard
to understand the code by reading it ...
... *but* .. after all that ...
...
as maintainer of the Matrix package
I'm close to absolutely sure that you want to work with *sparse*
matrices as Matrix provides.
So in fact, do use
## "require": just so you are not tempted to call a package a "library"
require(Matrix)
help(sparse.model.matrix)
and then do
- use sparse.model.matrix() instead of model.matrix().
Further, do
- use Diagonal() instead of diag() for *constructing* diagonal matrices.
Please let us know if this helps
[and maybe fix your example to become reproducible: do
rm(list=ls(all=TRUE))
before source(...) ing the reproducible example script...
]
Martin Maechler, ETH Zurich
> Thanks.
> Laz
> Sent from my LG Optimus G™, an AT&T 4G LTE smartphone
> ------ Original message ------
> From: jim holtman
> Date: 6/18/2014 3:49 PM
> To: Laz;
> Cc: R mailing list;
> Subject:Re: [R] How can I avoid the for and If loops in my function?
> First order of business, without looking in detail at the code, is to avoid
> the use of dataframes. If all your values are numerics, then use a matrix.
> It will be faster execution.
> I did see the following statements:
> newmatdf<-Des[[i]]
> Des[[i]]<-newmatdf
> why are you just putting back what you pulled out of the list?
> Jim Holtman
> Data Munger Guru
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
> On Wed, Jun 18, 2014 at 12:41 PM, Laz <[1]lmramba at ufl.edu> wrote:
> Dear R-users,
> I have a 3200 by 3200 matrix that was build from a data frame that had
> 180 observations, with variables: x, y, blocks (6 blocks) and
> treatments (values range from 1 to 180) I am working on. I build other
> functions that seem to work well. However, I have one function that has
> many If loops and a long For loop that delays my results for over 10
> hours ! I need your help to avoid these loops.
> ########################################################
> ## I need to avoid these for loops and if loops here :
> ########################################################
> ### swapsimple() is a function that takes in a dataframe, randomly swaps
> two elements from the same block in a data frame and generates a new
> dataframe called newmatdf
> ### swapmainF() is a function that calculates the trace of the final N
> by N matrix considering the incident matrices and blocks and treatments
> and residual errors in a linear mixed model framework using Henderson
> approach.
> funF<- function(newmatdf, n, traceI)
> {
> # n = number of iterations (swaps to be made on pairs of elements of the
> dataframe, called newmatdf)
> # newmatdf : is the original dataframe with N rows, and 4 variables
> (x,y,blocks,genotypes)
> matrix0<-newmatdf
> trace<-traceI ## sum of the diagonal elements of the N by N matrix
> (generated outside this loop) from the original newmatdf dataframe
> res <- list(mat = NULL, Design_best = newmatdf, Original_design =
> matrix0) # store our output of interest
> res$mat <- rbind(res$mat, c(value = trace, iterations = 0)) #
> initialized values
> Des<-list()
> for(i in seq_len(n)){
> ifelse(i==1,
> newmatdf<-swapsimple(matrix0),newmatdf<-swapsimple(newmatdf))
> Des[[i]]<-newmatdf
> if(swapmainF(newmatdf) < trace){
> newmatdf<-Des[[i]]
> Des[[i]]<-newmatdf
> trace<- swapmainF(newmatdf)
> res$mat <- rbind(res$mat, c(trace = trace, iterations = i))
> res$Design_best <- newmatdf
> }
> if(swapmainF(newmatdf) > trace & nrow(res$mat)<=1){
> newmatdf<-matrix0
> Des[[i]]<-matrix0
> res$Design_best<-matrix0
> }
> if(swapmainF(newmatdf)> trace & nrow(res$mat)>1){
> newmatdf<-Des[[length(Des)-1]]
> Des[[i]]<-newmatdf
> res$Design_best<-newmatdf
> }
> }
> res
> }
> The above function was created to:
> Take an original matrix, called matrix0, calculate its trace.
> Generate a new matrix, called newmatdf after swapping two elements of the
> old one and calculate the trace. If the trace of the newmatrix is
> smaller than
> that of the previous matrix, store both the current trace together
> with the older trace and their iteration values. If the newer matrix has
> a trace larger than the previous trace, drop this trace and drop this
> matrix too (but count its iteration).
> Re-swap the old matrix that you stored previously and recalculate the
> trace. Repeat the
> process many times, say 10,000. The final results should be a list
> with the original initial matrix and its trace, the final best
> matrix that had the smallest trace after the 10000 simulations and a
> dataframe showing the values of the accepted traces that
> were smaller than the previous and their respective iterations.
> $Original_design
> x y block genotypes
> 1 1 1 1 29
> 7 1 2 1 2
> 13 1 3 1 8
> 19 1 4 1 10
> 25 1 5 1 9
> 31 1 6 2 29
> 37 1 7 2 4
> 43 1 8 2 22
> 49 1 9 2 3
> 55 1 10 2 26
> 61 1 11 3 18
> 67 1 12 3 19
> 73 1 13 3 28
> 79 1 14 3 10
> ------truncated ----
> the final results after running funF<-
> function(newmatdf,n,traceI) given below looks like this:
> ans1
> $mat
> value iterations
> [1,] 1.474952 0
> [2,] 1.474748 1
> [3,] 1.474590 2
> [4,] 1.474473 3
> [5,] 1.474411 5
> [6,] 1.474294 10
> [7,] 1.474182 16
> [8,] 1.474058 17
> [9,] 1.473998 19
> [10,] 1.473993 22
> ---truncated
> $Design_best
> x y block genotypes
> 1 1 1 1 29
> 7 1 2 1 2
> 13 1 3 1 18
> 19 1 4 1 10
> 25 1 5 1 9
> 31 1 6 2 29
> 37 1 7 2 21
> 43 1 8 2 6
> 49 1 9 2 3
> 55 1 10 2 26
> ---- truncated
> $Original_design
> x y block genotypes
> 1 1 1 1 29
> 7 1 2 1 2
> 13 1 3 1 8
> 19 1 4 1 10
> 25 1 5 1 9
> 31 1 6 2 29
> 37 1 7 2 4
> 43 1 8 2 22
> 49 1 9 2 3
> 55 1 10 2 26
> 61 1 11 3 18
> 67 1 12 3 19
> 73 1 13 3 28
> 79 1 14 3 10
> ------truncated
> Regards,
> Laz
> [[alternative HTML version deleted]]
> ______________________________________________
> [2]R-help at r-project.org mailing list
> [3]https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting
> guide [4]http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> References
> 1. mailto:lmramba at ufl.edu
> 2. mailto:R-help at r-project.org
> 3. https://stat.ethz.ch/mailman/listinfo/r-help
> 4. http://www.R-project.org/posting-guide.html
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list