[R] How can I avoid the for and If loops in my function?
Mramba,Lazarus K
lmramba at ufl.edu
Thu Jun 19 17:04:43 CEST 2014
Hi Martin,
Thanks for the useful comments. It has greatly improved the efficiency
of my functions. I am sorry that I forgot to paste the functions varG
and others which I have done here for reproducibility.
#############################
## a function to calculate h2
## (heritability)
###############################
herit<-function(varG,varR=1)
{
h<-4*varG/(varG+varR)
h
}
###################################
# function to calculate random error
varR<-function(varG,h2)
{
varR<- varG*(4-h2)/h2
varR
}
##########################################
# function to calculate treatment variance
varG<-function(varR=1,h2)
{
varG<-varR*h2/(4-h2)
varG
}
###############################
I used the package "gmp" so that I can use the function "is.whole()".
I have managed to reduce the complexity of the function that I needed
help. It now takes half the time it used to take before, but it is
several hours and wish I can cut these hours down by doing away with the
for and if loops in this code.
Any help will be appreciated.
newfunF<- function(matrix0,n,traceI)
{
newmatdf<-matrix0
trace<-traceI
mat <- NULL
Design_best <- newmatdf
mat <- rbind(mat, c(value = trace, iterations = 0))
Des<-list()
newmatdf<-swapsimple(matrix0)
for(i in 2:n){
newmatdf<-swapsimple(newmatdf)
Des[[i]]<-newmatdf
if(swapmainF(newmatdf) < trace){
Design_best <- Des[[i]]<-newmatdf
trace<- swapmainF(newmatdf)
mat <- rbind(mat, c(trace = trace, iterations = i))
}
}
list(mat=mat,Design_best=Design_best)
}
On Thu, 19 Jun 2014 16:50:15 +0200, Martin Maechler wrote:
>>>>>> lmramba <lmramba at ufl.edu>
>>>>>> on Wed, 18 Jun 2014 20:00:15 +0000 writes:
>
> > Hi Jim. If I avoid the dataframe, how can use the function
> model.matrix() to
> > build the incident matrices X, And Z? I tried saving the design
> as matrix
> > but ghen I got the wrong design matrix.
>
> I think you are entirely right here, Laz.
> That indeed you have data frame and a formula --> model.matrix()
> to get the matrix.
>
> I have no time currently to delve into your
> example, and I see
> - it is not reproducible {you use a function
> varG() that is undefined}
> - you use foreach just so you can use %do% in one place which I
> think makes no sense
> - you use package 'gmp' which I don't think you'd use, but I
> don't know as your code is not reproducible ....
> - you use "<<-" in quite a few places in your code, which is
> considered really bad programming style and makes it very hard
> to understand the code by reading it ...
>
> ... *but* .. after all that ...
> ...
> as maintainer of the Matrix package
> I'm close to absolutely sure that you want to work with *sparse*
> matrices as Matrix provides.
>
> So in fact, do use
>
> ## "require": just so you are not tempted to call a package a
> "library"
> require(Matrix)
>
> help(sparse.model.matrix)
>
> and then do
> - use sparse.model.matrix() instead of model.matrix().
>
> Further, do
> - use Diagonal() instead of diag() for *constructing* diagonal
> matrices.
>
> Please let us know if this helps
>
> [and maybe fix your example to become reproducible: do
>
> rm(list=ls(all=TRUE))
>
> before source(...) ing the reproducible example script...
> ]
>
> Martin Maechler, ETH Zurich
>
>
> > Thanks.
>
> > Laz
>
>
> > Sent from my LG Optimus G™, an AT&T 4G LTE smartphone
> > ------ Original message ------
> > From: jim holtman
> > Date: 6/18/2014 3:49 PM
> > To: Laz;
> > Cc: R mailing list;
> > Subject:Re: [R] How can I avoid the for and If loops in my
> function?
>
> > First order of business, without looking in detail at the code,
> is to avoid
> > the use of dataframes. If all your values are numerics, then
> use a matrix.
> > It will be faster execution.
> > I did see the following statements:
> > newmatdf<-Des[[i]]
> > Des[[i]]<-newmatdf
> > why are you just putting back what you pulled out of the list?
>
> > Jim Holtman
> > Data Munger Guru
>
> > What is the problem that you are trying to solve?
> > Tell me what you want to do, not how you want to do it.
> > On Wed, Jun 18, 2014 at 12:41 PM, Laz <[1]lmramba at ufl.edu>
> wrote:
>
> > Dear R-users,
> > I have a 3200 by 3200 matrix that was build from a data frame
> that had
> > 180 observations, with variables: x, y, blocks (6 blocks) and
> > treatments (values range from 1 to 180) I am working on. I
> build other
> > functions that seem to work well. However, I have one function
> that has
> > many If loops and a long For loop that delays my results for
> over 10
> > hours ! I need your help to avoid these loops.
> > ########################################################
> > ## I need to avoid these for loops and if loops here :
> > ########################################################
> > ### swapsimple() is a function that takes in a dataframe,
> randomly swaps
> > two elements from the same block in a data frame and generates
> a new
> > dataframe called newmatdf
> > ### swapmainF() is a function that calculates the trace of the
> final N
> > by N matrix considering the incident matrices and blocks and
> treatments
> > and residual errors in a linear mixed model framework using
> Henderson
> > approach.
> > funF<- function(newmatdf, n, traceI)
> > {
> > # n = number of iterations (swaps to be made on pairs of
> elements of the
> > dataframe, called newmatdf)
> > # newmatdf : is the original dataframe with N rows, and 4
> variables
> > (x,y,blocks,genotypes)
> > matrix0<-newmatdf
> > trace<-traceI ## sum of the diagonal elements of the N by N
> matrix
> > (generated outside this loop) from the original newmatdf
> dataframe
> > res <- list(mat = NULL, Design_best = newmatdf, Original_design
> =
> > matrix0) # store our output of interest
> > res$mat <- rbind(res$mat, c(value = trace, iterations = 0)) #
> > initialized values
> > Des<-list()
> > for(i in seq_len(n)){
> > ifelse(i==1,
> > newmatdf<-swapsimple(matrix0),newmatdf<-swapsimple(newmatdf))
> > Des[[i]]<-newmatdf
> > if(swapmainF(newmatdf) < trace){
> > newmatdf<-Des[[i]]
> > Des[[i]]<-newmatdf
> > trace<- swapmainF(newmatdf)
> > res$mat <- rbind(res$mat, c(trace = trace, iterations = i))
> > res$Design_best <- newmatdf
> > }
> > if(swapmainF(newmatdf) > trace & nrow(res$mat)<=1){
> > newmatdf<-matrix0
> > Des[[i]]<-matrix0
> > res$Design_best<-matrix0
> > }
> > if(swapmainF(newmatdf)> trace & nrow(res$mat)>1){
> > newmatdf<-Des[[length(Des)-1]]
> > Des[[i]]<-newmatdf
> > res$Design_best<-newmatdf
> > }
> > }
> > res
> > }
> > The above function was created to:
> > Take an original matrix, called matrix0, calculate its trace.
> > Generate a new matrix, called newmatdf after swapping two
> elements of the
> > old one and calculate the trace. If the trace of the newmatrix
> is
> > smaller than
> > that of the previous matrix, store both the current trace
> together
> > with the older trace and their iteration values. If the newer
> matrix has
> > a trace larger than the previous trace, drop this trace and
> drop this
> > matrix too (but count its iteration).
> > Re-swap the old matrix that you stored previously and
> recalculate the
> > trace. Repeat the
> > process many times, say 10,000. The final results should be a
> list
> > with the original initial matrix and its trace, the final best
> > matrix that had the smallest trace after the 10000 simulations
> and a
> > dataframe showing the values of the accepted traces that
> > were smaller than the previous and their respective iterations.
> > $Original_design
> > x y block genotypes
> > 1 1 1 1 29
> > 7 1 2 1 2
> > 13 1 3 1 8
> > 19 1 4 1 10
> > 25 1 5 1 9
> > 31 1 6 2 29
> > 37 1 7 2 4
> > 43 1 8 2 22
> > 49 1 9 2 3
> > 55 1 10 2 26
> > 61 1 11 3 18
> > 67 1 12 3 19
> > 73 1 13 3 28
> > 79 1 14 3 10
> > ------truncated ----
> > the final results after running funF<-
> > function(newmatdf,n,traceI) given below looks like this:
> > ans1
> > $mat
> > value iterations
> > [1,] 1.474952 0
> > [2,] 1.474748 1
> > [3,] 1.474590 2
> > [4,] 1.474473 3
> > [5,] 1.474411 5
> > [6,] 1.474294 10
> > [7,] 1.474182 16
> > [8,] 1.474058 17
> > [9,] 1.473998 19
> > [10,] 1.473993 22
> > ---truncated
> > $Design_best
> > x y block genotypes
> > 1 1 1 1 29
> > 7 1 2 1 2
> > 13 1 3 1 18
> > 19 1 4 1 10
> > 25 1 5 1 9
> > 31 1 6 2 29
> > 37 1 7 2 21
> > 43 1 8 2 6
> > 49 1 9 2 3
> > 55 1 10 2 26
> > ---- truncated
> > $Original_design
> > x y block genotypes
> > 1 1 1 1 29
> > 7 1 2 1 2
> > 13 1 3 1 8
> > 19 1 4 1 10
> > 25 1 5 1 9
> > 31 1 6 2 29
> > 37 1 7 2 4
> > 43 1 8 2 22
> > 49 1 9 2 3
> > 55 1 10 2 26
> > 61 1 11 3 18
> > 67 1 12 3 19
> > 73 1 13 3 28
> > 79 1 14 3 10
> > ------truncated
> > Regards,
> > Laz
> > [[alternative HTML version deleted]]
> > ______________________________________________
> > [2]R-help at r-project.org mailing list
> > [3]https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting
> > guide [4]http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible
> code.
>
> > References
>
> > 1. mailto:lmramba at ufl.edu
> > 2. mailto:R-help at r-project.org
> > 3. https://stat.ethz.ch/mailman/listinfo/r-help
> > 4. http://www.R-project.org/posting-guide.html
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible
> code.
More information about the R-help
mailing list