[R] how to use vectorization instead of for loop

Dalthorp, Daniel ddalthorp at usgs.gov
Mon Mar 21 23:09:50 CET 2016


or simpler and faster:

# your original loop (differs only where dat[,2] is exactly 0: sign() gives 0 there)
dat[,4] <- sign(dat[,2])/dat[,3]

# append a new column with an indicator for which rows have dat[,2] == Inf
dat <- cbind(dat, dat[,2] == Inf)
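
For anyone who wants to try this, here is a minimal sketch on toy data modelled on the dput() in the original question. The "geneInf" row, its p-value, and the column names is_inf and new.value.keep_inf are made up for illustration. It uses is.infinite(), which also flags -Inf; use dat[,2] == Inf instead if only +Inf should be caught.

```r
# Toy data modelled on the dput() in the original question; the Inf row
# ("geneInf") is hypothetical and added here purely for illustration.
dat <- data.frame(
  gene_id = c("0610005C13Rik", "0610007P14Rik", "0610009B22Rik", "geneInf"),
  log2.fold_change. = c(0.0114463, -0.0960262, 0.00805151, Inf),
  p_value = c(1, 0.77915, 0.98265, 0.5),
  stringsAsFactors = FALSE
)

# Vectorized replacement for the loop: +1/p where the fold change is
# positive, -1/p otherwise (this matches the loop's else branch at zero).
dat$new.value <- ifelse(dat[, 2] > 0, 1, -1) / dat[, 3]

# Logical indicator column flagging rows whose fold change is infinite.
dat$is_inf <- is.infinite(dat[, 2])

# Alternative: carry the infinite value through to a new column instead of
# computing 1/p for those rows.
dat$new.value.keep_inf <- ifelse(is.infinite(dat[, 2]), dat[, 2], dat$new.value)

print(dat)
```

Whether the Inf rows should get an indicator column or keep Inf in the result column depends on what you do with them downstream; the sketch shows both so you can drop whichever you don't need.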


On Mon, Mar 21, 2016 at 2:45 PM, <ruipbarradas at sapo.pt> wrote:

> Hello,
>
> Use combined ifelses, more or less like the following.
>
> ifelse(dat[, 2] == Inf, do this, ifelse(dat[, 2] > 0, 1 * (1/dat[,3]),
> -1* (1/dat[,3])))
>
> Rui Barradas
>
>
> Quoting Stephen HK WONG <honkit at stanford.edu>:
>
> > Thanks so much, Rui; the code is so simple and fast.
> >
> > By the way, ifelse is good for two conditions (in my case, either >0
> > or <0), but I found there are a lot of rows with the value "Inf". I
> > want to keep that value in the new column; how do I do that using ifelse?
> >
> > Thanks.
> >
> > ________________________________________
> > From: ruipbarradas at sapo.pt <ruipbarradas at sapo.pt>
> > Sent: Monday, March 21, 2016 11:50 AM
> > To: Stephen HK WONG
> > Cc: r-help at r-project.org
> > Subject: Re: [R] how to use vectorization instead of for loop
> >
> > Hello,
> >
> > I've renamed your dataframe to 'dat'. Since ?ifelse is vectorized, try
> >
> > dat[, 4] <- ifelse(dat[, 2] > 0, 1 * (1/dat[,3]), -1* (1/dat[,3]))
> >
> > Oh, and why do you multiply by 1 and by -1?
> > It would simply be 1/dat[,3] and -1/dat[,3].
> >
> > Hope this helps,
> >
> > Rui Barradas
> >
> > Quoting Stephen HK WONG <honkit at stanford.edu>:
> >> Dear All,
> >>
> >> I have a data frame like the one below, but with many thousands of rows:
> >>
> >> structure(list(gene_id = structure(1:6, .Label = c("0610005C13Rik",
> >> "0610007P14Rik", "0610009B22Rik", "0610009L18Rik", "0610009O20Rik",
> >> "0610010B08Rik,OTTMUSG00000016609"), class = "factor"),
> >> log2.fold_change. = c(0.0114463,
> >> -0.0960262, 0.00805151, -0.179981, -0.0629098, 0.155979), p_value = c(1,
> >> 0.77915, 0.98265, 0.68665, 0.85035, 0.72235), new.value = c("NA",
> >> "NA", "NA", "NA", "NA", "NA")), .Names = c("gene_id", "log2.fold_change.",
> >> "p_value", "new.value"), row.names = c(NA, 6L), class = "data.frame")
> >>
> >> I want to check whether the value in the second column is positive or
> >> negative, then do a calculation and put the new value in the last
> >> column. I can do this with a for loop like the one below, but it is not
> >> efficient. Is there a better way that uses vectorization instead of a loop?
> >> Many thanks!
> >>
> >> for (i in 1:nrow(dataframe)) {
> >>   # positive fold change: 1/p_value; otherwise: -1/p_value
> >>   if (dataframe[i, 2] > 0) {
> >>     dataframe[i, 4] <- 1 * (1/dataframe[i, 3])
> >>   } else {
> >>     dataframe[i, 4] <- -1 * (1/dataframe[i, 3])
> >>   }
> >> }
> >>
> >> -------------------------------------------------------
> >>
> >> Stephen H.K. WONG, PhD.
> >>
> >> Stanford University
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> >
>




-- 
Dan Dalthorp, PhD
USGS Forest and Rangeland Ecosystem Science Center
Forest Sciences Lab, Rm 189
3200 SW Jefferson Way
Corvallis, OR 97331
ph: 541-750-0953
ddalthorp at usgs.gov
