<html><head></head><body lang="en-GB"><div dir="ltr"></div><div>Great! Sometimes it is hard to turn the problem around, but iterating over the smallest dimension always saves time. Regards to everyone! </div><div><br></div><div><br></div><div><br><div>2016-10-28 15:19 GMT+02:00 Carlos J. Gil Bellosta <span dir="ltr"><<a href="mailto:cgb@datanalytics.com" target="_blank">cgb@datanalytics.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Inspired by Adolfo, here is one more twist:<div><br></div><div># matrices are better than data frames</div><div><div>tmp <- as.matrix(dat)</div><div><br></div><div># min implemented by hand:</div><div>my.cols <- rep(ncol(tmp), nrow(tmp))<br></div><div>for (i in (ncol(tmp) - 1):1) </div><div>  my.cols[!is.na(tmp[,i])] <- i</div><div><br></div><div># collect the values:</div><span class=""><div>my.values <- tmp[cbind(1:nrow(tmp), my.cols)]</div></span></div><div><br></div><div><br></div><div>Regards,</div><div><br></div><div>Carlos J. Gil Bellosta</div><div><a href="http://www.datanalytics.com" target="_blank">http://www.datanalytics.com</a></div></div><div class="gmail_extra"><br><div class="gmail_quote"><div><div class="h5">On 28 October 2016 at 13:48, Adolfo Álvarez <span dir="ltr"><<a href="mailto:adalvarez@gmail.com" target="_blank">adalvarez@gmail.com</a>></span> wrote:<br></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5">Hi everyone, I really liked Carlos's solution: it is very efficient and<br>

very clever in its use of the col() function, which I either did not know or<br>

had forgotten about.<br>

<br>

The "slow" part is still the apply call, which is essentially just a<br>

for loop over the rows. So, inspired by Carlos's method, I thought it<br>

could be faster to iterate over the columns, since that will generally<br>

mean fewer iterations. I have added this modification to the benchmark;<br>

it is a little less elegant than Carlos's original but somewhat faster.<br>

It can surely be improved further in base R, or by bringing in Rcpp,<br>

but I think this is as far as I will go.<br>
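<br>

As a small illustration of that idea, here is a minimal sketch on a made-up 5 x 3 matrix. The names m, by_row, by_col and values, and the data itself, are hypothetical and not part of the benchmark; the sketch also assumes every row has at least one non-NA value. The row-wise apply and the loop over columns should agree on the first non-NA column:<br>

<pre>
```r
# Hypothetical toy data: 5 players (rows) by 3 months (columns).
# Assumption: every row has at least one non-NA value.
m <- cbind(uno  = c(NA, 2, NA, NA, 5),
           dos  = c(1, NA, NA, 4, 6),
           tres = c(7, 8, 9, 10, 3))

# Row-wise: apply() calls the function once per row (5 calls here).
by_row <- apply(m, 1, function(x) min(which(!is.na(x))))

# Column-wise: start from the last column and overwrite with every
# earlier column that holds a value (only ncol(m) - 1 = 2 iterations).
by_col <- rep(ncol(m), nrow(m))
for (j in rev(seq_len(ncol(m) - 1))) {
  by_col[!is.na(m[, j])] <- j
}

# Both approaches should agree on the first non-NA column of each row.
stopifnot(all(by_row == by_col))

# Indexing with a two-column (row, column) matrix picks one value per row.
values <- m[cbind(seq_len(nrow(m)), by_col)]
```
</pre>

With 1e5 rows and 6 columns, as in the benchmark, the column loop runs 5 vectorised assignments instead of 1e5 function calls.<br>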

<br>

Both the problem and the proposed solutions are very interesting. Regards!<br>

Adolfo.<br>

<br>

library(microbenchmark)<br>

library(data.table)<br>

library(dplyr)<br>

library(tidyr)<br>

set.seed(123456)<br>

numero <- 1e5<br>

<span>N <- 1e1<br>

tabla <-<br>

  microbenchmark(<br>

</span><span>    JVG ={<br>

      dat <-<br>

        data.table( Uno    = sample( c(runif(numero) , rep(NA , numero /2e0<br>

        )) , size = numero ) ,<br>

        dos    = sample( c(runif(numero) , rep(NA , numero /1e1<br>

        )) , size = numero ) ,<br>

        tres   = sample( c(runif(numero) , rep(NA , numero /2e1<br>

        )) , size = numero ) ,<br>

        cuatro = sample( c(runif(numero) , rep(NA , numero /1e2<br>

        )) , size = numero ) ,<br>

        cinco  = sample( c(runif(numero) , rep(NA , numero /2e2<br>

        )) , size = numero ) ,<br>

        seis   = sample( c(runif(numero) , rep(NA , numero /1e3<br>

        )) , size = numero )<br>

        )<br>

</span>    First_month <-<br>

<div><div class="m_-4634798612048657397h5">      apply(X = dat,  MARGIN = 1, FUN =<br>

              function(x){<br>

                return(   min(  which( !is.na(x)  ),  na.rm = TRUE ) )<br>

              }<br>

      )<br>

      dat[ , First_month := First_month]<br>

      N_for <- length( unique(First_month ))<br>

      for( j in 1:N_for){<br>

        x <- dat[  First_month == j,  j,  with = FALSE]<br>

        dat[ First_month == j , Value_First_month := x ]<br>

      }<br>

    },<br>

    Olivier ={<br>

      dat <-<br>

        data.table( Uno    = sample( c(runif(numero) , rep(NA , numero /2e0<br>

        )) , size = numero ) ,<br>

        dos    = sample( c(runif(numero) , rep(NA , numero /1e1<br>

        )) , size = numero ) ,<br>

        tres   = sample( c(runif(numero) , rep(NA , numero /2e1<br>

        )) , size = numero ) ,<br>

        cuatro = sample( c(runif(numero) , rep(NA , numero /1e2<br>

        )) , size = numero ) ,<br>

        cinco  = sample( c(runif(numero) , rep(NA , numero /2e2<br>

        )) , size = numero ) ,<br>

        seis   = sample( c(runif(numero) , rep(NA , numero /1e3<br>

        )) , size = numero )<br>

        )<br>

      dat[,First_month       := apply(X = .SD,MARGIN = 1,FUN = function(x)<br>

        colnames(.SD)[min(which(!is.na(x)))])]<br>

      dat[,Value_First_month := apply(X = .SD,MARGIN = 1,FUN = function(x)<br>

        x[min(which(!is.na(x)))])]<br>

    },<br>

    Olivier2={<br>

      dat <-<br>

        data.table( Uno    = sample( c(runif(numero) , rep(NA , numero /2e0<br>

        )) , size = numero ) ,<br>

        dos    = sample( c(runif(numero) , rep(NA , numero /1e1<br>

        )) , size = numero ) ,<br>

        tres   = sample( c(runif(numero) , rep(NA , numero /2e1<br>

        )) , size = numero ) ,<br>

        cuatro = sample( c(runif(numero) , rep(NA , numero /1e2<br>

        )) , size = numero ) ,<br>

        cinco  = sample( c(runif(numero) , rep(NA , numero /2e2<br>

        )) , size = numero ) ,<br>

        seis   = sample( c(runif(numero) , rep(NA , numero /1e3<br>

        )) , size = numero )<br>

        )<br>

<br>

      dat[,jugador:=1:.N]<br>

      dat2=melt(dat,id.vars="jugador")<br>

      setkey(dat2,jugador)<br>

      dat2[,index:=min(which(!is.na(value))),by=jugador]<br>

      dat3 <- dat2[,list(First_month_Olivier=variable[index[1]],<br>

                         Value_First_month_Olivier=value[index[1]]),by=jugador]<br>

      setkey(x = dat, jugador)<br>

      dat0 <- merge( x = dat, y = dat3, all.x = TRUE, all.y = FALSE)<br>

<br>

    },<br>

<br>

</div></div>    Adolfo = {<br>

<span><br>

      dat <-<br>

        data.table( Uno    = sample( c(runif(numero) , rep(NA , numero /2e0<br>

        )) , size = numero ) ,<br>

        dos    = sample( c(runif(numero) , rep(NA , numero /1e1<br>

        )) , size = numero ) ,<br>

        tres   = sample( c(runif(numero) , rep(NA , numero /2e1<br>

        )) , size = numero ) ,<br>

        cuatro = sample( c(runif(numero) , rep(NA , numero /1e2<br>

        )) , size = numero ) ,<br>

        cinco  = sample( c(runif(numero) , rep(NA , numero /2e2<br>

        )) , size = numero ) ,<br>

        seis   = sample( c(runif(numero) , rep(NA , numero /1e3<br>

        )) , size = numero )<br>

        )<br>

</span><span>      # 1) Create a column identifying the players.<br>

      # Since there is one player per row, we use 1:nrow.<br>

      step1 <- dat %>%<br>

        mutate(player = 1:nrow(dat))<br>

<br>

      # 2) Convert the time columns (uno, dos, tres, ...) into two<br>

</span>      # columns: month and number of games. (Note: we assume the<br>

      # columns in the data are ordered as in the example, that is,<br>

      # uno, dos, tres and not tres, uno, dos.)<br>

      #<br>

<span>      step2 <- gather(step1, month, games, -player)<br>

<br>

      # and 3) Filter out the months with NA and keep, for each player,<br>

</span>      # the first value:<br>

<span>      step3 <- step2 %>%<br>

        filter(!is.na(games)) %>%<br>

        group_by(player) %>%<br>

        slice(1)<br>

</span>    },<br>

<br>

    Olivier3 = {<br>

<span>      dat <-<br>

        data.table( Uno    = sample( c(runif(numero) , rep(NA , numero /2e0<br>

        )) , size = numero ) ,<br>

        dos    = sample( c(runif(numero) , rep(NA , numero /1e1<br>

        )) , size = numero ) ,<br>

        tres   = sample( c(runif(numero) , rep(NA , numero /2e1<br>

        )) , size = numero ) ,<br>

        cuatro = sample( c(runif(numero) , rep(NA , numero /1e2<br>

        )) , size = numero ) ,<br>

        cinco  = sample( c(runif(numero) , rep(NA , numero /2e2<br>

        )) , size = numero ) ,<br>

        seis   = sample( c(runif(numero) , rep(NA , numero /1e3<br>

        )) , size = numero )<br>

        )<br>

</span><span>      M=as.matrix(dat)<br>

      index <- which(!is.na(M)) - 1<br>

      meses<-colnames(M)<br>

      M2<- data.table(columna=index %/% nrow(M) +1L, jugador=index %%<br>

                        nrow(M) +1L , valor=M[index+1L])<br>

      setkey(M2,jugador,columna)<br>

<br>

<br>

</span><span>      M2[,.(First_month=meses[columna[1]],Value_First_month=valor[1]),by=jugador]<br>

    },<br>

    GilBellosta = {<br>

<br>

      dat <-<br>

</span>        data.frame( Uno    = sample( c(runif(numero) , rep(NA , numero /2e0<br>

<span>        )) , size = numero ) ,<br>

        dos    = sample( c(runif(numero) , rep(NA , numero /1e1<br>

        )) , size = numero ) ,<br>

        tres   = sample( c(runif(numero) , rep(NA , numero /2e1<br>

        )) , size = numero ) ,<br>

        cuatro = sample( c(runif(numero) , rep(NA , numero /1e2<br>

        )) , size = numero ) ,<br>

        cinco  = sample( c(runif(numero) , rep(NA , numero /2e2<br>

        )) , size = numero ) ,<br>

        seis   = sample( c(runif(numero) , rep(NA , numero /1e3<br>

        )) , size = numero )<br>

        )<br>

</span><span>      tmp <- (as.matrix(dat))<br>

      cols <- col(tmp)<br>

      cols[is.na(tmp)] <- Inf<br>

      my.cols <- apply(cols, 1, min)<br>

      my.values <- tmp[cbind(1:nrow(tmp), my.cols)]<br>

</span>    },<br>

    Adolfo2 = {<br>

      dat <-<br>

        data.frame( Uno    = sample( c(runif(numero) , rep(NA , numero /2e0<br>

<span>        )) , size = numero ) ,<br>

        dos    = sample( c(runif(numero) , rep(NA , numero /1e1<br>

        )) , size = numero ) ,<br>

        tres   = sample( c(runif(numero) , rep(NA , numero /2e1<br>

        )) , size = numero ) ,<br>

        cuatro = sample( c(runif(numero) , rep(NA , numero /1e2<br>

        )) , size = numero ) ,<br>

        cinco  = sample( c(runif(numero) , rep(NA , numero /2e2<br>

        )) , size = numero ) ,<br>

        seis   = sample( c(runif(numero) , rep(NA , numero /1e3<br>

        )) , size = numero )<br>

        )<br>

</span><span>      tmp <- (as.matrix(dat))<br>

      cols <- col(tmp)<br>

</span>      cols[is.na(tmp)] <- NA<br>

      my.cols <- cols[,ncol(cols)]<br>

      for (j in (ncol(cols)-1):1){<br>

        my.cols <- ifelse(is.na(cols[,j]), my.cols, cols[,j])<br>

<span>      }<br>

      my.values <- tmp[cbind(1:nrow(tmp), my.cols)]<br>

</span><span>    },<br>

    times = N, unit = "s")<br>

<br>

> tabla<br>

</span>Unit: seconds<br>

        expr       min        lq      mean    median        uq       max neval<br>

         JVG 1.0458327 1.3045354 1.3660296 1.3486868 1.4004353 2.0389759    10<br>

     Olivier 4.4031746 4.6501372 4.9638930 4.9841975 5.2855783 5.5569627    10<br>

    Olivier2 1.7937688 2.1531256 2.4749540 2.5052893 2.8389349 3.0933835    10<br>

      Adolfo 0.3520900 0.3615358 0.4764479 0.3942295 0.5072621 1.0266727    10<br>

    Olivier3 0.3936536 0.4454847 0.5254894 0.4784246 0.5269834 0.8900983    10<br>

 GilBellosta 0.2721629 0.3097020 0.3901691 0.3466332 0.4294069 0.7126116    10<br>

     Adolfo2 0.1110292 0.1611071 0.1812212 0.1639743 0.2007791 0.2948245    10<br>
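<br>

To read the table as relative timings, the medians can be divided by the fastest one. This is only a sketch: the numbers are copied by hand from the output above, and the names medians and relative are illustrative.<br>

<pre>
```r
# Median timings in seconds, copied from the benchmark output above.
medians <- c(JVG = 1.3486868, Olivier = 4.9841975, Olivier2 = 2.5052893,
             Adolfo = 0.3942295, Olivier3 = 0.4784246,
             GilBellosta = 0.3466332, Adolfo2 = 0.1639743)

# Express each median as a multiple of the fastest method's median.
fastest <- names(which.min(medians))
relative <- round(medians / medians[[fastest]], 1)

# Show the methods from fastest to slowest.
print(sort(relative))
```
</pre>

On these figures Adolfo2 is about 8 times faster than JVG and about 30 times faster than Olivier, though with only 10 runs per method the ratios are rough.<br>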

</div></div><div class="m_-4634798612048657397HOEnZb"><div class="m_-4634798612048657397h5"><br>


<br>

_______________________________________________<br>

R-help-es mailing list<br>

<a href="mailto:R-help-es@r-project.org" target="_blank">R-help-es@r-project.org</a><br>

<a href="https://stat.ethz.ch/mailman/listinfo/r-help-es" rel="noreferrer" target="_blank">https://stat.ethz.ch/mailman/listinfo/r-help-es</a><br>

</div></div></blockquote></div><br></div>

</blockquote></div><br></div>

<br></body></html>