[R] Obtaining data from a different row of data frame
    arun 
    smartpink111 at yahoo.com
       
    Sun Sep 22 03:18:47 CEST 2013
    
    
  
Hi Ira,
#Speed comparison 
set.seed(29)
 df2<- data.frame(Dates=seq(as.Date("2006-01-03"),length.out=2000,by="1 day"),cbind(matrix(sample(10:120,2000*300,replace=TRUE),ncol=300),matrix(sample(0:6,2000*300,replace=TRUE),ncol=300)))
 colnames(df2)[2:301]<- paste0("P",1:300)
 colnames(df2)[302:601]<- paste0("OF",1:300)
df2[2000,grep("OF",colnames(df2))]<-0
 df2[1995:1999,grep("OF",colnames(df2))]<-1
df3<- df2
df2[,602:901]<-NA
colnames(df2)[602:901]<- paste0("newPrice",1:300)
dim(df2)
#[1] 2000  901
system.time({
for(j in grep("^P",colnames(df2))) {
 df2[j+600] = df2[df2[,j+300] + row(df2)[,j], j]
 }
})
#  user  system elapsed 
# 12.292   0.256  12.576 
system.time({
indx1<- unlist(df3[,grep("OF",colnames(df3))],use.names=FALSE)
val1<- unlist(df3[,grep("P",colnames(df3))],use.names=FALSE)
 df3[,602:901]<- val1[indx1+seq_along(indx1)]
 colnames(df3)[602:901]<- colnames(df2)[602:901]
})
#  user  system elapsed 
#  0.600   0.016   0.616 
identical(df2,df3)
#[1] TRUE
A.K.
________________________________
From: Ira Sharenow <irasharenow100 at yahoo.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Saturday, September 21, 2013 5:09 PM
Subject: Re: [R] Obtaining data from a different row of data frame
Arun,
Thanks for helping me improve my question.
I made a slight change to the data, so that the coincidence will not occur.
Please note that eventually I will have to write a function for a user. Also after my rewrite, I have better code, but I do not know how to get rid of the loop. Also eventually I will need to generalize and take the average of r rows above and below the row that this algorithm is attempting to retrieve. For example if r = 5 and the row 1 offset gets me to row 100, I will eventually need the price in rows 95 through 105. Yes, I realize that I need to deal with a number of annoying details such as possibly winding up with row -3.
> df1 = data.frame(matrix(rep(NA, 10*7), nrow = 10))
> Dates =  as.Date(c("2006-01-03", "2006-01-04", "2006-01-05", "2006-01-06", "2006-01-09", "2006-01-10", "2006-01-11", 
+                    "2006-01-12", "2006-01-13", "2006-01-16"), format = "%Y-%m-%d")
> P1 = seq(from = 10, by = 3, length.out = 10)
> P2 = seq(from = 100, by = 2, length.out = 10)
> OF1 = c(3,3,4,5,2,2,2,1,1,0)
> OF2 = c(5,3,4,2,1,2,2,1,1,0)
> df1 = data.frame(Dates = Dates, P1 = P1, P2 = P2, OF1 = OF1, OF2 = OF2)
> df1$newPrice1 = rep(NA, 10)
> df1$newPrice2 = rep(NA, 10)
> 
> df1
        Dates P1  P2 OF1 OF2 newPrice1 newPrice2
1  2006-01-03 10 100   3   5        NA        NA
2  2006-01-04 13 102   3   3        NA        NA
3  2006-01-05 16 104   4   4        NA        NA
4  2006-01-06 19 106   5   2        NA        NA
5  2006-01-09 22 108   2   1        NA        NA
6  2006-01-10 25 110   2   2        NA        NA
7  2006-01-11 28 112   2   2        NA        NA
8  2006-01-12 31 114   1   1        NA        NA
9  2006-01-13 34 116   1   1        NA        NA
10 2006-01-16 37 118   0   0        NA        NA
> 
> for(j in 2:3) {
+ df1[j+4] = df1[df1[,j+2] + row(df1)[,j], j]
+ }
> df1
        Dates P1  P2 OF1 OF2 newPrice1 newPrice2
1  2006-01-03 10 100   3   5        19       110
2  2006-01-04 13 102   3   3        22       108
3  2006-01-05 16 104   4   4        28       112
4  2006-01-06 19 106   5   2        34       110
5  2006-01-09 22 108   2   1        28       110
6  2006-01-10 25 110   2   2        31       114
7  2006-01-11 28 112   2   2        34       116
8  2006-01-12 31 114   1   1        34       116
9  2006-01-13 34 116   1   1        37       118
10 2006-01-16 37 118   0   0        37       118
>  
> # Better code. Produces exactly the same results as above.
> for(j in 2:3) {
+ df1[j+4] = df1[df1[,j+2] + row(df1)[,j], j]
+ }
> df1 
On 9/21/2013 1:39 PM, arun wrote:
Hi,
Your example dataset could be confusing as: with(df1,P1+OF1)
# [1] 13 14 16 18 16 17 18 18 19 19
 with(df1,P2+OF2)
# [1] 105 104 106 105 105 107 108 108 109 109 which is the same as:
 df1$newPrice1
# [1] 13 14 16 18 16 17 18 18 19 19
 df1$newPrice2
# [1] 105 104 106 105 105 107 108 108 109 109 ----- Original Message -----
From: Ira Sharenow <irasharenow100 at yahoo.com> To: r-help at r-project.org Cc: 
Sent: Saturday, September 21, 2013 11:30 AM
Subject: [R] Obtaining data from a different row of data frame I have a large data frame with 2,000 rows and 600 columns. I can write 
loops to solve a smaller problem, but I need a better strategy for this 
data frame. Below is a simple example with just two stocks. In the data frame, each row represents a trading day. The first column 
is dates. The next group of columns represents the prices of the stocks 
on the specified dates. The next group of columns represents how many 
trading days I wish to offset. So if the first trading day is 2006-01-03 
and OF1 == 3, then I need to go to row 1+3 and get the price in column 
P1. The result is placed in row 1 of column 6. df1 = data.frame(matrix(rep(NA, 10*7), nrow = 10)) Dates =as.Date(c("2006-01-03", "2006-01-04", "2006-01-05", "2006-01-06", 
"2006-01-09", "2006-01-10", "2006-01-11", "2006-01-12", "2006-01-13", "2006-01-16"), format = "%Y-%m-%d") P1 = 10:19 P2 = 100:109 OF1 = c(3,3,4,5,2,2,2,1,1,0) OF2 = c(5,3,4,2,1,2,2,1,1,0) df1 = data.frame(Dates = Dates, P1 = P1, P2 = P2, OF1 = OF1, OF2 = OF2) df1$newPrice1 = rep(NA, 10) df1$newPrice2 = rep(NA, 10) for(j in 6:7) { for(i in 1:10 ) { rowNumber = i + df1[i,j-2] #print(rowNumber) df1[i,j] = df1[rowNumber, j-4] } # end i loop } # end j loop df1 
>df1 
>Dates P1P2 OF1 OF2 newPrice1 newPrice2 1 2006-01-03 10 1003513105 22006-01-04 11 1013314104 32006-01-05 12 1024416106 42006-01-06 13 1035218105 52006-01-09 14 1042116105 62006-01-10 15 1052217107 72006-01-11 16 1062218108 82006-01-12 17 1071118108 92006-01-13 18 1081119109 10 2006-01-16 19 1090019109     [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.        
    
    
More information about the R-help
mailing list