[R] subsetting data by rows (every 69 rows)

arun smartpink111 at yahoo.com
Tue Aug 27 03:12:49 CEST 2013


Hi R.L.
No problem.

Try this:
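# Slightly modified example: the data now contain Inf and -Inf
set.seed(24)
dat1 <- as.data.frame(matrix(sample(c(1:10, Inf, -Inf), 2000 * 3, replace = TRUE), ncol = 3))
# Split into consecutive blocks of 69 rows
lst1 <- split(dat1, ((seq_len(nrow(dat1)) - 1) %/% 69) + 1)
# Rename the columns of every block
lst2 <- lapply(lst1, function(x) {colnames(x) <- letters[1:3]; x})
# Replace infinite values with 0 in every block
lst3 <- lapply(lst2, function(x) {x[x == Inf | x == -Inf] <- 0; x})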
tail(lst2[[29]],7)
#        a   b  c
#1994 -Inf Inf  5
#1995   10   5  7
#1996    5   5  5
#1997    6 Inf 10
#1998   10   7  4
#1999   10 Inf  1
#2000  Inf   8  2
 tail(lst3[[29]],7)
#      a b  c
#1994  0 0  5
#1995 10 5  7
#1996  5 5  5
#1997  6 0 10
#1998 10 7  4
#1999 10 0  1
#2000  0 8  2
A.K.
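
A note on the replacement step: instead of comparing against Inf and -Inf separately, is.infinite() can be applied column by column. The following is only a minimal sketch and not part of the original reply; the helper name zero_infinite is made up for illustration, and it assumes every column is numeric.

```r
# Same kind of example data as above: three columns that contain Inf and -Inf
set.seed(24)
dat1 <- as.data.frame(matrix(sample(c(1:10, Inf, -Inf), 2000 * 3, replace = TRUE), ncol = 3))
lst1 <- split(dat1, (seq_len(nrow(dat1)) - 1) %/% 69 + 1)

# Hypothetical helper: set infinite values to 0, one column at a time.
# Assigning into df[] keeps the result a data frame with the same names.
zero_infinite <- function(df) {
  df[] <- lapply(df, function(col) replace(col, is.infinite(col), 0))
  df
}

lst3 <- lapply(lst1, zero_infinite)

# Check: no infinite values remain in any block
any(sapply(lst3, function(d) any(sapply(d, is.infinite))))
```

Working column by column leaves NA values untouched, because is.infinite() is FALSE for NA.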





Thank you AK! 
So the final result 'res' is a list of data frames! 
Actually, my initial idea was to replace infinite values with zero in the 
list, and that was why I wished to convert the elements to data frames, so 
that I could do the replacement with the following code: 
list [ list == Inf ] = 0 
list [ list == -Inf ] = 0 
However, even though the elements of the list are now data frames, I still 
received the error message: '(list) object cannot be coerced to type 
'double'' 
Does it mean I need to convert the whole list to a data frame rather 
than each element? If so, how could I do so? (I tried 
'as.data.frame(list)', which didn't work since the last element had fewer 
rows.) 

Your contribution is highly appreciated! 

Best regards, 
R.L. 
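
On the question of converting the whole list: that is not required, because lapply() can apply the replacement to each data frame in turn. If a single data frame is wanted anyway, one possible sketch is to stack the pieces with do.call(rbind, ...), which does not need the pieces to have the same number of rows. The toy data and object names below are made up for illustration.

```r
# Toy data: 5 rows, split into pieces of 2, 2 and 1 rows
dat <- data.frame(a = c(1, Inf, 3, 4, -Inf), b = c(Inf, 2, 3, 4, 5))
lst <- split(dat, (seq_len(nrow(dat)) - 1) %/% 2 + 1)

# Stack the pieces back into one data frame; unequal row counts are fine
combined <- do.call(rbind, lst)

# On a single data frame the comparison returns a logical matrix,
# so the replacement can be done in one step
combined[combined == Inf | combined == -Inf] <- 0
combined
```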


----- Original Message -----
From: arun <smartpink111 at yahoo.com>
To: R help <r-help at r-project.org>
Cc: 
Sent: Monday, August 26, 2013 10:46 AM
Subject: Re: subsetting data by rows (every 69 rows)

Hi R.L.,

No problem.

You may try:
set.seed(24)
dat1 <- as.data.frame(matrix(sample(1:10, 2000 * 3, replace = TRUE), ncol = 3))
lst1 <- split(dat1, ((seq_len(nrow(dat1)) - 1) %/% 69) + 1)
lst2 <- lapply(lst1, function(x) {colnames(x) <- letters[1:3]; x})
# Add z = (a - b) / c as a new column in every block
res <- lapply(lst2, function(x) {x$z <- with(x, (a - b) / c); x})
head(res[[1]],3)
#  a b c        z
#1 3 1 8 0.250000
#2 3 2 2 0.500000
#3 8 3 3 1.666667

A.K.



Thank you very much for your help AK. The code works efficiently! 

Just a follow-up question: do you happen to know how to 
select certain columns in each element (since I need to apply a 
calculation to multiple columns for each element of the list)? 
For example, list[1] looks like: 
$`1` 
        a       b        c 
1    2.1    1.4    3.4 
2    4.4    2.6    5.5 
3    2.6    0.4    3.0 
... 

$`2` 
          a       b        c 
70    5.1    4.9    5.1 
71    4.4    7.6    8.5 
72    2.8    3.5    6.8 
... 

What I wish to do is something like 
z = (a-b) / c 
for each element ($`1`, $`2`, ...). 

I tried the following code: 
## there are 23 elements in the list (sorry, in fact I have 1566 rows in 
## total in my sample) 
for( i in 1:23) { 
z = (list[[i]]$a - list[[i]]$b) / list[[i]]$c 
} 
which gave me only 49 values, rather than 1566 values. 

Thank you very much! 

Kind regards, 
R.L 
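
The loop in the message above assigns to the same z on every pass, so only the result for the last element survives. A minimal sketch of a loop that keeps every result is given below; the data are simulated and the names dat, lst and z_list are made up for illustration.

```r
# Simulated stand-in for the real data: 1566 rows, columns a, b, c
set.seed(1)
dat <- data.frame(a = runif(1566), b = runif(1566), c = runif(1566, 1, 2))
lst <- split(dat, (seq_len(nrow(dat)) - 1) %/% 69 + 1)   # 23 blocks

# Store one result per block instead of overwriting a single z
z_list <- vector("list", length(lst))
for (i in seq_along(lst)) {
  z_list[[i]] <- (lst[[i]]$a - lst[[i]]$b) / lst[[i]]$c
}

# Concatenate the per-block results into one vector
z <- unlist(z_list)
length(z)
```

With 1566 rows the combined vector has 1566 values, in the original row order.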


----- Original Message -----
From: arun <smartpink111 at yahoo.com>
To: R help <r-help at r-project.org>
Cc: 
Sent: Sunday, August 25, 2013 1:28 PM
Subject: Re: subsetting data by rows (every 69 rows)

#or you could try:

 lst2<- split(dat1,as.numeric(gl(69,69,2000)))
# identical(lst1,lst2)
#[1] TRUE
A.K.
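
For reference, the first argument of gl() is the number of levels. With 2000 rows only 29 of the 69 requested levels are ever used, which is harmless here because as.numeric() keeps just the integer codes. The sketch below compares three ways of building the same grouping index; the variable names are made up for illustration.

```r
n <- 2000
size <- 69

# Three ways to number consecutive blocks of `size` rows
g_intdiv  <- (seq_len(n) - 1) %/% size + 1
g_ceiling <- ceiling(seq_len(n) / size)
g_gl      <- as.numeric(gl(ceiling(n / size), size, n))

all(g_intdiv == g_ceiling)   # TRUE
all(g_intdiv == g_gl)        # TRUE
table(g_intdiv)[["29"]]      # 68 rows in the last block
```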




----- Original Message -----
From: arun <smartpink111 at yahoo.com>
To: R help <r-help at r-project.org>
Cc: 
Sent: Sunday, August 25, 2013 1:17 PM
Subject: Re: subsetting data by rows (every 69 rows)



Hi,
Try:
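set.seed(24)
dat1 <- as.data.frame(matrix(sample(1:400, 2000 * 16, replace = TRUE), ncol = 16))
# Group index: rows 1-69 -> 1, rows 70-138 -> 2, and so on
lst1 <- split(dat1, ((seq_len(nrow(dat1)) - 1) %/% 69) + 1)
# First and last row number of each block
sapply(lst1, function(x) range(as.numeric(row.names(x))))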
#    1   2   3   4   5   6   7   8   9  10  11  12  13  14   15   16   17   18
#[1,]  1  70 139 208 277 346 415 484 553 622 691 760 829 898  967 1036 1105 1174
#[2,] 69 138 207 276 345 414 483 552 621 690 759 828 897 966 1035 1104 1173 1242
#       19   20   21   22   23   24   25   26   27   28   29
#[1,] 1243 1312 1381 1450 1519 1588 1657 1726 1795 1864 1933
#[2,] 1311 1380 1449 1518 1587 1656 1725 1794 1863 1932 2000
 str(lst1[[1]])
#'data.frame':    69 obs. of  16 variables:
# $ V1 : int  118 90 282 208 266 369 112 306 321 102 ...
# $ V2 : int  6 50 115 247 355 109 39 297 35 209 ...
# $ V3 : int  313 67 102 298 367 23 376 91 5 38 ...
# $ V4 : int  207 351 212 342 255 399 239 57 234 79 ...
# $ V5 : int  74 80 289 165 231 193 310 255 98 218 ...
# $ V6 : int  99 91 325 143 398 66 201 337 66 382 ...
# $ V7 : int  339 327 325 274 22 105 106 75 400 167 ...
# $ V8 : int  135 233 91 306 230 140 233 166 210 351 ...
# $ V9 : int  204 203 256 337 25 295 214 288 63 388 ...
# $ V10: int  370 328 161 227 381 164 300 313 303 375 ...
# $ V11: int  171 373 133 345 60 119 215 48 55 367 ...
# $ V12: int  118 309 67 250 286 127 171 248 46 20 ...
# $ V13: int  385 15 282 276 130 166 160 214 58 74 ...
# $ V14: int  90 165 39 154 294 84 106 367 359 145 ...
# $ V15: int  392 290 103 14 111 148 200 331 302 88 ...
# $ V16: int  323 210 167 345 249 325 217 171 150 223 ...
 sapply(lst1,nrow)
# 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 
#69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 69 
#27 28 29 
#69 69 68 


A.K.
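
If the same split is needed for several data frames or block sizes, the idiom above could be wrapped in a small function. This is only a sketch; the name split_rows is hypothetical and not from the thread.

```r
# Hypothetical helper: split a data frame into consecutive blocks of
# `size` rows; the last block may be shorter.
split_rows <- function(df, size) {
  stopifnot(is.data.frame(df), size >= 1)
  grp <- (seq_len(nrow(df)) - 1) %/% size + 1
  split(df, grp)
}

# mtcars ships with R and has 32 rows
blocks <- split_rows(mtcars, 10)
sapply(blocks, nrow)   # block sizes: 10, 10, 10, 2
```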



Hi There, 

It might be a simple problem but I didn't find a clear solution online. 
The task is quite straightforward -- I have a large data frame with 
more than 2000 rows and 16 columns. For further analysis, I need to 
subset every 69 rows into some new data frames. 

I tried to use a "for" loop (code shown below): 

n = nrow(data) 
w = 69 
for(i in 1:(n-w)){ 
  data= data[i:(i+w),] 
  } 

But it only gave me a subset with the last 69 rows. 
So my question is how to split the whole data frame into blocks of 69 rows (1st to 69th rows, 70th to 138th rows, etc.). 

Any help will be appreciated.
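
A note on the loop above: it overwrites data on every iteration, so nothing from earlier passes is kept, and i:(i+w) selects 70 rows rather than 69. A minimal sketch of a loop that stores each block in a list instead is shown below. The data frame is a made-up stand-in for the real one, and the names starts and chunks are chosen for illustration.

```r
# Made-up stand-in for the real data frame
data <- data.frame(id = 1:2000, value = rnorm(2000))

n <- nrow(data)
w <- 69

# First row of each block: 1, 70, 139, ...
starts <- seq(1, n, by = w)

# Keep every block in its own list slot instead of overwriting `data`
chunks <- vector("list", length(starts))
for (k in seq_along(starts)) {
  rows <- starts[k]:min(starts[k] + w - 1, n)   # last block may be shorter
  chunks[[k]] <- data[rows, ]
}

length(chunks)        # 29 blocks
nrow(chunks[[29]])    # 68 rows in the last block
```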


