[R] Split data frame into 250-row chunks
David Winsemius
dwinsemius at comcast.net
Wed Jun 10 21:18:13 CEST 2015
On Jun 10, 2015, at 5:39 AM, Liz Hare wrote:
> Hi R-Experts,
>
> I have a data.frame like this:
>
>> head(map)
> chr snp poscm posbp dist
> 1 1 M1 2.99043 3249189 NA
> 2 1 M2 3.06457 3273096 0.07414
> 3 1 M3 3.17018 3307151 0.10561
> 4 1 M4 3.20892 3319643 0.03874
> 5 1 M5 3.28120 3342947 0.07228
> 6 1 M6 3.29624 3347798 0.01504
>
> I need to split this into chunks of 250 rows (there will usually be a last chunk with < 250 rows).
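split( map, trunc( 0:(nrow(map)-1)/250 ) )
Untested. Dividing the zero-based row positions by the chunk size (250) and truncating gives one group number per row, so this should return a list of data frames whose names start at "0". The same idea with a chunk size of 5: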
> trunc( 0:19/5)
[1] 0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
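
A minimal sketch of the same grouping on a full data frame, in case it helps to see it run end to end. The 'map' below is a made-up stand-in with 1100 rows (not the poster's data), and chunk_size and grp are names introduced here only for illustration. Integer division (%/%) is used in place of trunc(); for non-negative positions the two give the same group numbers.

```r
# Hypothetical stand-in for the poster's 'map' data frame (made-up values).
set.seed(1)
n <- 1100
map <- data.frame(chr   = 1,
                  snp   = paste0("M", seq_len(n)),
                  poscm = sort(runif(n, 0, 100)))

chunk_size <- 250

# Integer-divide the zero-based row positions by the chunk size:
# rows 1-250 fall in group 0, rows 251-500 in group 1, and so on.
grp <- (seq_len(nrow(map)) - 1) %/% chunk_size

# split() on a data frame returns a list of data frames, one per group.
chunks <- split(map, grp)

length(chunks)         # number of chunks
sapply(chunks, nrow)   # rows per chunk; the last one may be shorter
```

Individual pieces can then be pulled out by name, e.g. chunks[["0"]] for the first 250 rows, or processed in one go with lapply(chunks, ...).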
>
> If I only had to extract one 250-line chunk, it would be easy:
>
> map1 <- map[1:250, ]
>
> or using subset().
>
> I tried to make it a loop iterating through num and using beg and nd for starting and ending indices, but I couldn’t figure out how to reference all the variables I needed in this:
>
>> chunks
> beg nd let num
> 1 1 250 a 1
> 2 251 500 b 2
> 3 501 750 c 3
> 4 751 1000 d 4
> 5 1001 1250 e 5
> 6 1251 1500 f 6
> 7 1501 1750 g 7
> 8 1751 2000 h 8
> 9 2001 2250 i 9
> 10 2251 2500 j 10
> …
>
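
If the start/end table is the preferred route, here is a rough sketch of how the beg and nd columns could drive the subsetting without an explicit for loop. The data frame is again a made-up stand-in (600 rows), and chunk_size and pieces are hypothetical names; beg, nd and the letter labels mirror the columns of the 'chunks' table quoted above.

```r
# Hypothetical data frame standing in for 'map' (made-up values).
map <- data.frame(snp = paste0("M", 1:600), posbp = 1:600)

chunk_size <- 250

# Start index of each chunk, and the matching end index,
# clamped so the last chunk stops at the final row.
beg <- seq(1, nrow(map), by = chunk_size)
nd  <- pmin(beg + chunk_size - 1, nrow(map))

# Walk the start and end indices in parallel, one subset per pair.
pieces <- Map(function(b, e) map[b:e, , drop = FALSE], beg, nd)

# Label the pieces a, b, c, ... like the 'let' column.
names(pieces) <- letters[seq_along(pieces)]

sapply(pieces, nrow)   # rows per piece
```

Map() pairs up beg and nd element by element, so no loop counter is needed to look up both values for a given chunk.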
> Remembering that loops are not always the best answer in R, I looked at other options like split, following the example below, but I was not able to adapt it from a vector to a data.frame:
> http://stackoverflow.com/questions/3318333/split-a-vector-into-chunks-in-r
> (Yes, I’ve reviewed the language documentation.) I checked out ddply and data.table, but couldn’t find a way to use them with index positions instead of column values.
>
> Thanks,
> Liz
David Winsemius
Alameda, CA, USA