[R] Split data frame into 250-row chunks
David Winsemius
dwinsemius at comcast.net
Wed Jun 10 21:33:22 CEST 2015
On Jun 10, 2015, at 12:18 PM, David Winsemius wrote:
>
> On Jun 10, 2015, at 5:39 AM, Liz Hare wrote:
>
>> Hi R-Experts,
>>
>> I have a data.frame like this:
>>
>>> head(map)
>>   chr snp   poscm   posbp    dist
>> 1   1  M1 2.99043 3249189      NA
>> 2   1  M2 3.06457 3273096 0.07414
>> 3   1  M3 3.17018 3307151 0.10561
>> 4   1  M4 3.20892 3319643 0.03874
>> 5   1  M5 3.28120 3342947 0.07228
>> 6   1  M6 3.29624 3347798 0.01504
>>
>> I need to split this into chunks of 250 rows (there will usually be a last chunk with < 250 rows).
>
> split( map, trunc( 0:(nrow(map)-1 )/nrow(map) ) )
>
> Untested. Designed to return a list with indices starting at "0".
Looking at Marc Schwartz's answer (a smarter man than I), I see that the divisor should have been the chunk size, 250, rather than nrow(map):
split( map, trunc( 0:(nrow(map)-1 )/250) )
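
A minimal sketch of how the corrected call behaves, assuming a made-up data frame `map_demo` of 620 rows standing in for `map` (the name, the row count, and the column values are invented for illustration only):

```r
# Hypothetical stand-in for `map`: 620 rows, so the last chunk is short.
map_demo <- data.frame(
  chr   = 1,
  snp   = paste0("M", 1:620),
  posbp = seq(from = 3249189, by = 1000, length.out = 620)
)

chunk_size <- 250

# Divide the zero-based row positions by the chunk size and truncate:
# rows 1-250 get index 0, rows 251-500 get index 1, and so on.
grp <- trunc(0:(nrow(map_demo) - 1) / chunk_size)

# split() returns a list of data frames, one per distinct index value.
chunks <- split(map_demo, grp)

names(chunks)          # list names come from the index values
sapply(chunks, nrow)   # number of rows in each chunk
```

Each piece can then be pulled out by name, e.g. `chunks[["0"]]`. An integer-division form such as `(seq_len(nrow(map_demo)) - 1) %/% chunk_size` should give the same grouping index.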
--
David.
>
>> trunc( 0:19/5)
> [1] 0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
>
>
>
>>
>> If I only had to extract one 250-line chunk, it would be easy:
>>
>> map1 <- map[1:250, ]
>>
>> or using subset().
>>
>> I tried to make it a loop iterating through num and using beg and nd for starting and ending indices, but I couldn’t figure out how to reference all the variables I needed in this:
>>
>>> chunks
>>     beg   nd let num
>> 1     1  250   a   1
>> 2   251  500   b   2
>> 3   501  750   c   3
>> 4   751 1000   d   4
>> 5  1001 1250   e   5
>> 6  1251 1500   f   6
>> 7  1501 1750   g   7
>> 8  1751 2000   h   8
>> 9  2001 2250   i   9
>> 10 2251 2500   j  10
>> …
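
For what it is worth, an index table like the one above can also be used directly. This is only a sketch under assumptions: `map_demo` and `chunk_idx` are hypothetical stand-ins for `map` and `chunks`, built here with 620 rows so the example is self-contained.

```r
# Hypothetical stand-in for `map` (620 rows, invented values).
map_demo <- data.frame(snp = paste0("M", 1:620), poscm = seq_len(620) / 10)

# Rebuild an index table shaped like `chunks` in the question.
beg <- seq(1, nrow(map_demo), by = 250)
nd  <- pmin(beg + 249, nrow(map_demo))   # clip the last chunk to the final row
chunk_idx <- data.frame(beg = beg, nd = nd,
                        let = letters[seq_along(beg)],
                        num = seq_along(beg))

# Map() walks beg and nd in parallel and extracts each row range.
pieces <- Map(function(b, e) map_demo[b:e, ], chunk_idx$beg, chunk_idx$nd)

# Label the pieces with the letter column so they can be looked up by name.
names(pieces) <- as.character(chunk_idx$let)
```

Here `pieces$a` holds rows 1 to 250, `pieces$b` rows 251 to 500, and the last element holds whatever rows remain.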
>>
>> Remembering that loops are not always the best answer in R, I looked at other options such as split(), following this example, but I was not able to adapt it from a vector to a data.frame:
>> http://stackoverflow.com/questions/3318333/split-a-vector-into-chunks-in-r (Yes, I’ve reviewed the language documentation.) I checked out ddply and data.table, but couldn’t find a way to use them with index positions instead of column values.
>>
>> Thanks,
>> Liz
>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA
>