[R] Help processing large data

jim holtman jholtman at gmail.com
Sat Nov 29 02:15:36 CET 2008


Is this what you want:

> x <- read.table(textConnection('"read" "no" "length"
+ 2 2 144
+ 7 7 47490
+ 9 9 310944
+ 11 11 10089
+ 14 14 13152
+ 17 17 27363 '), header=TRUE)
> closeAllConnections()
> result <- lapply(1:nrow(x), function(.indx){
+     # one row per full chunk of 100, plus a final row holding the remainder
+     data.frame(read=paste(x$read[.indx], seq(x$length[.indx] %/% 100 + 1), sep="_"),
+         no=rep(x$no[.indx], x$length[.indx] %/% 100 + 1),
+         length=c(rep(100, x$length[.indx] %/% 100), x$length[.indx] %% 100))
+ })
> result <- do.call(rbind, result)
>
> str(result)
'data.frame':   4094 obs. of  3 variables:
 $ read  : Factor w/ 4094 levels "2_1","2_2","7_1",..: 1 2 3 114 225 336 423 434 445 456 ...
 $ no    : int  2 2 7 7 7 7 7 7 7 7 ...
 $ length: num  100 44 100 100 100 100 100 100 100 100 ...
> head(result)
  read no length
1  2_1  2    100
2  2_2  2     44
3  7_1  7    100
4  7_2  7    100
5  7_3  7    100
6  7_4  7    100
>
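
With 130000 input rows, building one small data frame per row and then rbind-ing them all may be slow. A vectorized variant is sketched below; it is only a sketch under a few assumptions: the lengths are non-negative whole numbers, and the names chunk, n, idx, part and result2 are made up here for illustration. Unlike the lapply() version, it does not emit a trailing row of length 0 when a length is an exact multiple of 100.

```r
# Sample input from the original post (same columns: read, no, length).
x <- data.frame(read   = c(2, 7, 9, 11, 14, 17),
                no     = c(2, 7, 9, 11, 14, 17),
                length = c(144, 47490, 310944, 10089, 13152, 27363))

chunk <- 100                               # chunk size used in the thread
n     <- ceiling(x$length / chunk)         # number of output rows per input row
idx   <- rep(seq_len(nrow(x)), times = n)  # input row behind each output row
part  <- sequence(n)                       # 1..n[i] counter within each input row

# Each chunk is 100 long, except the last one per input row,
# which gets whatever is left after the earlier chunks.
len <- pmin(chunk, x$length[idx] - (part - 1) * chunk)

result2 <- data.frame(read   = paste(x$read[idx], part, sep = "_"),
                      no     = x$no[idx],
                      length = len,
                      stringsAsFactors = FALSE)

head(result2)
```

For the six sample rows this produces the same 4094 rows as above, with read stored as character rather than factor.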


On Thu, Nov 27, 2008 at 5:16 AM, mitras <suparna.mitra at gmail.com> wrote:
>
> Dear all,
> I have a problem handling a large dataset.
> It looks like:
> "read" "no" "length"
> 2 2 144
> 7 7 47490
> 9 9 310944
> 11 11 10089
> 14 14 13152
> 17 17 27363 and so on
> There are 130000 rows
>
> From this table I need to make a table like
> 2_1 2 100
> 2_2 2 44
> 7_1 7 100
> 7_2 7 100
> ...
> ...
> 7_474 7 100
> 7_475 7 90
> 9_1 9 100
> 9_2 9 100 and so on...
>
> In words: I want to split the 3rd column (length) into pieces of 100, adding
> as many rows as needed. The no value stays the same for all rows made from
> one original row, but read is numbered 2_1, 2_2 and so on.
> Please let me know if anyone can help.
> Thanks a lot in advance.
> Best,
> Mitra.
> --
> View this message in context: http://www.nabble.com/Help-processing-large-data-tp20716564p20716564.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


