[R] Faster Printing Alternatives to 'cat'

Petr PIKAL petr.pikal at precheza.cz
Thu Jan 8 14:52:55 CET 2009


Hi

r-help-bounces at r-project.org wrote on 08.01.2009 14:26:32:

> Dear Jim and Henrik,
> 
> > What exactly is the problem you are trying to solve?
> > Is it going to be read by some other program?
> 
> I simply want to print the data out. Surely this data
> will be manipulated (with Excel or other
> programming languages) by other people to suit their purposes.
> 
> Typically the printout from the loop looks like this:
> 
> ATCGATCGATCGGGGGGGGGGGGGGGTTTGCGGG   10   11.992
> CCCCCCCCGGGCCATCGGTCAGGGAATTGACGGAA   2      0.222
> .....
> up to ~16 million lines.

Just curious: can Excel handle 16 million lines?

> 
> > How much physical memory do you have on your machine?
> 6GB
> 
> >  Is there paging occurring due to the size of the objects?
> I don't quite understand what you mean by that.
> Sorry for my lack of knowledge in R.
> 
> >  Have you considered creating a structure with 10,000 of the variables
> > each time through the loop and then writing them out?
> 
> I never thought about that. Can you be specific about how this can be achieved?

Declare an object sized for one part of your data (say 10,000 rows).
Loop over that part.
Fill the object inside the loop.
Write the object to a file, e.g. with write.table and append = TRUE.
Start again with the next part of your data.
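
The steps above can be sketched in R roughly as follows. This is only a
minimal sketch: the foo, bar and qux vectors here are small made-up
stand-ins for your real objects, and out_file and chunk_size are names and
values I have assumed for illustration. The idea is to call write.table once
per block of 10,000 rows instead of calling cat once per row.

```r
# Hypothetical stand-ins for the poster's foo, bar and qux vectors.
n   <- 25000
foo <- rep(c("ATCGATCG", "CCCCGGGC"), length.out = n)
bar <- seq_len(n)
qux <- round(seq_len(n) / 7, 3)

out_file   <- tempfile(fileext = ".txt")  # assumed output path
chunk_size <- 10000                       # rows written per write.table call

for (start in seq(1, n, by = chunk_size)) {
  # Row indices of the current chunk; the last chunk may be shorter.
  idx <- start:min(start + chunk_size - 1, n)

  # Fill one object with this part of the data.
  chunk <- data.frame(foo = foo[idx], bar = bar[idx], qux = qux[idx],
                      stringsAsFactors = FALSE)

  # Write the whole chunk in one call. The first chunk overwrites any
  # existing file; later chunks are appended to it.
  write.table(chunk, file = out_file, append = start > 1, sep = "\t",
              quote = FALSE, row.names = FALSE, col.names = FALSE)
}
```

Only one chunk is held in the extra object at any time, so chunk_size can be
adjusted to whatever your free memory allows.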

Regards
Petr

> 
> - Gundala Viswanath
> Jakarta - Indonesia
> 
> 
> 
> On Thu, Jan 8, 2009 at 10:10 PM, jim holtman <jholtman at gmail.com> wrote:
> > What exactly is the problem you are trying to solve?  What is going to
> > be done with the data?  Is it going to be read by some other program?
> > How much physical memory do you have on your machine?  Is there paging
> > occurring due to the size of the objects?  Have you considered creating a
> > structure with 10,000 of the variables each time through the loop and
> > then writing them out?  A lot will depend on how much free memory you
> > have.  I will also ask one of my favorite questions; "tell me what you
> > want to do, not how you want to do it".
> >
> > On Thu, Jan 8, 2009 at 6:12 AM, Gundala Viswanath <gundalav at gmail.com> wrote:
> >> Dear all,
> >>
> >> I found that printing with 'cat' is very slow.
> >>
> >> For example, on my machine this snippet
> >>
> >> __BEGIN__
> >>
> >> # I need to resort to this type of loop
> >> # because using write() requires creating a matrix, which
> >> # consumes so much memory. Note that the "foo, bar, qux" objects
> >> # are already very large (>2 GB)
> >>
> >> for ( s in 1:length(x) ) {
> >>    cat(as.character(foo[s]),"\t",bar[s],"\t", qux[s],"\n")
> >> }
> >> __END__
> >>
> >> for "x" of size ~1.5 million, takes more than 10 hours to print
> >> on my Linux machine with a 1994 MHz AMD processor.
> >>
> >> Are there any faster alternatives to "cat"?
> >>
> >>
> >> - Gundala Viswanath
> >> Jakarta - Indonesia
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >
> >
> > --
> > Jim Holtman
> > Cincinnati, OH
> > +1 513 646 9390
> >
> > What is the problem that you are trying to solve?
> >
> 
