[R] Question Regarding 'pipe'

Jan T. Kim jtk at cmp.uea.ac.uk
Tue Apr 8 16:25:28 CEST 2008


On Tue, Apr 08, 2008 at 06:52:33AM -0500, born.to.b.wyld at gmail.com wrote:
> The reason I need to do this in awk and not R:
> 
> Let's say 'x' is the tabular representation of a sparse contingency table
> > x
>    x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 Freq
> 1    0   0   0   2   2   2   0   2   0   2   54
> 5    1   0   0   2   2   2   0   2   1   2    1
> 6    1   0   1   0   2   2   0   2   0   2  137
> 7    1   0   1   1   2   1   0   1   0   1   14
> 10   1   0   1   1   2   2   0   2   1   2  110
> 
> 
> I need to generate that dataset from which such a contingency table could be
> made. That dataset would have sum(x$Freq) = 316 rows. Note that this is only
> a sample example. Real examples that I could be dealing with would have up
> to 50 variables and tens of thousands of cases. I can definitely make a
> dataset in R itself by replicating the first row 54 times, the third row 137
> times and so on (for a real example, each replication could be 1000
> times), but I bet this is about as inefficient a way of doing things as
> there could be.
> 
> However, if I write 'x' to a file and then call the awk script above, I get my
> dataset within a fraction of a second (including the time to write and read
> files on the OS).
> 
> Any suggestions now on how to get the 'pipe' command to work?

Escape the quotes that you want to include in the string by preceding them with
a backslash. Otherwise R will interpret them as terminating or starting a
string, just as awk would with unescaped double quotes.
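
A minimal sketch of the escaping, using the awk command from the earlier
message with its spacing tidied. It only builds and prints the command string;
the final read.table() call is left commented out because it assumes that awk
is on the PATH and that temp.txt exists in the layout the awk script expects.

```r
# Single-quoted R string: the inner single quotes are escaped with a
# backslash; the double quotes inside need no escaping.
cmd <- 'awk \'{ n = $1; sub(".*" $1 " ", ""); while (n--) print }\' temp.txt'

# Equivalent double-quoted R string: here the inner double quotes are
# escaped instead, and the single quotes are left alone.
cmd2 <- "awk '{ n = $1; sub(\".*\" $1 \" \", \"\"); while (n--) print }' temp.txt"

# Both spellings produce the same command line for the shell.
stopifnot(identical(cmd, cmd2))
cat(cmd, "\n")

# Assumption: awk is installed and temp.txt is in the working directory.
# y <- read.table(pipe(cmd))
```

Printing the string with cat() before handing it to pipe() is a quick way to
check that the shell will see the quotes you intended.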

I'm still not convinced, though, that running that while loop using awk has
any advantages over programming it in R, but it's your choice...
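
For comparison, one possible way to do the expansion in R itself is to index
the rows with rep() rather than looping. This is only a sketch using the
example table quoted above; the names idx and d are made up for illustration,
and I have not timed it against the awk version on tables of the size
described.

```r
# The example table from the quoted message; the first column of each data
# row is taken as the row name because the header has one field fewer.
x <- read.table(header = TRUE, text = "
   x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 Freq
1    0   0   0   2   2   2   0   2   0   2   54
5    1   0   0   2   2   2   0   2   1   2    1
6    1   0   1   0   2   2   0   2   0   2  137
7    1   0   1   1   2   1   0   1   0   1   14
10   1   0   1   1   2   2   0   2   1   2  110
")

# Repeat each row index Freq times, then pick those rows in one step.
idx <- rep(seq_len(nrow(x)), times = x$Freq)
d <- x[idx, names(x) != "Freq"]
rownames(d) <- NULL

nrow(d)  # 316, i.e. sum(x$Freq)
```

The indexing is vectorised, so no explicit R-level loop over the rows is
involved.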

Best regards, Jan


> On Tue, Apr 8, 2008 at 4:45 AM, Jan T. Kim <jtk at cmp.uea.ac.uk> wrote:
> 
> > On Mon, Apr 07, 2008 at 07:42:50PM -0500, born.to.b.wyld at gmail.com wrote:
> > > Can anyone point out why this is not working?
> > >
> > > y<-read.table(pipe('  awk '{ n = $1; sub( ".*" $1 " " ,"") ; while ( n--
> > )
> >                            ^
> > > print }'  temp.txt '))
> >          ^
> >
> > one problem is that your single-quoted R string contains single quotes
> > which I've pointed to with "^" above. You probably intend these single
> > quotes to be part of the awk command line, but R can\'t know that unless
> > you escape them...  ;-)
> >
> > In the future, can you please describe explicitly how it "is not working",
> > and also give a bit more context, such as a few lines of description of
> > the content of temp.txt, and why you're trying to use awk (rather than R
> > itself) to achieve whatever you're trying to achieve?
> >
> > Best regards, Jan
> > --
> >  +- Jan T. Kim -------------------------------------------------------+
> >  |             email: jtk at cmp.uea.ac.uk                               |
> >  |             WWW:   http://www.cmp.uea.ac.uk/people/jtk             |
> >  *-----=<  hierarchical systems are for files, not for humans  >=-----*
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 

-- 
 +- Jan T. Kim -------------------------------------------------------+
 |             email: jtk at cmp.uea.ac.uk                               |
 |             WWW:   http://www.cmp.uea.ac.uk/people/jtk             |
 *-----=<  hierarchical systems are for files, not for humans  >=-----*


