[R] read data into R with some constraints (Just out of curiosity)

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Jan 11 22:13:32 CET 2001


On Thu, 11 Jan 2001, Yves Gauvreau wrote:

> Hi Brian
> 
> I'm not sure about this but could this kind of selective records reading be
> done (at least under Windoze) using RODBC since there is a driver for ASCII
> file sources?

Yes, it could.

> Assuming the answer is Yes. I would then ask if RODBC could also be used to
> do the same for a file residing on a Linux (Ext2fs) drive or any other file
> system for that matter?

Don't think so.

Had the data set been 100x larger, I would have suggested RODBC.
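
An aside on the original question, quoted below: the poster asks for rows with positive values in all four columns, yet lists only records 3, 5 and 6 (the ones containing -1) as unwanted. The two readings differ on rows that contain a zero. Here is a minimal sketch on the six sample rows, assuming the data are inlined through textConnection() rather than read from data.csv, and using rowSums() in place of the matrix product. The names txt, keep_positive and keep_nonneg are illustrative only.

```r
# The six sample rows from the original question, inlined so the sketch
# runs without a data.csv on disk (an assumption made for illustration).
txt <- c("100, 20, 46, 70",
         "103,  0, 22, 45",
         "117, -1, 34, 65",
         "120, 15,  0, 25",
         "113,  0,  -1, 32",
         "142, -1, -1, 55")

# Read all numbers, then shape them into a 4-column matrix, one record per row.
con <- textConnection(txt)
rawdata <- matrix(scan(con, sep = ",", quiet = TRUE), ncol = 4, byrow = TRUE)
close(con)

# Strictly positive in every column: rows containing a zero are rejected too.
keep_positive <- rowSums(rawdata <= 0) == 0

# Non-negative in every column: only rows containing a negative value are rejected.
keep_nonneg <- rowSums(rawdata < 0) == 0

rawdata[keep_positive, , drop = FALSE]
rawdata[keep_nonneg, , drop = FALSE]
```

On these six rows the strict test keeps only record 1, while the non-negative test keeps records 1, 2 and 4, which matches the records the poster said they wanted.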

> 
> (I think not because RODBC is just an interface that calls upon the existing
> drivers on the system.)
> 
> 
> Regards
> 
> Yves Gauvreau
> B.E.F.P. Universite du Quebec a Montreal
> cyg at sympatico.ca
> 
> > -----Original Message-----
> > From: owner-r-help at stat.math.ethz.ch
> > [mailto:owner-r-help at stat.math.ethz.ch] On Behalf Of Prof Brian D Ripley
> > Sent: Thursday, January 11, 2001 1:45 PM
> > To: Yu-Ling Wu
> > Cc: R-help at stat.math.ethz.ch
> > Subject: Re: [R] read data into R with some constraints
> >
> >
> > On Thu, 11 Jan 2001, Yu-Ling Wu wrote:
> >
> > > Hi,
> > >
> > > I have a big data file (over 30,000 records) that looks
> > > like this:
> > >
> > > 100, 20, 46, 70
> > > 103,  0, 22, 45
> > > 117, -1, 34, 65
> > > 120, 15,  0, 25
> > > 113,  0,  -1, 32
> > > 142, -1, -1, 55
> > > .....
> > >
> > > I want to read only those records having positive values in all of the
> > > four columns. That is, I don't want to read record # 3, 5, and 6 into R.
> > > However, when I type:
> > >
> > > read.csv("data.csv", sep=",")  -> rawdata
> >
> > Um, read.csv already uses sep="," by default, and you need header=FALSE.
> >
> > > it reads the whole thing into R including those records I don't want.
> > > Could anyone tell me how I can read only those records I want?
> >
> > You can't!  Until you have read the record, you cannot tell if all the
> > entries are positive.
> >
> > Is this really a problem?  You only have around 120k numbers, and I just
> > did it very easily.
> >
> > rawdata <- read.csv("data.csv", header=FALSE)
> >
> > Perhaps better is to use a matrix and scan():
> >
> > rawdata <- matrix(scan("data.csv", sep=","), ncol=4, byrow=TRUE)
> > # count the entries <= 0 in each row; keep rows where that count is zero
> > keep <- (rawdata <= 0) %*% rep(1,4) == 0
> > rawdata[keep, ]
> >
> > Takes a few seconds and a few Mb.
> >
> > --
> > Brian D. Ripley,                  ripley at stats.ox.ac.uk
> > Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> > University of Oxford,             Tel:  +44 1865 272861 (self)
> > 1 South Parks Road,                     +44 1865 272860 (secr)
> > Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> >

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._


