[R] read data into R with some constraints (Just out of curiosity)

Yves Gauvreau cyg at sympatico.ca
Thu Jan 11 22:16:06 CET 2001


Hi Brian

I'm not sure about this, but could this kind of selective record reading be
done (at least under Windoze) using RODBC, since there is an ODBC driver for
ASCII file sources?

Assuming the answer is yes, I would then ask whether RODBC could also be used
to do the same for a file residing on a Linux (ext2fs) drive, or on any other
file system for that matter.

(I think not, because RODBC is just an interface that calls upon the existing
drivers on the system.)


Regards

Yves Gauvreau
B.E.F.P. Universite du Quebec a Montreal
cyg at sympatico.ca

> -----Original Message-----
> From: owner-r-help at stat.math.ethz.ch
> [mailto:owner-r-help at stat.math.ethz.ch] On Behalf Of Prof Brian D Ripley
> Sent: Thursday, January 11, 2001 1:45 PM
> To: Yu-Ling Wu
> Cc: R-help at stat.math.ethz.ch
> Subject: Re: [R] read data into R with some constraints
>
>
> On Thu, 11 Jan 2001, Yu-Ling Wu wrote:
>
> > Hi,
> >
> > I have a big data file (over 30,000 records) that looks
> > like this:
> >
> > 100, 20, 46, 70
> > 103,  0, 22, 45
> > 117, -1, 34, 65
> > 120, 15,  0, 25
> > 113,  0,  -1, 32
> > 142, -1, -1, 55
> > .....
> >
> > I want to read only those records having positive
> > values in all of the  four
> > columns. That is, I don't want to read record # 3, 5,
> > and 6 into R. However,
> > when I type:
> >
> > read.csv("data.csv", sep=",")  -> rawdata
>
> Um, read.csv already uses sep="," by default, and you need header=FALSE.
>
> > it reads the whole thing into R including those
> > records I don't want.
> > Could anyone tell me how I can read only those records
> > I want?
>
> You can't!  Until you have read the record, you cannot tell if all the
> entries are positive.
>
> Is this really a problem?  You only have around 120k numbers, and I just
> did it very easily.
>
> rawdata <- read.csv("data.csv", header=FALSE)
>
> Perhaps better is to use a matrix and scan():
>
> rawdata <- matrix(scan("data.csv", sep=","), , 4, byrow=TRUE)
> keep <- (rawdata <= 0) %*% rep(1,4) == 0
> rawdata[keep, ]
>
> Takes a few seconds and a few Mb.
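
A minimal sketch of the read-then-filter approach quoted above, offered only
as an illustration. It assumes a reasonably recent R in which scan() accepts
a text= argument, uses the six sample records from the question in place of
data.csv, and swaps in rowSums() for the matrix product; the names `lines`
and `positive` are made up for the example.

```r
# Sample records from the original question, held in memory so the sketch
# runs without a data.csv file on disk (an assumption for illustration).
lines <- c("100, 20, 46, 70",
           "103,  0, 22, 45",
           "117, -1, 34, 65",
           "120, 15,  0, 25",
           "113,  0,  -1, 32",
           "142, -1, -1, 55")

# Read everything first; the filter can only be applied afterwards.
rawdata <- matrix(scan(text = lines, sep = ",", quiet = TRUE),
                  ncol = 4, byrow = TRUE)

# TRUE for rows in which no entry is zero or negative.
keep <- rowSums(rawdata <= 0) == 0

# drop = FALSE keeps the result a matrix even if only one row survives.
positive <- rawdata[keep, , drop = FALSE]
```

With a real file, scan("data.csv", sep = ",") would replace the text=
argument. Note that the test is strict, so rows containing a zero are dropped
along with rows containing a negative value.
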
>
> --
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272860 (secr)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._



