[R] trouble with read.table and colClasses='raw'

Peter Ehlers ehlers at ucalgary.ca
Thu Feb 11 21:19:51 CET 2010


Johan,

My apologies if you took my comments to be sarcastic; they were
certainly not meant to be. I have no desire to put you or anyone
down.

I see now that you want to somehow store data more 'efficiently',
presumably in order to be able to handle larger objects in RAM.

I doubt that storage mode 'raw' will help. Your post implied that
you had saved an object and couldn't read it back into the same
format in which you think it was saved. So, did you have a 16 Gb
object to save? And why wouldn't you use save()? It's just a
guess, but I think you may have a file of _character_ data that
you want to read into R where its storage mode should be 'raw'.
I don't know how to do that.

If the main purpose is to circumvent R's memory requirements,
then there have been plenty of posts on that issue.

  -Peter Ehlers

Johan Jackson wrote:
> "I suspect that you really don't know what 'raw' type means and haven't
> bothered to check ?raw. It's also pretty clear that you haven't read the
> colClasses description in ?read.table very carefully."
> 
> Gee, thanks Peter (this is what I love about the R help boards: people whose
> sole goal is to put others down as wittily as possible for asking *stupid
> stupid* questions). Gives me warm fuzzies :)
> 
> Although I admit to not being the brightest of folks around, or knowing R
> backwards and forwards, I did read ?read.table and ?raw. But your suggestion
> is not at all helpful Peter:
> 
> dat <- read.table(file="data", header=TRUE, colClasses="character")
> # wow! it works on a 5x3 matrix! amazing!! (sarcasm)
> 
> dat2 <- as.matrix(dat)
> storage.mode(dat2) <- 'raw'
> 
> if I had wanted 'character' data, I would have put that into my question.
> Any newbie can do what you did; the issue is that object.size(dat) is about
> 8 times larger than object.size(dat2) with any large dataset. That's why I
> want to store it as 'raw' - because the raw one takes about 2 Gb RAM and the
> other about 16Gb! Perhaps you need to understand the raw mode a bit better,
> Peter, because I thought the reason for wanting the data in 'raw' was quite
> obvious, but I guess not.
> 
> Peter, here's what I want you to do. Use R to make a vector with 2^31 - 5
> elements in it. Hey, make it of mode 'character' while you're at it! Write
> it out. Read it back in. Having problems? Then come talk to me...
> 
> JJ
> 
[....]

-- 
Peter Ehlers
University of Calgary


