[R] What is the fastest way to see what are in an RData file?
Henrik Bengtsson
hb at stat.berkeley.edu
Sat Dec 19 10:39:54 CET 2009
library("R.oo");     # setMethodS3(), ll(), throw()
library("R.utils");  # Arguments, loadToEnv()

setMethodS3("ll", "character", function(pathname, ..., force=FALSE) {
  require("R.cache") || throw("Package not loaded: R.cache");

  # Argument 'pathname':
  pathname <- Arguments$getReadablePathname(pathname, mustExist=TRUE);

  # Check for cached results
  fi <- file.info(pathname);
  rownames(fi) <- NULL;
  key <- list(fileInfo=fi, checksum=digest::digest(pathname, file=TRUE));
  res <- loadCache(key=key);
  if (!force && !is.null(res)) {
    return(res);
  }

  # Load the actual data into a separate environment
  env <- loadToEnv(pathname);

  # List its content
  res <- ll(..., envir=env);

  # Clean up
  rm(env);

  # Save to cache
  saveCache(res, key=key);

  res;
})

example(iris);
save.image("foo.RData");
ll("foo.RData");
        member data.class dimension objectSize
1         dni3       list         3        392
2           ii data.frame  c(150,5)       6424
3 ll.character   function      NULL       6948
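
If the R.oo/R.utils/R.cache packages are not available, the same idea (load into a separate environment, then tabulate what is there) can be sketched in base R. This is only a minimal sketch under stated assumptions: the function name summarize_rdata() and its column names are made up for illustration, there is no caching, and load() still has to read the whole file.

```r
# Hypothetical helper (not part of R.oo, R.utils or R.cache): tabulate the
# objects stored in an .RData file without touching the global environment.
summarize_rdata <- function(pathname) {
  stopifnot(file.exists(pathname))

  # load() reads the entire file; it returns the names of the restored objects
  env <- new.env(parent = emptyenv())
  members <- load(pathname, envir = env)

  # Describe one object by name: first class, dimensions (or length), size
  describe <- function(name) {
    obj <- get(name, envir = env)
    d <- dim(obj)
    data.frame(
      member    = name,
      class     = class(obj)[1],
      dimension = if (is.null(d)) as.character(length(obj)) else paste(d, collapse = "x"),
      bytes     = as.numeric(object.size(obj)),
      stringsAsFactors = FALSE
    )
  }

  res <- do.call(rbind, lapply(members, describe))
  if (is.null(res)) {
    # Empty file: return a zero-row table with the same columns
    res <- data.frame(member = character(0), class = character(0),
                      dimension = character(0), bytes = numeric(0),
                      stringsAsFactors = FALSE)
  }
  rownames(res) <- NULL
  res
}
```

Calling summarize_rdata("foo.RData") on the file saved above should give a table similar in spirit to the ll() output, although the reported sizes will differ between R versions.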
/H
On Sat, Dec 19, 2009 at 12:14 AM, Patrick Connolly
<p_connolly at slingshot.co.nz> wrote:
> On Sat, 19-Dec-2009 at 09:24AM +1800, Peng Yu wrote:
>
> |> On Sat, Dec 19, 2009 at 3:35 AM, Patrick Connolly
> |> <p_connolly at slingshot.co.nz> wrote:
> |> > On Thu, 17-Dec-2009 at 03:13PM +1800, Peng Yu wrote:
> |> >
> |> > |> Currently, I load the RData file and then call ls() and str(). But loading the
> |> > |> file takes too long if the file is big. Most of the time, I am only interested
> |> > |> in what variables are in the file and in their attributes (e.g. whether each
> |> > |> is a data.frame or a matrix, what the colnames/rownames are, etc.)
> |> > |>
> |> > |> I'm wondering if there is any facility in R to help me avoid loading the
> |> > |> whole file.
> |> >
> |> >
> |> > I have a pretty nifty way of seeing what's in such a file, but I still
> |> > have to load all of the binary file before I can do so. If it's
> |> > taking you such a long time, maybe you could keep a larger number of
> |> > smaller RData files.
> |>
> |> What is your 'nifty way'?
>
> Well, it's nifty to my way of thinking. It's not particularly nifty
> in how I wrote it, but what it achieves is pretty nifty. I know
> thousands would disagree, but I think this is nifty:
>
>          Object      Mode Rows Cols Len       Date
> 1     fix.bill  function   --   --   1 22/12/2008
> 2    fix.bill2  function   --   --   1 22/12/2008
> 3           aa dataframe    9    1   1 21/12/2008
> 4           bb dataframe    9    6   6 21/12/2008
> 5   bill.lines dataframe  759    1   1 21/12/2008
> 6    aftertax5  function   --   --   1 31/08/2008
> 7    aftertax6  function   --   --   1 31/08/2008
> 8    cont.cgt6  function   --   --   1 31/08/2008
> 9  aftertax5BW  function   --   --   1 29/08/2008
> 10    cont.cgt  function   --   --   1 29/08/2008
> 11     summ.df dataframe    2    5   5 29/08/2008
>
>
> Part of the reason why it's not that well-written is because it was
> first written for SPlus which works rather differently from R in
> important respects, and at the time I knew quite a lot less than I
> know now. It's not exactly in publishable form since it uses a number
> of my local (inelegant) functions and it would not work on Windows
> without some fundamental rewriting. I did the minimum to port it from
> SPlus and gave no thought whatever to having it work on Windows.
>
> On two occasions I offered to supply my code to anyone who wanted
> something to start with to make something publishable, but hardly
> anybody was impressed with what I thought was nifty. (I never made it
> as a singer-songwriter either.) :-)
>
> |> How fast is it?
>
> Practically no time (milliseconds) to run the function, but the binary
> file must be fully loaded, and that's what you're trying to avoid, so
> it wouldn't fit your purpose.
>
>
> --
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
> ___ Patrick Connolly
> {~._.~} Great minds discuss ideas
> _( Y )_ Average minds discuss events
> (:_~*~_:) Small minds discuss people
> (_)-(_) ..... Eleanor Roosevelt
>
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.