[Rd] Objects created by more than one data call?
William Dunlap
wdunlap at tibco.com
Wed May 22 23:12:32 CEST 2013
I used svn to copy the current version of Ecdat from Rforge to my PC
C:\temp\packages>svn checkout svn://r-forge.r-project.org/svnroot/ecdat/
then fired up R to look at the rda files in it.
> setwd("c:/temp/packages/Ecdat/ecdat/pkg/data")
> read.dcf("../DESCRIPTION")[, c("Package","Version")]
Package Version
"Ecdat" "0.2-3"
> dir.rda <- function(rdaFile) { e <- new.env() ; load(rdaFile, envir=e) ; objects(e, all=TRUE)}
> dir.rda("VietNamH.rda")
[1] "MedExp"
> rdas <- dir(pattern="\\.rda$")
> names(rdas) <- rdas
> z <- lapply(rdas, dir.rda)
> tab <- table(unlist(z))
> tab[tab>1]
Hstarts MedExp
3 2
> z[sapply(z, function(zi)"Hstarts" %in% zi)]
$Hstarts.rda
[1] "Hstarts"
$Intratesm.rda
[1] "Hstarts"
$Intratesq.rda
[1] "Hstarts"
> z[sapply(z, function(zi)"MedExp" %in% zi)]
$MedExp.rda
[1] "MedExp"
$VietNamH.rda
[1] "MedExp"
It looks some files don't contain what their names suggest:
> dir.rda("VietNamH.rda")
[1] "MedExp"
The two versions of MedExp are quite different:
> load("VietNamH.rda", envViet <- new.env(parent=emptyenv()))
> load("MedExp.rda", envMed <- new.env(parent=emptyenv()))
> objects(envViet)
[1] "MedExp"
> objects(envMed)
[1] "MedExp"
> all.equal(envViet$MedExp, envMed$MedExp)
[1] "Names: 11 string mismatches"
[2] "Length mismatch: comparison on first 11 components"
[3] "Component 1: 'current' is not a factor"
...
[18] "Component 10: Numeric: lengths (5999, 5574) differ"
[19] "Component 11: 'current' is not a factor"
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: Spencer Graves [mailto:spencer.graves at prodsyse.com]
> Sent: Wednesday, May 22, 2013 1:27 PM
> To: William Dunlap
> Cc: r-devel at r-project.org
> Subject: Re: [Rd] Objects created by more than one data call?
>
> On 5/21/2013 3:03 PM, William Dunlap wrote:
> > If you look at
> > data(package="Ecat")$results[,"Item"]
> > you will see the items "Hstarts", "Hstarts (Intratesm)", and "Hstarts (Intratesq)"
> > which I think means that the dataset Hstarts is found in 3 .rda files, "Hstarts.rda",
> > "Intratesq.rda", and "Intratesm.rda". There are duplicate, modulo (filename),
> > items for "MedExp" as well.
>
>
> Thanks for this. I may get me closer, but I still don't see it:
> (data(Intratesm)) imports only the object Intratesm, etc. For more
> details, see below.
>
>
> Any other suggestions?
>
>
> Thanks again,
> Spencer
>
>
> > Ecdat.data <- data(package="Ecdat")$results
> > (Hstarts2 <- grep('Hstarts', Ecdat.data[, 'Item']))
> [1] 47 48 49
> > (MedExp2 <- grep('MedExp', Ecdat.data[, 'Item']))
> [1] 67 68
> > Ecdat.data[Hstarts2, ]
> Package LibPath Item
> [1,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "Hstarts"
> [2,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "Hstarts (Intratesm)"
> [3,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "Hstarts (Intratesq)"
> Title
> [1,] "Housing Starts"
> [2,] "Housing Starts"
> [3,] "Housing Starts"
> > Ecdat.data[MedExp2,]
> Package LibPath Item
> [1,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "MedExp"
> [2,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "MedExp (VietNamH)"
> Title
> [1,] "Structure of Demand for Medical Care"
> [2,] "Structure of Demand for Medical Care"
> > library(Ecdat)
> > (data(Intratesm))
> [1] "Intratesm"
> > (data(Intratesq))
> [1] "Intratesq"
>
>
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com
> >
> >
> >> -----Original Message-----
> >> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On
> Behalf
> >> Of Spencer Graves
> >> Sent: Tuesday, May 21, 2013 12:21 PM
> >> To: Prof Brian Ripley
> >> Cc: r-devel at r-project.org
> >> Subject: Re: [Rd] Objects created by more than one data call?
> >>
> >> On 5/21/2013 9:03 AM, Prof Brian Ripley wrote:
> >>> On 21/05/2013 16:51, Spencer Graves wrote:
> >>>> On 5/21/2013 7:47 AM, Prof Brian Ripley wrote:
> >>>>> On 21/05/2013 15:28, Spencer Graves wrote:
> >>>>>> On 5/20/2013 10:10 PM, Prof Brian Ripley wrote:
> >>>>>>> On 21/05/2013 00:12, Spencer Graves wrote:
> >>>>>>>> Hello, All:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> If I use LazyData with the Ecdat package on R-Forge, "R CMD
> >>>>>>>> check" reports "no visible binding for global variable
> >>>>>>>> 'nonEnglishNames'", where 'nonEnglishNames' is a dataset in Ecdat
> >>>>>>>> used
> >>>>>>>> as the default argument for a function. With LazyData, that NOTE
> >>>>>>>> disappears. However, then I get, "Warning: objects 'Hstarts',
> >>>>>>>> 'Hstarts', 'MedExp' are created by more than one data call".
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> What do you suggest I do to fix this problem?
> >>>>>>> Not create the objects in more than one data() call.
> >>>>>>>
> >>>>>>> Check what each of your data() calls produces.
> >>>>>>
> >>>>>> Thanks. How do I do that?
> >>>>> Call data() on each in turn, and see what files get added to an empty
> >>>>> workspace.
> >>>>
> >>>> Like the following?
> >>> You missed the 'empty'. Look at tools:::data2LazyLoadDB to see how
> >>> this is checked.
> >>
> >> Thanks for the suggestion. Unfortunately, I tried that function,
> >> including stepping through it line by line, fixing references to other
> >> functions not exported from tools, without enlightenment; see below.
> >>
> >>
> >> Thanks again,
> >> Spencer
> >>
> >>
> >> > lib.loc = NULL
> >> > package='Ecdat'
> >> > pkgpath <- find.package(package, lib.loc, quiet = TRUE)
> >> > pkgpath
> >> [1] "C:/Users/sgraves/pgms/R/R-3.0.0/library/Ecdat"
> >> > dataDir <- file.path(pkgpath, "data")
> >> > dataDir
> >> [1] "C:/Users/sgraves/pgms/R/R-3.0.0/library/Ecdat/data"
> >> > enc <- tools:::.read_description(file.path(pkgpath,
> >> "DESCRIPTION"))["Encoding"]
> >> > enc
> >> <NA>
> >> NA
> >> > if (!is.na(enc)) {
> >> + op <- options(encoding = enc)
> >> + on.exit(options(encoding = op[[1L]]))
> >> + }
> >> > file_test("-d", dataDir)
> >> [1] TRUE
> >> > file.path(dataDir, "Rdata.rds")
> >> [1] "C:/Users/sgraves/pgms/R/R-3.0.0/library/Ecdat/data/Rdata.rds"
> >> > (file.exists(file.path(dataDir, "Rdata.rds")) &&
> >> file.exists(file.path(dataDir,
> >> + paste(package, "rdx", sep = "."))) &&
> >> file.exists(file.path(dataDir,
> >> + paste(package, "rdb", sep = "."))))
> >> [1] FALSE
> >> > file.exists(file.path(dataDir,
> >> + paste(package, "rdx", sep = ".")))
> >> [1] FALSE
> >> > file.path(dataDir,
> >> + paste(package, "rdx", sep = "."))
> >> [1] "C:/Users/sgraves/pgms/R/R-3.0.0/library/Ecdat/data/Ecdat.rdx"
> >> > dataEnv <- new.env(hash = TRUE)
> >> > tmpEnv <- new.env()
> >> > f0 <- files <- list_files_with_type(dataDir, "data")
> >> Error: could not find function "list_files_with_type"
> >> > f0 <- files <- tools:::list_files_with_type(dataDir, "data")
> >> > files <- unique(basename(file_path_sans_ext(files,
> >> + TRUE)))
> >> Error in basename(file_path_sans_ext(files, TRUE)) :
> >> could not find function "file_path_sans_ext"
> >> > files <- unique(basename(tools:::file_path_sans_ext(files,
> >> + TRUE)))
> >> > dlist <- vector("list", length(files))
> >> > files
> >> character(0)
> >> > names(dlist) <- files
> >> > loaded <- character(0L)
> >> > loaded
> >> character(0)
> >> > for (f in files) {
> >> + utils::data(list = f, package = package, lib.loc =
> >> lib.loc,
> >> + envir = dataEnv)
> >> + utils::data(list = f, package = package, lib.loc =
> >> lib.loc,
> >> + envir = tmpEnv)
> >> + tmp <- ls(envir = tmpEnv, all.names = TRUE)
> >> + rm(list = tmp, envir = tmpEnv)
> >> + dlist[[f]] <- tmp
> >> + loaded <- c(loaded, tmp)
> >> + }
> >> > dup <- duplicated(loaded)
> >> > dup
> >> logical(0)
> >> > if (any(dup))
> >> + warning(sprintf(ngettext(sum(dup), "object %s is
> >> created by more than one data call",
> >> + "objects %s are created by more than one data call"),
> >> + paste(sQuote(loaded[dup]), collapse = ", ")),
> >> + call. = FALSE, domain = NA)
> >> > if (length(loaded)) {
> >> + dbbase <- file.path(dataDir, "Rdata")
> >> + makeLazyLoadDB(dataEnv, dbbase, compress = compress)
> >> + saveRDS(dlist, file.path(dataDir, "Rdata.rds"),
> >> + compress = compress)
> >> + unlink(f0)
> >> + if (file.exists(file.path(dataDir, "filelist")))
> >> + unlink(file.path(dataDir, c("filelist", "Rdata.zip")))
> >> + }
> >> >
> >>
> >>>>
> >>>> > library(Ecdat)
> >>>> > objects()
> >>>> character(0)
> >>>> > (data(Hstarts))
> >>>> [1] "Hstarts"
> >>>> > (data(MedExp))
> >>>> [1] "MedExp"
> >>>> > objects()
> >>>> [1] "Hstarts" "MedExp"
> >>>> > sessionInfo()
> >>>> R version 3.0.0 (2013-04-03)
> >>>> Platform: i386-w64-mingw32/i386 (32-bit)
> >>>>
> >>>> locale:
> >>>> [1] LC_COLLATE=English_United States.1252
> >>>> [2] LC_CTYPE=English_United States.1252
> >>>> [3] LC_MONETARY=English_United States.1252
> >>>> [4] LC_NUMERIC=C
> >>>> [5] LC_TIME=English_United States.1252
> >>>>
> >>>> attached base packages:
> >>>> [1] stats graphics grDevices utils datasets methods base
> >>>>
> >>>> other attached packages:
> >>>> [1] Ecdat_0.2-3
> >>>>
> >>>> loaded via a namespace (and not attached):
> >>>> [1] tools_3.0.0
> >>>>
> >>>>
> >>>> Thanks,
> >>>> Spencer
> >>>>
> >>>>>> In the "man" directory, I just did "grep 'data(MedExp' *.Rd",
> >>>>>> which identified only "MedExp.Rd:\usage{data(MedExp)}"; "grep
> >>>>>> 'data(Hstarts *.Rd" similarly returned only
> >>>>>> "Hstarts.Rd:\usage(data(Hstarts)}".
> >>>>>>
> >>>>>>
> >>>>>> Thanks again for the reply.
> >>>>>> Spencer
> >>>>>>>> Thanks,
> >>>>>>>> Spencer Graves
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> > sessionInfo()
> >>>>>>>> R version 3.0.0 (2013-04-03)
> >>>>>>>> Platform: i386-w64-mingw32/i386 (32-bit)
> >>>>>>>>
> >>>>>>>> locale:
> >>>>>>>> [1] LC_COLLATE=English_United States.1252
> >>>>>>>> [2] LC_CTYPE=English_United States.1252
> >>>>>>>> [3] LC_MONETARY=English_United States.1252
> >>>>>>>> [4] LC_NUMERIC=C
> >>>>>>>> [5] LC_TIME=English_United States.1252
> >>>>>>>>
> >>>>>>>> attached base packages:
> >>>>>>>> [1] stats graphics grDevices utils datasets methods base
> >>>>>>>>
> >>>>>>>> other attached packages:
> >>>>>>>> [1] Ecdat_0.2-3
> >>>>>>>>
> >>>>>>>> loaded via a namespace (and not attached):
> >>>>>>>> [1] tools_3.0.0
> >>>>>>>>
> >>>>>>>> ______________________________________________
> >>>>>>>> R-devel at r-project.org mailing list
> >>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
More information about the R-devel
mailing list