[Rd] Objects created by more than one data call?
Spencer Graves
spencer.graves at prodsyse.com
Wed May 22 23:40:50 CEST 2013
Dear Bill:
On 5/22/2013 2:12 PM, William Dunlap wrote:
> I used svn to copy the current version of Ecdat from Rforge to my PC
> C:\temp\packages>svn checkout svn://r-forge.r-project.org/svnroot/ecdat/
> then fired up R to look at the rda files in it.
>
>> setwd("c:/temp/packages/Ecdat/ecdat/pkg/data")
>> read.dcf("../DESCRIPTION")[, c("Package","Version")]
> Package Version
> "Ecdat" "0.2-3"
>> dir.rda <- function(rdaFile) { e <- new.env() ; load(rdaFile, envir=e) ; objects(e, all=TRUE)}
>> dir.rda("VietNamH.rda")
> [1] "MedExp"
>> rdas <- dir(pattern="\\.rda$")
>> names(rdas) <- rdas
>> z <- lapply(rdas, dir.rda)
>> tab <- table(unlist(z))
>> tab[tab>1]
> Hstarts MedExp
> 3 2
>> z[sapply(z, function(zi)"Hstarts" %in% zi)]
> $Hstarts.rda
> [1] "Hstarts"
>
> $Intratesm.rda
> [1] "Hstarts"
>
> $Intratesq.rda
> [1] "Hstarts"
>
>> z[sapply(z, function(zi)"MedExp" %in% zi)]
> $MedExp.rda
> [1] "MedExp"
>
> $VietNamH.rda
> [1] "MedExp"
>
> It looks like some files don't contain what their names suggest:
>> dir.rda("VietNamH.rda")
> [1] "MedExp"
>
> The two versions of MedExp are quite different:
>> load("VietNamH.rda", envViet <- new.env(parent=emptyenv()))
>> load("MedExp.rda", envMed <- new.env(parent=emptyenv()))
>> objects(envViet)
> [1] "MedExp"
>> objects(envMed)
> [1] "MedExp"
>> all.equal(envViet$MedExp, envMed$MedExp)
> [1] "Names: 11 string mismatches"
> [2] "Length mismatch: comparison on first 11 components"
> [3] "Component 1: 'current' is not a factor"
> ...
> [18] "Component 10: Numeric: lengths (5999, 5574) differ"
> [19] "Component 11: 'current' is not a factor"
Thanks very much. Now I understand the problem. For a solution,
I need to consult with the package author (Yves Croissant; I'm only the
maintainer).
However, your work at least makes it easy for me to describe the
problem to him.
Thanks again.
Spencer
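
For the record, one possible repair can be sketched as follows. This is only a sketch, not necessarily what the package author will choose. It assumes that the object stored in VietNamH.rda is really the VietNamH data saved under the wrong name, which has not been confirmed. The helpers dir_rda() and rename_in_rda() are hypothetical names, and the example runs on toy files in a temporary directory rather than on the real Ecdat data.

```r
# Hypothetical helper: list the object names stored in one .rda file,
# loading into a fresh environment so nothing else can leak in.
dir_rda <- function(rdaFile) {
  e <- new.env(parent = emptyenv())
  load(rdaFile, envir = e)
  ls(e, all.names = TRUE)
}

# Hypothetical helper: re-save the single object in 'rdaFile' under 'newName'.
rename_in_rda <- function(rdaFile, newName) {
  e <- new.env(parent = emptyenv())
  old <- load(rdaFile, envir = e)      # load() returns the loaded names
  stopifnot(length(old) == 1L)         # only handle one-object files
  out <- new.env(parent = emptyenv())
  assign(newName, get(old, envir = e), envir = out)
  save(list = newName, file = rdaFile, envir = out)
  invisible(old)
}

# Toy reproduction in a temporary directory (not the real Ecdat files).
tmp <- tempfile("rda-demo")
dir.create(tmp)
MedExp <- data.frame(med = c(1.5, 2.5))
save(MedExp, file = file.path(tmp, "MedExp.rda"))
MedExp <- data.frame(lnmed = c(3, 4, 5))   # different data, same object name
save(MedExp, file = file.path(tmp, "VietNamH.rda"))

rdas <- list.files(tmp, pattern = "\\.rda$", full.names = TRUE)
names(rdas) <- basename(rdas)

# Names that occur in more than one file before the repair.
before <- table(unlist(lapply(rdas, dir_rda)))
dup_before <- names(before)[before > 1]

# Rename the object inside VietNamH.rda, then repeat the check.
rename_in_rda(file.path(tmp, "VietNamH.rda"), "VietNamH")
after <- table(unlist(lapply(rdas, dir_rda)))
dup_after <- names(after)[after > 1]
```

If the two files really do hold different datasets, as the all.equal() output suggests, renaming keeps both. If one were a stale copy of the other, deleting it would be the simpler fix.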
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
>> -----Original Message-----
>> From: Spencer Graves [mailto:spencer.graves at prodsyse.com]
>> Sent: Wednesday, May 22, 2013 1:27 PM
>> To: William Dunlap
>> Cc: r-devel at r-project.org
>> Subject: Re: [Rd] Objects created by more than one data call?
>>
>> On 5/21/2013 3:03 PM, William Dunlap wrote:
>>> If you look at
>>> data(package="Ecdat")$results[,"Item"]
>>> you will see the items "Hstarts", "Hstarts (Intratesm)", and "Hstarts (Intratesq)"
>>> which I think means that the dataset Hstarts is found in 3 .rda files, "Hstarts.rda",
>>> "Intratesq.rda", and "Intratesm.rda". There are duplicate, modulo (filename),
>>> items for "MedExp" as well.
>>
>> Thanks for this. It may get me closer, but I still don't see it:
>> (data(Intratesm)) imports only the object Intratesm, etc. For more
>> details, see below.
>>
>>
>> Any other suggestions?
>>
>>
>> Thanks again,
>> Spencer
>>
>>
>> > Ecdat.data <- data(package="Ecdat")$results
>> > (Hstarts2 <- grep('Hstarts', Ecdat.data[, 'Item']))
>> [1] 47 48 49
>> > (MedExp2 <- grep('MedExp', Ecdat.data[, 'Item']))
>> [1] 67 68
>> > Ecdat.data[Hstarts2, ]
>> Package LibPath Item
>> [1,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "Hstarts"
>> [2,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "Hstarts (Intratesm)"
>> [3,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "Hstarts (Intratesq)"
>> Title
>> [1,] "Housing Starts"
>> [2,] "Housing Starts"
>> [3,] "Housing Starts"
>> > Ecdat.data[MedExp2,]
>> Package LibPath Item
>> [1,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "MedExp"
>> [2,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "MedExp (VietNamH)"
>> Title
>> [1,] "Structure of Demand for Medical Care"
>> [2,] "Structure of Demand for Medical Care"
>> > library(Ecdat)
>> > (data(Intratesm))
>> [1] "Intratesm"
>> > (data(Intratesq))
>> [1] "Intratesq"
>>
>>
>>> Bill Dunlap
>>> Spotfire, TIBCO Software
>>> wdunlap tibco.com
>>>
>>>
>>>> -----Original Message-----
>>>> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On Behalf Of Spencer Graves
>>>> Sent: Tuesday, May 21, 2013 12:21 PM
>>>> To: Prof Brian Ripley
>>>> Cc: r-devel at r-project.org
>>>> Subject: Re: [Rd] Objects created by more than one data call?
>>>>
>>>> On 5/21/2013 9:03 AM, Prof Brian Ripley wrote:
>>>>> On 21/05/2013 16:51, Spencer Graves wrote:
>>>>>> On 5/21/2013 7:47 AM, Prof Brian Ripley wrote:
>>>>>>> On 21/05/2013 15:28, Spencer Graves wrote:
>>>>>>>> On 5/20/2013 10:10 PM, Prof Brian Ripley wrote:
>>>>>>>>> On 21/05/2013 00:12, Spencer Graves wrote:
>>>>>>>>>> Hello, All:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> If I do not use LazyData with the Ecdat package on R-Forge, "R CMD
>>>>>>>>>> check" reports "no visible binding for global variable
>>>>>>>>>> 'nonEnglishNames'", where 'nonEnglishNames' is a dataset in Ecdat
>>>>>>>>>> used
>>>>>>>>>> as the default argument for a function. With LazyData, that NOTE
>>>>>>>>>> disappears. However, then I get, "Warning: objects 'Hstarts',
>>>>>>>>>> 'Hstarts', 'MedExp' are created by more than one data call".
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> What do you suggest I do to fix this problem?
>>>>>>>>> Not create the objects in more than one data() call.
>>>>>>>>>
>>>>>>>>> Check what each of your data() calls produces.
>>>>>>>> Thanks. How do I do that?
>>>>>>> Call data() on each in turn, and see what files get added to an empty
>>>>>>> workspace.
>>>>>> Like the following?
>>>>> You missed the 'empty'. Look at tools:::data2LazyLoadDB to see how
>>>>> this is checked.
>>>> Thanks for the suggestion. Unfortunately, I tried that function,
>>>> including stepping through it line by line, fixing references to other
>>>> functions not exported from tools, without enlightenment; see below.
>>>>
>>>>
>>>> Thanks again,
>>>> Spencer
>>>>
>>>>
>>>> > lib.loc = NULL
>>>> > package='Ecdat'
>>>> > pkgpath <- find.package(package, lib.loc, quiet = TRUE)
>>>> > pkgpath
>>>> [1] "C:/Users/sgraves/pgms/R/R-3.0.0/library/Ecdat"
>>>> > dataDir <- file.path(pkgpath, "data")
>>>> > dataDir
>>>> [1] "C:/Users/sgraves/pgms/R/R-3.0.0/library/Ecdat/data"
>>>> > enc <- tools:::.read_description(file.path(pkgpath, "DESCRIPTION"))["Encoding"]
>>>> > enc
>>>> <NA>
>>>> NA
>>>> > if (!is.na(enc)) {
>>>> + op <- options(encoding = enc)
>>>> + on.exit(options(encoding = op[[1L]]))
>>>> + }
>>>> > file_test("-d", dataDir)
>>>> [1] TRUE
>>>> > file.path(dataDir, "Rdata.rds")
>>>> [1] "C:/Users/sgraves/pgms/R/R-3.0.0/library/Ecdat/data/Rdata.rds"
>>>> > (file.exists(file.path(dataDir, "Rdata.rds")) &&
>>>> + file.exists(file.path(dataDir, paste(package, "rdx", sep = "."))) &&
>>>> + file.exists(file.path(dataDir, paste(package, "rdb", sep = "."))))
>>>> [1] FALSE
>>>> > file.exists(file.path(dataDir,
>>>> + paste(package, "rdx", sep = ".")))
>>>> [1] FALSE
>>>> > file.path(dataDir,
>>>> + paste(package, "rdx", sep = "."))
>>>> [1] "C:/Users/sgraves/pgms/R/R-3.0.0/library/Ecdat/data/Ecdat.rdx"
>>>> > dataEnv <- new.env(hash = TRUE)
>>>> > tmpEnv <- new.env()
>>>> > f0 <- files <- list_files_with_type(dataDir, "data")
>>>> Error: could not find function "list_files_with_type"
>>>> > f0 <- files <- tools:::list_files_with_type(dataDir, "data")
>>>> > files <- unique(basename(file_path_sans_ext(files,
>>>> + TRUE)))
>>>> Error in basename(file_path_sans_ext(files, TRUE)) :
>>>> could not find function "file_path_sans_ext"
>>>> > files <- unique(basename(tools:::file_path_sans_ext(files,
>>>> + TRUE)))
>>>> > dlist <- vector("list", length(files))
>>>> > files
>>>> character(0)
>>>> > names(dlist) <- files
>>>> > loaded <- character(0L)
>>>> > loaded
>>>> character(0)
>>>> > for (f in files) {
>>>> + utils::data(list = f, package = package, lib.loc = lib.loc,
>>>> + envir = dataEnv)
>>>> + utils::data(list = f, package = package, lib.loc = lib.loc,
>>>> + envir = tmpEnv)
>>>> + tmp <- ls(envir = tmpEnv, all.names = TRUE)
>>>> + rm(list = tmp, envir = tmpEnv)
>>>> + dlist[[f]] <- tmp
>>>> + loaded <- c(loaded, tmp)
>>>> + }
>>>> > dup <- duplicated(loaded)
>>>> > dup
>>>> logical(0)
>>>> > if (any(dup))
>>>> + warning(sprintf(ngettext(sum(dup),
>>>> + "object %s is created by more than one data call",
>>>> + "objects %s are created by more than one data call"),
>>>> + paste(sQuote(loaded[dup]), collapse = ", ")),
>>>> + call. = FALSE, domain = NA)
>>>> > if (length(loaded)) {
>>>> + dbbase <- file.path(dataDir, "Rdata")
>>>> + makeLazyLoadDB(dataEnv, dbbase, compress = compress)
>>>> + saveRDS(dlist, file.path(dataDir, "Rdata.rds"),
>>>> + compress = compress)
>>>> + unlink(f0)
>>>> + if (file.exists(file.path(dataDir, "filelist")))
>>>> + unlink(file.path(dataDir, c("filelist", "Rdata.zip")))
>>>> + }
>>>> >
>>>>
>>>>>> > library(Ecdat)
>>>>>> > objects()
>>>>>> character(0)
>>>>>> > (data(Hstarts))
>>>>>> [1] "Hstarts"
>>>>>> > (data(MedExp))
>>>>>> [1] "MedExp"
>>>>>> > objects()
>>>>>> [1] "Hstarts" "MedExp"
>>>>>> > sessionInfo()
>>>>>> R version 3.0.0 (2013-04-03)
>>>>>> Platform: i386-w64-mingw32/i386 (32-bit)
>>>>>>
>>>>>> locale:
>>>>>> [1] LC_COLLATE=English_United States.1252
>>>>>> [2] LC_CTYPE=English_United States.1252
>>>>>> [3] LC_MONETARY=English_United States.1252
>>>>>> [4] LC_NUMERIC=C
>>>>>> [5] LC_TIME=English_United States.1252
>>>>>>
>>>>>> attached base packages:
>>>>>> [1] stats graphics grDevices utils datasets methods base
>>>>>>
>>>>>> other attached packages:
>>>>>> [1] Ecdat_0.2-3
>>>>>>
>>>>>> loaded via a namespace (and not attached):
>>>>>> [1] tools_3.0.0
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Spencer
>>>>>>
>>>>>>>> In the "man" directory, I just did "grep 'data(MedExp' *.Rd",
>>>>>>>> which identified only "MedExp.Rd:\usage{data(MedExp)}"; "grep
>>>>>>>> 'data(Hstarts' *.Rd" similarly returned only
>>>>>>>> "Hstarts.Rd:\usage{data(Hstarts)}".
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks again for the reply.
>>>>>>>> Spencer
>>>>>>>>>> Thanks,
>>>>>>>>>> Spencer Graves
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> > sessionInfo()
>>>>>>>>>> R version 3.0.0 (2013-04-03)
>>>>>>>>>> Platform: i386-w64-mingw32/i386 (32-bit)
>>>>>>>>>>
>>>>>>>>>> locale:
>>>>>>>>>> [1] LC_COLLATE=English_United States.1252
>>>>>>>>>> [2] LC_CTYPE=English_United States.1252
>>>>>>>>>> [3] LC_MONETARY=English_United States.1252
>>>>>>>>>> [4] LC_NUMERIC=C
>>>>>>>>>> [5] LC_TIME=English_United States.1252
>>>>>>>>>>
>>>>>>>>>> attached base packages:
>>>>>>>>>> [1] stats graphics grDevices utils datasets methods base
>>>>>>>>>>
>>>>>>>>>> other attached packages:
>>>>>>>>>> [1] Ecdat_0.2-3
>>>>>>>>>>
>>>>>>>>>> loaded via a namespace (and not attached):
>>>>>>>>>> [1] tools_3.0.0
>>>>>>>>>>
>>>>>>>>>> ______________________________________________
>>>>>>>>>> R-devel at r-project.org mailing list
>>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel