[Rd] as.data.frame.table() does not recognize default.stringsAsFactors()

Mychaleckyj, Josyf C (jcm6t) jcm6t @end|ng |rom v|rg|n|@@edu
Tue Mar 12 21:39:31 CET 2019


Reporting a possible inconsistency or bug in handling stringsAsFactors in as.data.frame.table()

Here is a simple test

> options()$stringsAsFactors
[1] TRUE
> x<-c("a","b","c","a","b")
> d<-as.data.frame(table(x))
> d
  x Freq
1 a    2
2 b    2
3 c    1
> class(d$x)
[1] "factor"
> d2<-as.data.frame(table(x),stringsAsFactors=F)
> class(d2$x)
[1] “character"
> options(stringsAsFactors=F)
> options()$stringsAsFactors
[1] FALSE
> d3<-as.data.frame(table(x))
> d3
  x Freq
1 a    2
2 b    2
3 c    1
> class(d3$x)
[1] “factor"
> d4<-as.data.frame(table(x),stringsAsFactors=F)
> class(d4$x)
[1] “character"


# Display the code showing the different  stringsAsFactors handling in table and matrix:

> as.data.frame.table
function (x, row.names = NULL, ..., responseName = "Freq", stringsAsFactors = TRUE,
    sep = "", base = list(LETTERS))
{
    ex <- quote(data.frame(do.call("expand.grid", c(dimnames(provideDimnames(x,
        sep = sep, base = base)), KEEP.OUT.ATTRS = FALSE, stringsAsFactors = stringsAsFactors)),
        Freq = c(x), row.names = row.names))
    names(ex)[3L] <- responseName
    eval(ex)
}
<bytecode: 0x28769f8>
<environment: namespace:base>

> as.data.frame.matrix
function (x, row.names = NULL, optional = FALSE, make.names = TRUE,
    ..., stringsAsFactors = default.stringsAsFactors())
{
    d <- dim(x)
    nrows <- d[[1L]]
    ncols <- d[[2L]]
    ic <- seq_len(ncols)
    dn <- dimnames(x)
    if (is.null(row.names))
        row.names <- dn[[1L]]
    collabs <- dn[[2L]]
    if (any(empty <- !nzchar(collabs)))
        collabs[empty] <- paste0("V", ic)[empty]
    value <- vector("list", ncols)
    if (mode(x) == "character" && stringsAsFactors) {
        for (i in ic) value[[i]] <- as.factor(x[, i])
    }
    else {
        for (i in ic) value[[i]] <- as.vector(x[, i])
    }
    autoRN <- (is.null(row.names) || length(row.names) != nrows)
    if (length(collabs) == ncols)
        names(value) <- collabs
    else if (!optional)
        names(value) <- paste0("V", ic)
    class(value) <- "data.frame"
    if (autoRN)
        attr(value, "row.names") <- .set_row_names(nrows)
    else .rowNamesDF(value, make.names = make.names) <- row.names
    value
}
<bytecode: 0x29995c0>
<environment: namespace:base>


> sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS: /usr/lib64/libblas.so.3.4.2
LAPACK: /usr/lib64/liblapack.so.3.4.2

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.5.2 tools_3.5.2

Thanks,
Joe



	[[alternative HTML version deleted]]



More information about the R-devel mailing list