[Rd] inconsistent behavior for logical vectors when using apply (" TRUE")

Tony Plate tplate at acm.org
Wed Nov 4 21:25:43 CET 2009


This happens in as.matrix(), which gets called by apply().

When you've got a mixed-mode data frame like this, as.matrix() converts 
everything to character.  But the rules it uses for each column don't 
seem to be entirely consistent about whether a column is space-padded 
so that all of its elements have the same number of characters.

The way it works out, logical mode columns are passed through format(), 
which space-pads by default.  Factor columns are passed through 
as.vector(), and character-mode columns are left alone.  The result is 
that some columns come out space-padded, and some don't, depending on 
their original mode.

To get greater control of this, you theoretically should be able to do 
something like apply(as.matrix(format(X, justify="none")), ...),
except that format() seems to ignore the justify argument for logical 
vectors, e.g.:
 > format(c(T,F,T))
[1] " TRUE" "FALSE" " TRUE"
 > format(c(T,F,T), justify="none")
[1] " TRUE" "FALSE" " TRUE"
 >
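
Since justify= has no effect here, one possible workaround is to strip 
the padding after the fact.  This is only a minimal sketch; the variable 
names are mine, and it assumes the padding is always leading spaces, as 
in the output above:

```r
flags <- c(TRUE, FALSE, TRUE)

# format() pads the logical values to a common width
padded <- format(flags)

# Remove any leading spaces that format() added
unpadded <- sub("^ +", "", padded)
print(unpadded)
```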

If it's really important for you to get this to work the way you want, 
you can convert the logical column of the data frame using as.character 
(see the end of the example below).

Here's an example that shows probably far more than you wanted to know:

 > X <- data.frame(letters=letters[1:3], flag=c(TRUE, FALSE, TRUE),
 +                 codef=c("a","ab","abcd"), codec=I(c("x", "xy", "xyz")))
 > sapply(X, class)
  letters      flag     codef     codec
 "factor" "logical"  "factor"    "AsIs"
 > as.matrix(X)
     letters flag    codef  codec
[1,] "a"     " TRUE" "a"    "x" 
[2,] "b"     "FALSE" "ab"   "xy"
[3,] "c"     " TRUE" "abcd" "xyz"
 > unclass(format(X))
$letters
[1] "a" "b" "c"

$flag
[1] " TRUE" "FALSE" " TRUE"

$codef
[1] "a"    "ab"   "abcd"

$codec
[1] "x"   "xy"  "xyz"

attr(,"row.names")
[1] "1" "2" "3"
 > unclass(format(X, justify="left"))
$letters
[1] "a" "b" "c"

$flag
[1] " TRUE" "FALSE" " TRUE"

$codef
[1] "a   " "ab  " "abcd"

$codec
[1] "x  " "xy " "xyz"

attr(,"row.names")
[1] "1" "2" "3"
 >
 > # The only way I can see to get the logical column converted to
 > # character without padding:
 > X1 <- X
 > X1$flag <- as.character(X1$flag)
 > as.matrix(X1)
     letters flag    codef  codec
[1,] "a"     "TRUE"  "a"    "x" 
[2,] "b"     "FALSE" "ab"   "xy"
[3,] "c"     "TRUE"  "abcd" "xyz"
 >
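
If there are several logical columns, the same conversion can be wrapped 
up in a small function.  The sketch below is only an illustration: 
unpad_logicals() is a hypothetical helper of my own, not something in 
base R.  The second half shows an alternative that avoids as.matrix() 
altogether by indexing the rows directly, so each value keeps its 
original type.

```r
# Hypothetical helper (not part of base R): convert every logical column
# to character so that as.matrix() has nothing left to pad.
unpad_logicals <- function(df) {
  is_lgl <- vapply(df, is.logical, logical(1))
  if (any(is_lgl)) {
    df[is_lgl] <- lapply(df[is_lgl], as.character)
  }
  df
}

X <- data.frame(letters = letters[1:3], flag = c(TRUE, FALSE, TRUE))

# apply() still goes through as.matrix(), but the flag column is already
# character, so its values come through without a leading space
res <- apply(unpad_logicals(X), 1, as.list)
print(res[[1]]$flag)

# Alternative: build the per-row lists without apply(), so no conversion
# to a character matrix happens and flag stays logical
rows <- lapply(seq_len(nrow(X)), function(i) as.list(X[i, , drop = FALSE]))
print(rows[[1]]$flag)
```

The second form is probably closer to what apply(X, 1, as.list) was 
meant to do, since the values are not all coerced to character.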

Adrian Dragulescu wrote:
>
> Hello,
>
>> X <- data.frame(letters=letters[1:3], flag=c(TRUE, FALSE, TRUE))
>> X
>   letters  flag
> 1       a  TRUE
> 2       b FALSE
> 3       c  TRUE
>> apply(X, 1, as.list)
> [[1]]
> [[1]]$letters
> [1] "a"
>
> [[1]]$flag
> [1] " TRUE"
>
>
> [[2]]
> [[2]]$letters
> [1] "b"
>
> [[2]]$flag
> [1] "FALSE"
>
>
> [[3]]
> [[3]]$letters
> [1] "c"
>
> [[3]]$flag
> [1] " TRUE"
>
> Notice how TRUE becomes " TRUE" and FALSE becomes "FALSE".  Not sure 
> why TRUE gets an extra whitespace in front.
>
> Checked with R-2.10.0, but can reproduce the behavior as far back as 
> R-2.8.1.
>
> Adrian Dragulescu
>
>> sessionInfo()
> R version 2.10.0 (2009-10-26)
> i386-pc-mingw32
>
> locale:
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] tools_2.10.0


