[R] bug(?) in str() with strict.width = "cut" when applied to dataframe with numeric component AND factor or character component with longer levels/strings

Gerrit Eichner Gerrit.Eichner at math.uni-giessen.de
Tue Oct 15 13:53:12 CEST 2013


Dear list subscribers,

here is a small artificial example to demonstrate the problem that I 
encountered when looking at the structure of a (larger) data frame that 
comprised (among other components)

a numeric component whose elements are of the order of 10000 or larger, and

a factor or character component with longer levels/strings:


k <- 43      # length of levels or character strings
n <- 11      # number of rows of data frame
M <- 10000   # order of magnitude of numerical values

set.seed( 47) # to reproduce the following artificial character string
longer.char.string <- paste( sample( letters, k, replace = TRUE),
                              collapse = "")

X <- data.frame( A = 1:n * M,
                  B = rep( longer.char.string, n))


The following call to str() apparently gives a wrong result (the line for 
component A is printed twice, and component B does not appear at all):

str( X, strict.width = "cut")

'data.frame':   11 obs. of  2 variables:
  $ A: num  1e+04 2e+04 3e+04 4e+04 5e+04 6e+04 7e+04 8e+04 9e+04 1e+..
  $ A: num  1e+04 2e+04 3e+04 4e+04 5e+04 6e+04 7e+04 8e+04 9e+04 1e+..


whereas the correct result appears for str( X), or if you decrease k to 42 
(isn't that "the answer"? ;-) ), n to 10, or M to 1000 (or smaller, 
respectively).
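
To see which of the parameter combinations mentioned above trigger the 
problem on a given R installation, the example can be wrapped in a small 
check. This is only a minimal sketch: make_X() and shows_B() are 
hypothetical helper names introduced here for illustration, and the 
result may well depend on the R version and on getOption("width").

```r
# Hypothetical helpers, not part of the original example.

# Rebuild the example data frame for given k, n and M.
make_X <- function(k = 43, n = 11, M = 10000, seed = 47) {
  set.seed(seed)
  s <- paste(sample(letters, k, replace = TRUE), collapse = "")
  data.frame(A = 1:n * M, B = rep(s, n))
}

# TRUE if the captured str() output contains a line for component B.
shows_B <- function(X, ...) {
  out <- capture.output(str(X, ...))
  any(grepl("$ B:", out, fixed = TRUE))
}

# Try the parameter values mentioned above, with and without strict.width.
grid <- expand.grid(k = c(42, 43), n = c(10, 11), M = c(1000, 10000))
grid$cut_ok <- mapply(function(k, n, M)
  shows_B(make_X(k, n, M), strict.width = "cut"), grid$k, grid$n, grid$M)
grid$default_ok <- mapply(function(k, n, M)
  shows_B(make_X(k, n, M)), grid$k, grid$n, grid$M)
print(grid)
```

Rows with cut_ok equal to FALSE are the combinations for which the line 
for B is missing from the output of str( X, strict.width = "cut").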


I tried to dig into the entrails of str.default(), where the cause may 
lie, but got lost pretty soon. So, I am hoping that someone may already 
have a work-around or patch (or dares to dig further)? Thank you for any 
feedback!

  Best regards  --  Gerrit

PS:

> sessionInfo()

R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets
[7] methods   base

other attached packages:
[1] nparcomp_2.0     multcomp_1.2-21  mvtnorm_0.9-9996
[4] car_2.0-19       Hmisc_3.12-2     Formula_1.1-1
[7] survival_2.37-4  fortunes_1.5-0

loaded via a namespace (and not attached):
[1] cluster_1.14.4  grid_3.0.2      lattice_0.20-23 MASS_7.3-29
[5] nnet_7.3-7      rpart_4.1-3     stats4_3.0.2    tools_3.0.2

---------------------------------------------------------------------
Dr. Gerrit Eichner                   Mathematical Institute, Room 212
gerrit.eichner at math.uni-giessen.de   Justus-Liebig-University Giessen
Tel: +49-(0)641-99-32104          Arndtstr. 2, 35392 Giessen, Germany
Fax: +49-(0)641-99-32109        http://www.uni-giessen.de/cms/eichner
