[Rd] Unexplicable difference between 2 R installations regarding reading numbers

Simon Urbanek simon.urbanek at r-project.org
Mon Nov 3 16:41:19 CET 2014


This depends on the R version.

NEWS for 3.1.0:

      type.convert() (and hence by default
      read.table()) returns a character vector or factor when
      representing a numeric input as a double would lose accuracy.
      Similarly for complex inputs.
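
For a data frame that has already been read this way, the affected columns
can be converted back, along the lines of the as.numeric() call mentioned in
the report below. A minimal sketch, assuming a hand-built factor as a
stand-in for one column (the names f, codes and vals are made up for
illustration):

```r
# Hypothetical stand-in for one affected column, built by hand so the
# example does not depend on the R version or on the original .csv file.
f <- factor(c("11.98554898321271", "18.87978032658098", "11.98554898321271"))

# as.numeric(f) returns the integer level codes, not the values.
codes <- as.numeric(f)

# Convert the level labels to numbers, then index them by the codes.
vals <- as.numeric(levels(f))[f]
```

Note that as.numeric() applied directly to a factor gives the level codes,
so the conversion has to go through levels() or as.character().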

NEWS for 3.1.1:

      type.convert(), read.table() and similar
      read.*() functions get a new numerals argument,
      specifying how numeric input is converted when its conversion to
      double precision loses accuracy.  The default value,
      "allow.loss", allows accuracy loss, as in R versions before
      3.1.0.
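
The effect of the numerals argument can be tried directly on strings like
the ones in the report. This is only a sketch and assumes R >= 3.1.1, where
the argument exists; the inline file passed via text= is made up for
illustration:

```r
# Values taken from the levels() output in the report; each has 17
# significant digits.
x <- c("11.140969600635804", "4.8696848424212105")

# "allow.loss" (the default since 3.1.1): convert to double, accept rounding.
a <- type.convert(x, as.is = TRUE, numerals = "allow.loss")

# "no.loss": do not convert when accuracy would be lost; with as.is = TRUE
# the input stays a character vector instead of becoming a factor.
b <- type.convert(x, as.is = TRUE, numerals = "no.loss")

# read.csv() passes the same argument on to read.table().
# The two-row inline file is hypothetical.
d <- read.csv(text = "ID,Cd5\n1,11.140969600635804\n2,4.8696848424212105",
              numerals = "allow.loss")
```

With "warn.loss" the conversion happens as with "allow.loss", but a warning
is signalled.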


On Nov 3, 2014, at 10:07 AM, Joris Meys <jorismeys at gmail.com> wrote:

> Dear all,
> 
> A colleague of mine reported a problem that I fail to understand
> completely. He has a number of .csv files that all look very
> straightforward, and they all read in perfectly well using read.csv() on
> both his and my computer.
> 
> When we try the exact same R version on the university server however,
> suddenly all numeric variables turn into factors. The problem is resolved
> by deleting the last digits of every number in the .csv file.  Using
> as.numeric() on the values works as well.
> 
> Does anybody have a clue as to what might cause this problem? If needed, I
> can send an example of a .csv file.
> 
> Example output on server:
> 
>> X <- read.csv("Originelen/Originelen/heavymetals.csv")
>> levels(X[[2]])
> [1] "11.140969600635804" "11.548972671055257" "11.98554898321271"
> [4] "16.317868213178677" "17.179218967921898" "18.596573461949852"
> [7] "18.786014405762298" "18.87978032658098"  "23.604106448719225"
> [10] "26.75482955698816"  "27.33829851044687"  "29.26619704952923"
> [13] "33.07842352705811"  "39.296270581233884" "4.8696848424212105"
> [16] "5.5751725517655295" "6.0256909109049195" "9.117975845892804"
> [19] "9.26944194868723"
>> str(X)
> 'data.frame':   19 obs. of  18 variables:
> $ ID   : int  1 2 3 4 5 6 7 8 9 10 ...
> $ Cd5  : Factor w/ 19 levels "11.140969600635804",..: 3 8 6 12 11 10 2 5 14 13 ...
> $ Cd20 : Factor w/ 19 levels "10.160499999999999",..: 2 8 10 12 5 6 18 9 11 4 ...
> $ Cr5  : Factor w/ 19 levels "118.43421710855425",..: 6 11 10 17 16 15 7 13 19 18 ...
> $ Cr20 : Factor w/ 19 levels "100.48101898101898",..: 9 15 14 17 13 11 6 16 18 12 ...
> $ Cu5  : Factor w/ 19 levels "101.8005401620486",..: 8 17 16 15 14 12 9 18 19 1 ...
> $ Cu20 : Factor w/ 19 levels "103.67346938775509",..: 11 18 19 2 16 17 14 3 4 1 ...
> $ Fe5  : Factor w/ 19 levels "17239.349496158833",..: 3 8 10 9 12 14 7 16 19 18 ...
> $ Fe20 : Factor w/ 19 levels "17701.77893264042",..: 3 14 16 18 10 15 6 17 19 13 ...
> $ Mn5  : Factor w/ 19 levels "440.37211163349",..: 10 14 4 5 3 17 2 7 18 6 ...
> $ Mn20 : Factor w/ 19 levels "375.19156134938805",..: 12 2 6 3 1 9 11 7 8 5 ...
> $ Ni5  : Factor w/ 19 levels "19.54255213010077",..: 4 12 8 10 11 16 6 14 19 18 ...
> $ Ni20 : Factor w/ 19 levels "21.295222866280234",..: 8 13 15 18 12 16 7 17 19 14 ...
> $ Pb5  : Factor w/ 19 levels "125.5616926977306",..: 1 11 14 9 13 8 5 12 15 16 ...
> $ Pb20 : Factor w/ 19 levels "106.96930306969303",..: 3 8 11 12 9 10 4 13 14 15 ...
> $ Zn5  : Factor w/ 19 levels "1024.909963985594",..: 17 4 7 5 8 3 18 6 9 10 ...
> $ Zn20 : Factor w/ 19 levels "1247.816195886593",..: 15 4 5 7 2 1 16 6 8 3 ...
> $ river: int  1 1 1 1 1 1 1 1 1 1 ...
> 
> Using as.numeric(levels(X[[2]])) works perfectly fine though...
> 
> Session info for both the server and my own computer:
> 
>> sessionInfo()
> R version 3.1.0 (2014-04-10)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> 
> locale:
> [1] LC_COLLATE=Dutch_Belgium.1252  LC_CTYPE=Dutch_Belgium.1252
> [3] LC_MONETARY=Dutch_Belgium.1252 LC_NUMERIC=C
> [5] LC_TIME=Dutch_Belgium.1252
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> 
> loaded via a namespace (and not attached):
> [1] tools_3.1.0
> 
> -- 
> Joris Meys
> Statistical consultant
> 
> Ghent University
> Faculty of Bioscience Engineering
> Department of Mathematical Modelling, Statistics and Bio-Informatics
> 
> tel :  +32 (0)9 264 61 79
> Joris.Meys at Ugent.be
> -------------------------------
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


