[R] Read_fwf in package readr, double vs. numeric
Doran, Harold
HDoran using air.org
Wed Apr 24 17:37:47 CEST 2019
Thank you, Sarah. Updating to a newer version does indeed solve the problem. For completeness, the first sessionInfo() below is from the setup in which it works properly, and the second is from the setup in which I observe the problem I described.
> sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] readr_1.3.1
loaded via a namespace (and not attached):
[1] compiler_3.5.3 assertthat_0.2.1 R6_2.4.0 cli_1.1.0 hms_0.4.2
[6] tools_3.5.3 pillar_1.3.1 tibble_2.1.1 Rcpp_1.0.1 crayon_1.3.4
[11] utf8_1.1.4 fansi_0.4.0 pkgconfig_2.0.2 rlang_0.3.4
> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] readr_1.1.1
loaded via a namespace (and not attached):
[1] compiler_3.4.2 assertthat_0.2.0 R6_2.2.2 cli_1.0.0 hms_0.3 tools_3.4.2
[7] pillar_1.3.0 tibble_1.4.2 Rcpp_1.0.0 crayon_1.3.4 utf8_1.1.4 fansi_0.2.3
[13] rlang_0.3.0.1
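The workaround raised in the original question (treating every N in the layout file as a double) can be sketched as follows for anyone who cannot upgrade readr. This is a minimal sketch, not a definitive fix: the temporary file, the variable name dat, and the version check against 1.3.1 are assumptions for illustration, and 1.3.1 is only the version reported to work in this thread, not a known boundary. The three data lines are the ones from the original post.

```r
library(readr)

# Note when running an older readr than the one reported to work in this
# thread (1.3.1 is an assumption taken from the sessionInfo() output above)
if (packageVersion("readr") < "1.3.1") {
  message("readr is older than 1.3.1; mapping N to 'd' as a workaround")
}

# Recreate the three-line fixed-width sample in a temporary file
myFile <- tempfile(fileext = ".txt")
writeLines(c("11e-201043", "1712201043", "1912201055"), myFile)

pos <- fwf_positions(c(1, 2, 7), c(1, 6, 10))

# Column types exactly as the legacy layout file reports them
type <- c("N", "N", "N")

# Map both N and D to readr's compact code "d" (col_double), and C to "c"
types <- chartr("NCD", "dcd", paste0(type, collapse = ""))

dat <- read_fwf(file = myFile, col_positions = pos, col_types = types)
print(dat)
```

In the original post the column typed D read 1e-20 correctly, so sending N down the same route sidesteps the difference between versions. One trade-off to keep in mind: col_double() is stricter than col_number(), so a column containing grouping marks or currency symbols would no longer parse under this mapping.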
-----Original Message-----
From: Sarah Goslee <sarah.goslee using gmail.com>
Sent: Wednesday, April 24, 2019 11:12 AM
To: Doran, Harold <HDoran using air.org>
Cc: r-help using r-project.org
Subject: Re: [R] Read_fwf in package readr, double vs. numeric
Hi,
I can't reproduce your problem: with readr 1.3.1 on Linux, it works as expected. Letting read_fwf guess the types also works fine. (See below.)
If you aren't running the current version of readr, update and retry.
If you are, then we probably need more info, at least sessionInfo().
Sarah
library(readr)
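myFile <- "foo.txt"
# Recreate the three-line sample file from the original post
writeLines(c("11e-201043", "1712201043", "1912201055"), myFile)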
pos <- fwf_positions(c(1,2,7), c(1,6,10))
type <- c('N','D','N')
types <- paste0(type, collapse = '')
types <- chartr('NCD', 'ncd', types)
read_fwf(file = myFile, col_positions = pos, col_types = types)
# A tibble: 3 x 3
X1 X2 X3
<dbl> <dbl> <dbl>
1 1 1.00e-20 1043
2 1 7.12e+ 4 1043
3 1 9.12e+ 4 1055
type <- c('N','N','N')
types <- paste0(type, collapse = '')
types <- chartr('NCD', 'ncd', types)
read_fwf(file = myFile, col_positions = pos, col_types = types)
# A tibble: 3 x 3
X1 X2 X3
<dbl> <dbl> <dbl>
1 1 1.00e-20 1043
2 1 7.12e+ 4 1043
3 1 9.12e+ 4 1055
> read_fwf(file = myFile, col_positions = pos, col_types = NULL)
Parsed with column specification:
cols(
X1 = col_double(),
X2 = col_double(),
X3 = col_double()
)
# A tibble: 3 x 3
X1 X2 X3
<dbl> <dbl> <dbl>
1 1 1.00e-20 1043
2 1 7.12e+ 4 1043
3 1 9.12e+ 4 1055
> sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Fedora 28 (Workstation Edition)
Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] readr_1.3.1 colorout_1.2-0
loaded via a namespace (and not attached):
[1] compiler_3.5.3 assertthat_0.2.0 R6_2.4.0 cli_1.0.1
[5] hms_0.4.2 tools_3.5.3 pillar_1.3.1 tibble_2.0.1
[9] Rcpp_1.0.0 crayon_1.3.4 utf8_1.1.4 fansi_0.4.0
[13] pkgconfig_2.0.2 rlang_0.3.1
On Wed, Apr 24, 2019 at 10:56 AM Doran, Harold <HDoran using air.org> wrote:
>
> Suppose I have the following data sitting in a fwf file 'foo.txt'. The point of this email is to ask the group how to properly read in the value in this pseudo-data "1e-20" using the read_fwf function in the package readr.
>
> 11e-201043
> 1712201043
> 1912201055
>
> First, suppose I do it this way, where in this case "D" is used for double precision.
>
> library(readr)
> pos <- fwf_positions(c(1,2,7), c(1,6,10))
> type <- c('N','D','N')
> types <- paste0(type, collapse = '')
> types <- chartr('NCD', 'ncd', types)
>
> read_fwf(file = myFile, col_positions = pos, col_types = types)
>
> # A tibble: 3 x 3
> X1 X2 X3
> <dbl> <dbl> <dbl>
> 1 1 1.00e-20 1043
> 2 1 7.12e+ 4 1043
> 3 1 9.12e+ 4 1055
>
> This seemingly works well and properly captures the value. However, if
> I instead were to indicate to the function that *all* of my columns
> were numeric (just insert this one line in lieu of the other above)
>
> type <- c('N','N','N')
>
> # A tibble: 3 x 3
> X1 X2 X3
> <dbl> <dbl> <dbl>
> 1 1 1 1043
> 2 1 71220 1043
> 3 1 91220 1055
>
> The values are not read in correctly. Here is the pragmatic issue. I have a legacy program that spits out the layout structure of the fwf file (start, end positions) and also indicates what the column types are. The layout file we receive always uses a column type of numeric (N) for any numeric column (including the column holding values such as 1e-20).
>
> This layout file will not change, so I need to figure out how to solve the problem within my read-in program. I suppose one option is to manually change any values of "N" to "D" in my R code. That seems to work, but I am not sure if that is the "right" way to solve this issue.
>
> Thanks
> Harold
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Sarah Goslee (she/her)
http://www.numberwright.com