[R] readr to generate tibble from a character matrix
David L Carlson
dcarlson at tamu.edu
Thu Apr 6 21:34:58 CEST 2017
Ulrik's solution gives you factors. To get them as characters, add as.is=TRUE:
> m %>%
+ as_tibble() %>%
+ lapply(type.convert, as.is=TRUE) %>%
+ as_tibble()
# A tibble: 4 × 5
A B C D E
<chr> <chr> <chr> <int> <dbl>
1 a e i 1 11.2
2 b f j 2 12.2
3 c g k 3 13.2
4 d h l 4 14.2
Other possibilities:
> mm <- lapply(data.frame(m, stringsAsFactors=FALSE), type.convert, as.is=TRUE)
> as_tibble(mm)
# Your solution simplified by converting to a data.frame
> as_tibble(lapply(as_tibble(m), type.convert, as.is=TRUE))
# Ulrik's solution but without the pipes. Shows why you need 2 as_tibbles()
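A further possibility, sketched here as an aside and not taken from the thread: readr exports type_convert(), which re-guesses the types of the character columns of an existing data frame. This stays within readr's own column guessing without writing a file. The names x and x2 are only for illustration. Whether column D comes back as integer or double depends on the readr version (newer versions guess double unless asked otherwise), so the check below only assumes it is numeric.

```r
library(tibble)
library(readr)

# Same example matrix as in the original question
m <- matrix(c(letters[1:12], 1:4, (11:14 + 0.2)), ncol = 5)
colnames(m) <- LETTERS[1:5]

# All columns start out as character
x <- as_tibble(m)

# Let readr re-guess the type of every character column
x2 <- type_convert(x)

# A, B, C stay character; D and E become numeric
# (D may be integer or double depending on the readr version)
vapply(x2, class, character(1))
```

Because type_convert() returns a tibble when given a tibble, only one call to as_tibble() is needed.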
-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Ben Tupper
Sent: Thursday, April 6, 2017 11:42 AM
To: Ulrik Stervbo <ulrik.stervbo at gmail.com>
Cc: R-help Mailing List <r-help at r-project.org>
Subject: Re: [R] readr to generate tibble from a character matrix
Hi,
Thanks for this solution! Very slick!
I see what you mean about the two calls to as_tibble(). I suppose I could do the following, but I doubt it is a gain...
mm <- lapply(colnames(m), function(nm, m) type.convert(m[,nm], as.is = TRUE), m=m)
names(mm) <- colnames(m)
as_tibble(mm)
# # A tibble: 4 × 5
# A B C D E
# <chr> <chr> <chr> <int> <dbl>
# 1 a e i 1 11.2
# 2 b f j 2 12.2
# 3 c g k 3 13.2
# 4 d h l 4 14.2
I'll benchmark these against writing to a temporary file and against pasting together a string.
Cheers and thanks,
Ben
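For reference, the "pasting together a string" route mentioned above could look roughly like the sketch below. It is an illustration only, assuming readr 2.0 or later (where literal data is wrapped in I() and show_col_types is available) and assuming no cell contains a comma, quote, or newline. The names lines, txt, and x3 are made up for the example.

```r
library(readr)

# Same example matrix as in the original question
m <- matrix(c(letters[1:12], 1:4, (11:14 + 0.2)), ncol = 5)
colnames(m) <- LETTERS[1:5]

# Build CSV text: one header line, then one line per matrix row
lines <- c(paste(colnames(m), collapse = ","),
           apply(m, 1, paste, collapse = ","))
txt <- paste(lines, collapse = "\n")

# I() marks the string as literal data rather than a file path
x3 <- read_csv(I(txt), show_col_types = FALSE)
```

This round-trips every value through text, so it is likely to be slower than converting the columns in place.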
On Apr 6, 2017, at 11:15 AM, Ulrik Stervbo <ulrik.stervbo at gmail.com> wrote:
>
> Hi Ben,
>
> type.convert should do the trick:
>
> m %>%
> as_tibble() %>%
> lapply(type.convert) %>%
> as_tibble()
>
> I am not too happy about the double 'as_tibble', but it gets the job done.
>
> HTH
> Ulrik
>
> On Thu, 6 Apr 2017 at 16:41 Ben Tupper <btupper at bigelow.org> wrote:
> Hello,
>
> I have a workflow that yields a character matrix that I convert to a tibble. Here is a simple example.
>
> library(tibble)
> library(readr)
>
> m <- matrix(c(letters[1:12], 1:4, (11:14 + 0.2)), ncol = 5)
> colnames(m) <- LETTERS[1:5]
>
> x <- as_tibble(m)
>
> # # A tibble: 4 × 5
> # A B C D E
> # <chr> <chr> <chr> <chr> <chr>
> # 1 a e i 1 11.2
> # 2 b f j 2 12.2
> # 3 c g k 3 13.2
> # 4 d h l 4 14.2
>
> The workflow output columns can be a mix drawn from a known set of column outputs. Some of the columns really should be converted to non-character types before I proceed. Right now I explicitly set the column classes with something like this...
>
> mode(x[['D']]) <- 'integer'
> mode(x[['E']]) <- 'numeric'
>
> # # A tibble: 4 × 5
> # A B C D E
> # <chr> <chr> <chr> <int> <dbl>
> # 1 a e i 1 11.2
> # 2 b f j 2 12.2
> # 3 c g k 3 13.2
> # 4 d h l 4 14.2
>
>
> I wonder if there is a way to use the read_* functions in the readr package to read the character matrix into a tibble directly, which would leverage readr's excellent column class guessing. I can see in the vignette ( https://cran.r-project.org/web/packages/readr/vignettes/readr.html ) that I'm not too far off in thinking this could be done (step 1 tantalizingly says 'The flat file is parsed into a rectangular matrix of strings.').
>
> I know that I could either write the matrix to a file or paste it all into a character vector and then use read_* functions, but I confess I am looking for a straighter path by simply passing the matrix to a function like readr::read_matrix() or the like.
>
> Thanks!
> Ben
>
> Ben Tupper
> Bigelow Laboratory for Ocean Sciences
> 60 Bigelow Drive, P.O. Box 380
> East Boothbay, Maine 04544
> http://www.bigelow.org
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.