[Rd] csv version of data in an R object

Barry Rowlingson b.rowlingson at lancaster.ac.uk
Sat Apr 21 21:59:50 CEST 2012


On Sat, Apr 21, 2012 at 3:28 PM, Max Kuhn <mxkuhn at gmail.com> wrote:
> For a package, I need to write a csv version of a data set to an R
> object. Right now, I use:
>
>    out <- capture.output(
>                          write.table(x,
>                                      sep = ",",
>                                      na = "?",
>                                      file = "",
>                                      quote = FALSE,
>                                      row.names = FALSE,
>                                      col.names = FALSE))
>
> To me, this is fairly slow; 131 seconds for a data frame with 8100
> rows and 1400 columns.
>
> The data will be in a data frame; I know write.table() would be faster
> with a matrix. I was looking into converting the data frame to a
> character matrix using as.matrix() or, better yet, format() prior to
> the call above. However, I'm not sure what an appropriate value of
> 'digits' should be so that the character version of numeric data has
> acceptable fidelity.
>
> I also tried using a text connection and sink() as shown in
> ?textConnection but there was no speedup.
>

You could try looping over the rows, using 'paste' to join the
elements of each row with commas. Then use 'paste' again to join the
resulting vector of rows with a '\n' character.

Something like: paste(apply(x,1,paste,collapse=","),collapse="\n")   # untested

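You probably also want to append a final \n to it. Note that this
writes missing values as NA rather than the "?" you asked for, so you
would need to replace those yourself.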

Is it faster? I don't know!
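
A column-wise variant of the same idea, as a rough sketch only. The
function name df_to_csv_string and its 'na' argument are made up for
illustration and are not part of any package. It assumes that
as.character() gives acceptable fidelity for the numeric columns, and
it pastes whole columns together instead of looping over rows. I have
not timed it against the capture.output() version.

```r
# Hypothetical helper: build one CSV string from a data frame
# without going through write.table().
df_to_csv_string <- function(x, na = "?") {
  # Convert each column to character on its own, so values are not
  # padded to a common width the way a whole-frame conversion can be.
  cols <- lapply(x, function(col) {
    out <- as.character(col)
    out[is.na(col)] <- na          # replace missing values with the marker
    out
  })
  # paste() is vectorised: this joins the i-th element of every column
  # with commas, giving one string per row. unname() avoids a clash if
  # a column happens to be called "sep" or "collapse".
  rows <- do.call(paste, c(unname(cols), sep = ","))
  # Join the rows with newlines and add the final newline.
  paste0(paste(rows, collapse = "\n"), "\n")
}

x <- data.frame(a = c(1, NA, 2.5), b = c("u", "v", NA),
                stringsAsFactors = FALSE)
out <- df_to_csv_string(x)
cat(out)
```

The result is a single string; use strsplit(out, "\n")[[1]] if you
want one element per row, as capture.output() gives you.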

Barry


