[Rd] what do you think about write.table(... qmethod = "excel")?

David Winsemius dwinsemius at comcast.net
Wed Sep 20 19:39:50 CEST 2017


> On Sep 20, 2017, at 9:53 AM, Paul Johnson <pauljohn32 at gmail.com> wrote:
> 
> On Tue, Sep 19, 2017 at 4:45 PM, Duncan Murdoch
> <murdoch.duncan at gmail.com> wrote:
>> 
>> 
>> That's true, but if that's what they want to do, and they're willing to pay
>> to be able to write files that imitate Excel, then why not do what they ask?
>> 
>> On the other hand, if they aren't willing to pay for the work, then you
>> should lecture them on how silly their request is.
>> 
>> In any case, base R functions should not include nonsense, so this is not
>> something that should go into R.
>> 
>> Duncan Murdoch
>> 
> 
> I understand.  This is a paying client, I'm going where the money goes.
> 
> Here's my current working example of a function that writes a CSV
> exactly as Excel does. I've posted this into StackOverflow
> (https://stackoverflow.com/questions/25743018/how-to-conditionally-remove-quotes-in-write-csv).
> It is buried under a thread, maybe will not get much attention and
> bug-checking. Oh, well, I can hope. Processes variables
> column-by-column, I don't know how to do it differently.
> 
> 
> 
> ##' Write CSV files with quotes same as MS Excel 2013 or newer
> ##'
> ##' R's write.csv inserts quotes around all elements in a character
> ##' vector (if quote = TRUE).  In contrast, MS Excel CSV export no
> ##' longer inserts quotation marks on all elements in character
> ##' variables, except when the cells include commas or quotation
> ##' marks.  This function generates CSV files that are, so far as we
> ##' know, exactly the same "quoted style" as MS Excel CSV export
> ##' files.
> ##'
> ##' This works by manually inserting quotation marks where necessary and
> ##' turning FALSE R's own method to insert quotation marks.
> ##' @param x a data frame
> ##' @param file character string for file name
> ##' @param row.names Default FALSE for row.names
> ##' @importFrom utils write.table
> ##' @return the return from write.table, using revised quotes
> ##' @export
> ##' @author Paul Johnson
> ##' @examples
> ##' set.seed(234)
> ##' x1 <- data.frame(x1 = c("a", "b,c", "b", "The \"Washington, DC\""),
> ##'       x2 = rnorm(4), stringsAsFactors = FALSE)
> ##' x1
> ##' fn <- tempfile(pattern = "testcsv", fileext = ".csv")
> ##' writeCSV(x1, file = fn)
> ##' readLines(fn)
> ##' x2 <- read.table(fn, sep = ",", header = TRUE, stringsAsFactors = FALSE)
> ##' all.equal(x1,x2)
> writeCSV <- function(x, file, row.names = FALSE){
>    xischar <- colnames(x)[sapply(x, is.character)]

? Should that be:


    xIsCharOrFactor <- colnames(x)[sapply(x, function(z) is.character(z) | is.factor(z) )]

Just asking. No testing done.
__ 
David.

>    for(jj in xischar){
>        x[ , jj] <- gsub('"', '""', x[ , jj], fixed = TRUE)
>        needsquotes <- grep('[\",]', x[ ,jj])
>        x[needsquotes, jj] <- paste0("\"", x[needsquotes, jj], "\"")
>    }
>    write.table(x, file = file, sep = ",", quote = FALSE,
>                row.names = row.names)
> }
> 
> Output:
> 
>> set.seed(234)
>> x1 <- data.frame(x1 = c("a", "b,c", "b", "The \"Washington, DC\""),
> +        x2 = rnorm(4), stringsAsFactors = FALSE)
>> x1
>                    x1         x2
> 1                    a  0.6607697
> 2                  b,c -2.0529830
> 3                    b -1.4992061
> 4 The "Washington, DC"  1.4712331
>> fn <- tempfile(pattern = "testcsv", fileext = ".csv")
>> writeCSV(x1, file = fn)
>> readLines(fn)
> [1] "x1,x2"
> [2] "a,0.660769736644892"
> [3] "\"b,c\",-2.052983003941"
> [4] "b,-1.49920605110092"
> [5] "\"The \"\"Washington, DC\"\"\",1.4712331168047"
>> x2 <- read.table(fn, sep = ",", header = TRUE, stringsAsFactors = FALSE)
>> all.equal(x1,x2)
> [1] TRUE
> 
> I already see one problem, that I've got no special arrangement for
> column names with commas or quotes. People who want column names with
> those things are even more wrong than the people write a parser that
> can't understand quotes on character variables.
> 
> -- 
> Paul E. Johnson   http://pj.freefaculty.org
> Director, Center for Research Methods and Data Analysis http://crmda.ku.edu
> 
> To write to me directly, please address me at pauljohn at ku.edu.
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   -Gehm's Corollary to Clarke's Third Law



More information about the R-devel mailing list