[R] using sqldf's read.csv.sql to read a file with "NA" for missing
David Reiner
David.Reiner at xrtrading.com
Wed Sep 5 23:40:05 CEST 2012
I see the attachment didn't go through.
here is the text:
2012-07-01,15:50:00,NA,0,NA,0,NA
2012-07-01,15:51:00,NA,0,NA,0,NA
2012-07-01,15:52:00,NA,0,NA,0,NA
2012-07-01,15:53:00,NA,0,NA,0,NA
2012-07-01,15:54:00,NA,0,NA,0,NA
2012-07-01,15:55:00,NA,0,NA,0,NA
2012-07-01,15:56:00,NA,0,NA,0,NA
2012-07-01,15:57:00,NA,0,NA,0,NA
2012-07-01,15:58:00,NA,0,NA,0,NA
2012-07-01,15:59:00,NA,0,NA,0,NA
2012-07-01,16:00:00,1358.5,1,1358.5,4,NA
2012-07-01,16:01:00,NA,0,1358.25,2,NA
2012-07-01,16:02:00,NA,0,1358.25,2,NA
2012-07-01,16:03:00,NA,0,1358.25,2,NA
2012-07-01,16:04:00,NA,0,1358.25,2,NA
2012-07-01,16:05:00,NA,0,1358.25,2,NA
2012-07-01,16:06:00,1358.25,2,1358.25,2,NA
2012-07-01,16:07:00,1358.25,2,1358.25,2,NA
2012-07-01,16:08:00,1358,3,1358,2,NA
2012-07-01,16:09:00,1358.25,2,1358.25,2,NA
2012-07-01,16:10:00,1358.25,2,1358.25,2,NA
2012-07-01,16:11:00,1358.25,2,1358.25,2,NA
2012-07-01,16:12:00,1357.5,3,1357.5,3,NA
2012-07-01,16:13:00,1357.25,10,1357.25,3,1357.25
2012-07-01,16:14:00,NA,0,1357.25,10,1357.125
2012-07-01,16:15:00,1357,20,1357,18,1357
2012-07-01,16:16:00,1357,20,1357,18,1357
2012-07-01,16:17:00,1357,20,1357,18,1357
2012-07-01,16:18:00,1357,20,1357,18,1357
2012-07-01,16:19:00,1357,20,1357,18,1357
2012-07-01,16:20:00,1357,20,1357,18,1357
2012-07-01,16:21:00,1357,20,1357,18,1357
-- David
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of David Reiner
Sent: Wednesday, September 05, 2012 4:32 PM
To: r-help at r-project.org
Subject: [R] using sqldf's read.csv.sql to read a file with "NA" for missing
I'm trying to use sqdf's function read.csv.sql to read CSV files in which the missing values are represented by NA's.
Plain old read.csv works fine on these files, but they are rather large and I'd like to filter using sql-like statements.
However, even if I specify field.types correctly and nrows=-1, it still turns the columns with NA's into chars or 0.
I'm trying to make this OS independent, so I don't think I can use a filter to convert the NA's to NULL's or whatever SQLite would understand.
I can accept it everything has to be read in as char and then convert to doubles with as.numeric, but I'm looking for speed.
Here is code I thought would read the file (I've attached a small sample.)
It almost works if there are no NA's in the initial rows, but it still turns NA's into 0's instead of NA or something I can change into NA;
and it returns characters if there are NA's in the initial rows.
(0 is a possible value so I can't filter out the 0's.)
field.types <- list(V1='char', V2='char', V3='real', V4='int', V5='real', V6='int', V7='real')
dtst <- read.csv.sql("./tmp.csv", header=FALSE, field.types=field.types, nrows=-1)
str(dtst)
'data.frame': 32 obs. of 7 variables:
$ V1: chr "2012-07-01" "2012-07-01" "2012-07-01" "2012-07-01" ...
$ V2: chr "15:50:00" "15:51:00" "15:52:00" "15:53:00" ...
$ V3: chr "NA" "NA" "NA" "NA" ...
$ V4: int 0 0 0 0 0 0 0 0 0 0 ...
$ V5: chr "NA" "NA" "NA" "NA" ...
$ V6: int 0 0 0 0 0 0 0 0 0 0 ...
$ V7: chr "NA" "NA" "NA" "NA" ...
Any suggestions?
Thanks,
-- David L. Reiner
This e-mail and any materials attached hereto, including, without limitation, all content hereof and thereof (collectively, "XR Content") are confidential and proprietary to XR Trading, LLC ("XR") and/or its affiliates, and are protected by intellectual property laws. Without the prior written consent of XR, the XR Content may not (i) be disclosed to any third party or (ii) be reproduced or otherwise used by anyone other than current employees of XR or its affiliates, on behalf of XR or its affiliates.
THE XR CONTENT IS PROVIDED AS IS, WITHOUT REPRESENTATIONS OR WARRANTIES OF ANY KIND. TO THE MAXIMUM EXTENT PERMISSIBLE UNDER APPLICABLE LAW, XR HEREBY DISCLAIMS ANY AND ALL WARRANTIES, EXPRESS AND IMPLIED, RELATING TO THE XR CONTENT, AND NEITHER XR NOR ANY OF ITS AFFILIATES SHALL IN ANY EVENT BE LIABLE FOR ANY DAMAGES OF ANY NATURE WHATSOEVER, INCLUDING, BUT NOT LIMITED TO, DIRECT, INDIRECT, CONSEQUENTIAL, SPECIAL AND PUNITIVE DAMAGES, LOSS OF PROFITS AND TRADING LOSSES, RESULTING FROM ANY PERSON'S USE OR RELIANCE UPON, OR INABILITY TO USE, ANY XR CONTENT, EVEN IF XR IS ADVISED OF THE POSSIBILITY OF SUCH DAMAGES OR IF SUCH DAMAGES WERE FORESEEABLE.
More information about the R-help
mailing list