[R-SIG-Finance] Does getSymbols.csv() take a "format" argument?

Stergios Marinopoulos stergios_marinopoulos at yahoo.com
Sun Mar 4 07:06:29 CET 2012


I am having trouble getting getSymbols() to work with a CSV file.  I am not sure
whether this is a bug report or a gap in my understanding of how to pass the
format argument to getSymbols.csv().  Please read on.

I am running R 2.14.1 on Windows XP, with xts 0.8-2 and zoo 1.7-6.

I noticed the docs mention that this has not been tested on Windows, so I went
ahead and dug into the problem a little bit.

getSymbols() reads the file in, but the date for every row is set to today's
date.  The data I am working with is from 1999, so that's the heart of the
problem: why is the date getting lost?

Here is a log of my work, mostly reproducible:

# Data as found in file EURUSD.csv: (see attached for full file)
#
# Date,Open,High,Low,Close,Volume,Adjusted
# 1999-01-04,1.174,1.189,1.174,1.177,1,0
# 1999-01-05,1.183,1.183,1.175,1.177,1,0
#  .... rows snipped
# 1999-01-28,1.1441,1.1468,1.1383,1.1412,1,0
# 1999-01-29,1.1413,1.143,1.1343,1.1357,1,0

library(quantmod) ;
setwd("C:/storage") ; # set as necessary
getSymbols("EURUSD", src='csv') ;


# How did the dates get mangled???
tail(EURUSD) 

           EURUSD.Open EURUSD.High EURUSD.Low EURUSD.Close EURUSD.Volume EURUSD.Adjusted
2012-03-03      1.1593      1.1625     1.1562       1.1585             1               0
2012-03-03      1.1585      1.1609     1.1549       1.1549             1               0
2012-03-03      1.1551      1.1588     1.1551       1.1560             1               0
2012-03-03      1.1561      1.1561     1.1411       1.1423             1               0
2012-03-03      1.1441      1.1468     1.1383       1.1412             1               0
2012-03-03      1.1413      1.1430     1.1343       1.1357             1               0




So I dug into getSymbols.csv() a little bit and managed to get the data file
loaded fine by parroting the guts of getSymbols.csv() as follows:



############################################################
## THIS WORKS
library(quantmod) ;
setwd("C:/storage") ; # set as necessary
EURUSD <-read.csv("C:/storage/EURUSD.csv") ;


EURUSD <- xts(EURUSD[, -1], as.Date(EURUSD[, 1]) , src = "csv", updated = Sys.time())


colnames(EURUSD) <- paste(toupper(gsub("\\^", "", "EURUSD"  )), 
            c("Open", "High", "Low", "Close", "Volume", "Adjusted"), 
            sep = ".")

> tail(EURUSD)
           EURUSD.Open EURUSD.High EURUSD.Low EURUSD.Close EURUSD.Volume EURUSD.Adjusted
1999-01-22      1.1593      1.1625     1.1562       1.1585             1               0
1999-01-25      1.1585      1.1609     1.1549       1.1549             1               0
1999-01-26      1.1551      1.1588     1.1551       1.1560             1               0
1999-01-27      1.1561      1.1561     1.1411       1.1423             1               0
1999-01-28      1.1441      1.1468     1.1383       1.1412             1               0
1999-01-29      1.1413      1.1430     1.1343       1.1357             1               0

Now EURUSD looks as expected.

############################################################


One difference between my parroted code and that found in getSymbols.csv() is
that I did not pass the format argument to as.Date() when constructing the xts
object from the data.frame.

getSymbols.csv() uses this code:

fr <- xts(fr[, -1], as.Date(fr[, 1], format = format, 
    ..., origin = "1970-01-01"), src = "csv", updated = Sys.time())


while I did this:

EURUSD <- xts(EURUSD[, -1], as.Date(EURUSD[, 1]) , src = "csv", updated = Sys.time())

From what I could follow in the getSymbols.csv() code, the format is getting
set to the empty string "".

Now if I plug an empty format into my code as follows, I get the same screwy
munging of the dates:

EURUSD <- xts(EURUSD[, -1], as.Date(EURUSD[, 1], format = "",
            origin = "1970-01-01"), src = "csv", updated = Sys.time())


> tail(EURUSD)
             Open   High    Low  Close Volume Adjusted
2012-03-03 1.1593 1.1625 1.1562 1.1585      1        0
2012-03-03 1.1585 1.1609 1.1549 1.1549      1        0
2012-03-03 1.1551 1.1588 1.1551 1.1560      1        0
2012-03-03 1.1561 1.1561 1.1411 1.1423      1        0
2012-03-03 1.1441 1.1468 1.1383 1.1412      1        0
2012-03-03 1.1413 1.1430 1.1343 1.1357      1        0

Ah, ha! Seems like format is the key.
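
The same effect can be checked in base R alone, without quantmod or xts. The
sketch below is only an illustration: it assumes that as.Date() on a character
vector hands an explicitly supplied format straight to strptime(), and that
strptime() fills date fields the format does not specify from the current
date (as its help page describes). The variable names are made up for the
example.

```r
x <- c("1999-01-04", "1999-01-05")

# format not supplied at all: as.Date() tries the standard ISO-style formats
no_format <- as.Date(x)

# explicit, matching format
iso_format <- as.Date(x, format = "%Y-%m-%d")

# explicit empty format: nothing in the string is parsed, so the result
# depends on how strptime() fills in the unspecified year, month and day
empty_format <- as.Date(x, format = "")

print(no_format)
print(iso_format)
print(empty_format)  # compare against Sys.Date()
```

If empty_format prints the current date for every element, that matches the
mangled index seen above and points at the format = "" default rather than
at the CSV file.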

And if I pass a proper format (or no format at all) to as.Date() inside the
xts call, it seems to work as expected:

EURUSD <- xts(EURUSD[, -1], as.Date(EURUSD[, 1], format = "%Y-%m-%d",
            origin = "1970-01-01"), src = "csv", updated = Sys.time())

> tail(EURUSD)
             Open   High    Low  Close Volume Adjusted
1999-01-22 1.1593 1.1625 1.1562 1.1585      1        0
1999-01-25 1.1585 1.1609 1.1549 1.1549      1        0
1999-01-26 1.1551 1.1588 1.1551 1.1560      1        0
1999-01-27 1.1561 1.1561 1.1411 1.1423      1        0
1999-01-28 1.1441 1.1468 1.1383 1.1412      1        0
1999-01-29 1.1413 1.1430 1.1343 1.1357      1        0


That looks good.
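
As a stopgap, the manual steps above could be wrapped in a small function. This
is a sketch only: read_ohlc_csv() and its date_format argument are hypothetical
names, not part of quantmod, and it assumes the first CSV column holds the
dates and every other column is numeric.

```r
library(xts)

# Hypothetical helper (not part of quantmod): read an OHLC CSV file and
# build an xts object, always parsing dates with an explicit format.
read_ohlc_csv <- function(file, symbol, date_format = "%Y-%m-%d") {
  raw <- read.csv(file, stringsAsFactors = FALSE)
  dates <- as.Date(raw[[1]], format = date_format)
  if (any(is.na(dates))) {
    stop("some dates did not match date_format")
  }
  out <- xts(as.matrix(raw[, -1]), order.by = dates)
  # Prefix each column with the upper-cased symbol, e.g. EURUSD.Open
  colnames(out) <- paste(toupper(symbol), colnames(raw)[-1], sep = ".")
  out
}

# Usage with two rows copied from the sample data, written to a temp file
tmp <- tempfile(fileext = ".csv")
writeLines(c("Date,Open,High,Low,Close,Volume,Adjusted",
             "1999-01-04,1.174,1.189,1.174,1.177,1,0",
             "1999-01-05,1.183,1.183,1.175,1.177,1,0"), tmp)
eurusd <- read_ohlc_csv(tmp, "EURUSD")
print(eurusd)
```

Stopping on unparsed dates is a design choice for the sketch: a wrong format
then fails loudly instead of silently producing a bad index.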

One last note.  I tried adding the format argument to getSymbols(), thinking it
would get passed along to getSymbols.csv(), but the following error message was
produced:

getSymbols("EURUSD", src='csv', format = "%Y-%m-%d" ) ;

Error in list(...)[["format"]] <- NULL : 
  '...' used in an incorrect context
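
The failing expression in the message assigns into list(...) directly. A common
way to drop one argument from the dots before forwarding the rest is to capture
them in a local variable first. The sketch below illustrates that pattern only:
parse_dates() is a hypothetical helper, not quantmod code, and treating an
empty format as "not supplied" is an assumption about the intended behaviour.

```r
# Hypothetical helper (not part of quantmod): parse character dates,
# honouring an optional `format` passed through `...`.
parse_dates <- function(x, ...) {
  dots <- list(...)          # capture the dots once in an ordinary list
  fmt <- dots[["format"]]    # NULL when no format was supplied
  dots[["format"]] <- NULL   # drop it so it is not forwarded twice
  if (is.null(fmt) || identical(fmt, "")) {
    # No usable format: let as.Date() try its standard formats
    do.call(as.Date, c(list(x), dots))
  } else {
    do.call(as.Date, c(list(x, format = fmt), dots))
  }
}

print(parse_dates("1999-01-04"))
print(parse_dates("04/01/1999", format = "%d/%m/%Y"))
print(parse_dates("1999-01-04", format = ""))
```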


 
Any guidance offered would be greatly appreciated.


--
Stergios Marinopoulos
-------------- next part --------------
A non-text attachment was scrubbed...
Name: EURUSD.csv
Type: application/octet-stream
Size: 941 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-finance/attachments/20120303/2d33d959/attachment.obj>