[R-SIG-Finance] Does getSymbols.csv() take a "format" argument?
Stergios Marinopoulos
stergios_marinopoulos at yahoo.com
Sun Mar 4 07:06:29 CET 2012
I am having trouble getting getSymbols() to work with a CSV file. I am not sure
if this is a bug report, or my lack of understanding of how to pass the format
argument to getSymbols.csv(). Please read on.
I am running R-2.14.1 on WinXP, with xts 0.8-2 and zoo 1.7-6.
I noticed the docs mention this not being tested on Windows, so I went ahead
and dug into the problem a little bit.
getSymbols() reads the file in, but the dates for every row are all set to
today's date. The data I am working with is from 1999, so that's the heart
of the problem: why is the date getting lost?
Here is a log of my work, mostly reproducible:
# Data as found in file EURUSD.csv: (see attached for full file)
#
# Date,Open,High,Low,Close,Volume,Adjusted
# 1999-01-04,1.174,1.189,1.174,1.177,1,0
# 1999-01-05,1.183,1.183,1.175,1.177,1,0
# .... rows snipped
# 1999-01-28,1.1441,1.1468,1.1383,1.1412,1,0
# 1999-01-29,1.1413,1.143,1.1343,1.1357,1,0
library(quantmod) ;
setwd("C:/storage") ; # set as necessary
getSymbols("EURUSD", src='csv') ;
# How did the dates get mangled???
tail(EURUSD)
EURUSD.Open EURUSD.High EURUSD.Low EURUSD.Close EURUSD.Volume EURUSD.Adjusted
2012-03-03 1.1593 1.1625 1.1562 1.1585 1 0
2012-03-03 1.1585 1.1609 1.1549 1.1549 1 0
2012-03-03 1.1551 1.1588 1.1551 1.1560 1 0
2012-03-03 1.1561 1.1561 1.1411 1.1423 1 0
2012-03-03 1.1441 1.1468 1.1383 1.1412 1 0
2012-03-03 1.1413 1.1430 1.1343 1.1357 1 0
So I dug into getSymbols.csv() a little bit and managed to get the data file
loaded fine by parroting the guts of getSymbols.csv() as follows:
############################################################
## THIS WORKS
library(quantmod) ;
setwd("C:/storage") ; # set as necessary
EURUSD <-read.csv("C:/storage/EURUSD.csv") ;
EURUSD <- xts(EURUSD[, -1], as.Date(EURUSD[, 1]) , src = "csv", updated = Sys.time())
colnames(EURUSD) <- paste(toupper(gsub("\\^", "", "EURUSD" )),
c("Open", "High", "Low", "Close", "Volume", "Adjusted"),
sep = ".")
> tail(EURUSD)
EURUSD.Open EURUSD.High EURUSD.Low EURUSD.Close EURUSD.Volume EURUSD.Adjusted
1999-01-22 1.1593 1.1625 1.1562 1.1585 1 0
1999-01-25 1.1585 1.1609 1.1549 1.1549 1 0
1999-01-26 1.1551 1.1588 1.1551 1.1560 1 0
1999-01-27 1.1561 1.1561 1.1411 1.1423 1 0
1999-01-28 1.1441 1.1468 1.1383 1.1412 1 0
1999-01-29 1.1413 1.1430 1.1343 1.1357 1 0
Now EURUSD looks as expected.
############################################################
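For anyone without the attachment, here is a minimal base-R sketch of the same manual approach. It uses only the four rows quoted at the top of this message, written to a temporary file. The temporary file and the variable names are assumptions for illustration, and the xts() step from the working code is left out so that the sketch runs without extra packages.

```r
# The four sample rows quoted earlier in this message; not the full EURUSD.csv.
csv_lines <- c(
  "Date,Open,High,Low,Close,Volume,Adjusted",
  "1999-01-04,1.174,1.189,1.174,1.177,1,0",
  "1999-01-05,1.183,1.183,1.175,1.177,1,0",
  "1999-01-28,1.1441,1.1468,1.1383,1.1412,1,0",
  "1999-01-29,1.1413,1.143,1.1343,1.1357,1,0"
)

# Write them to a temporary file (hypothetical stand-in for C:/storage/EURUSD.csv).
csv_path <- tempfile(fileext = ".csv")
writeLines(csv_lines, csv_path)

# Read the file and convert the first column without passing 'format' at all.
raw <- read.csv(csv_path, stringsAsFactors = FALSE)
dates <- as.Date(raw[, 1])

# Attach the parsed dates as row names to check them against the prices.
prices <- raw[, -1]
rownames(prices) <- format(dates)
print(prices)
```

The dates printed as row names should be the 1999 dates from the file, matching what the xts() version above shows.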
One difference between my parroted code and that found in getSymbols.csv() is that I did
not use the format argument when constructing the xts object from the data.frame.
getSymbols.csv() uses this code:
fr <- xts(fr[, -1], as.Date(fr[, 1], format = format,
..., origin = "1970-01-01"), src = "csv", updated = Sys.time())
while I did this:
EURUSD <- xts(EURUSD[, -1], as.Date(EURUSD[, 1]) , src = "csv", updated = Sys.time())
From what I could follow in the getSymbols.csv() code, the format is getting
set to the empty string "".
Now if I plug an empty format into my code as follows, I get the same mangled dates:
EURUSD <- xts(EURUSD[, -1], as.Date(EURUSD[, 1], format = "",
origin = "1970-01-01"), src = "csv", updated = Sys.time())
> tail(EURUSD)
Open High Low Close Volume Adjusted
2012-03-03 1.1593 1.1625 1.1562 1.1585 1 0
2012-03-03 1.1585 1.1609 1.1549 1.1549 1 0
2012-03-03 1.1551 1.1588 1.1551 1.1560 1 0
2012-03-03 1.1561 1.1561 1.1411 1.1423 1 0
2012-03-03 1.1441 1.1468 1.1383 1.1412 1 0
2012-03-03 1.1413 1.1430 1.1343 1.1357 1 0
Ah, ha! Seems like format is the key.
And if I pass a proper format (or no format at all) to as.Date() inside the xts()
call, it works as expected:
EURUSD <- xts(EURUSD[, -1], as.Date(EURUSD[, 1], format = "%Y-%m-%d",
origin = "1970-01-01"), src = "csv", updated = Sys.time())
> tail(EURUSD)
Open High Low Close Volume Adjusted
1999-01-22 1.1593 1.1625 1.1562 1.1585 1 0
1999-01-25 1.1585 1.1609 1.1549 1.1549 1 0
1999-01-26 1.1551 1.1588 1.1551 1.1560 1 0
1999-01-27 1.1561 1.1561 1.1411 1.1423 1 0
1999-01-28 1.1441 1.1468 1.1383 1.1412 1 0
1999-01-29 1.1413 1.1430 1.1343 1.1357 1 0
That looks good.
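The three cases can be compared on a single date string without quantmod or xts. This is a minimal sketch: parse_dates() is a hypothetical helper, not part of quantmod, and only illustrates one way to avoid forwarding an empty format to as.Date().

```r
x <- "1999-01-04"

# 'format' not supplied: as.Date() falls back to its standard formats.
no_format <- as.Date(x)

# Explicit format matching the file.
explicit_format <- as.Date(x, format = "%Y-%m-%d")

# Empty format string, as getSymbols.csv() appears to pass.
empty_format <- as.Date(x, format = "")

print(no_format)
print(explicit_format)
print(empty_format)  # in the log above, this case came back as the current date

# Hypothetical guard: only forward 'format' when it is non-empty.
parse_dates <- function(x, format = "") {
  if (is.null(format) || !nzchar(format)) {
    as.Date(x)
  } else {
    as.Date(x, format = format)
  }
}

print(parse_dates(x))
print(parse_dates(x, "%Y-%m-%d"))
```

The distinction seems to be between leaving format out entirely and passing it explicitly as "": only the first lets as.Date() try its default formats.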
One last note: I tried adding the format argument to getSymbols(), thinking it would get passed along to getSymbols.csv(), but the following error message was produced:
getSymbols("EURUSD", src='csv', format = "%Y-%m-%d" ) ;
Error in list(...)[["format"]] <- NULL :
'...' used in an incorrect context
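The call shown in the error message suggests the failure happens while the format entry is being dropped from the dots, before getSymbols.csv() is ever reached. Here is a minimal sketch of that pattern, assuming this is indeed the cause. Both function names are hypothetical and are not quantmod code.

```r
# Mirrors the expression shown in the error: assigning into list(...) directly.
drop_format_bad <- function(...) {
  list(...)[["format"]] <- NULL
}

# Copy the dots into a named variable first, then modify the copy.
drop_format_ok <- function(...) {
  dots <- list(...)
  dots[["format"]] <- NULL
  dots
}

# Capture the error message instead of stopping the script.
result <- tryCatch(
  drop_format_bad(src = "csv", format = "%Y-%m-%d"),
  error = function(e) conditionMessage(e)
)
print(result)

# The copy-then-modify version returns the remaining arguments.
print(names(drop_format_ok(src = "csv", format = "%Y-%m-%d")))
```

A replacement-style assignment needs a named variable as its target, which is why copying the dots into a local list first avoids the problem.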
Any guidance offered would be greatly appreciated.
--
Stergios Marinopoulos
-------------- next part --------------
A non-text attachment was scrubbed...
Name: EURUSD.csv
Type: application/octet-stream
Size: 941 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-finance/attachments/20120303/2d33d959/attachment.obj>