[R] How to import and create time series data frames in an efficient way?

Nhan La |@th@nhnh@n @end|ng |rom gm@||@com
Fri Nov 15 00:04:43 CET 2019


I have many separate data files in csv format containing daily stock
prices. Spanning a few years, there are hundreds of these files, each
named after the date of the data it records.

In each file there are variables of ticker (or stock trading code), date,
open price, high price, low price, close price, and trading volume. For
example, inside a data file named 20150128.txt it looks like this:

FB,20150128,1.075,1.075,0.97,0.97,725221
AAPL,20150128,2.24,2.24,2.2,2.24,63682
AMZN,20150128,0.4,0.415,0.4,0.415,194900
NFLX,20150128,50.19,50.21,50.19,50.19,761845
GOOGL,20150128,1.62,1.645,1.59,1.63,684835
... and many more ...

In case it's relevant, the number of stocks in these files is not
necessarily the same (so there will be missing data). I need to import the
files and create 5 separate time series data frames from them, one each for
Open, High, Low, Close and Volume. In each data frame, rows are indexed by
date and columns by ticker. For example, the data frame Open may look like
this:

DATE,FB,AAPL,AMZN,NFLX,GOOGL,...
20150128,1.5,2.2,0.4,5.1,1.6,...
20150129,NA,2.3,0.5,5.2,1.7,...
...

What would be an efficient way to do that? I've used the following code to
read the files into a list of data frames, but I don't know what to do
next.

# list.files() takes a regular expression, not a shell glob
files <- list.files(pattern = "\\.txt$")
mydata <- lapply(files, read.csv, header = FALSE)
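
A minimal sketch of one possible next step, using base R only: stack the per-file data frames into one long data frame, then use tapply() to build a date-by-ticker table for each field. The names cols, long, to_wide and wide, the temporary directory, and the 20150129 sample row are all hypothetical and exist only so the sketch runs on its own. It assumes every file has the seven columns described above and at most one row per ticker per date.

```r
# Hypothetical sample files in a temporary directory (illustration only).
tmp <- file.path(tempdir(), "daily_prices_demo")
dir.create(tmp, showWarnings = FALSE)
writeLines(c("FB,20150128,1.075,1.075,0.97,0.97,725221",
             "AAPL,20150128,2.24,2.24,2.2,2.24,63682"),
           file.path(tmp, "20150128.txt"))
# Made-up second day in which FB is missing.
writeLines("AAPL,20150129,2.3,2.35,2.25,2.3,70000",
           file.path(tmp, "20150129.txt"))

cols  <- c("ticker", "date", "open", "high", "low", "close", "volume")
files <- list.files(tmp, pattern = "\\.txt$", full.names = TRUE)

# Read every file and stack the rows into one long data frame.
# Ticker and date are kept as character; the five fields are numeric.
long <- do.call(rbind, lapply(
  files, read.csv, header = FALSE, col.names = cols,
  colClasses = c("character", "character", rep("numeric", 5))
))

# One date-by-ticker table per field; absent date/ticker pairs become NA.
to_wide <- function(field) {
  m <- tapply(long[[field]], list(long$date, long$ticker), function(x) x[1])
  as.data.frame(m)
}
fields <- cols[3:7]
wide <- lapply(setNames(fields, fields), to_wide)

wide$open  # rows are dates (as row names), columns are tickers
```

Each element of wide (wide$open, wide$high, and so on) is a data frame with dates as row names and tickers as columns. For very large inputs a dedicated reshaping or time series package may be faster; this sketch only shows the shape of the transformation.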

Thanks,

Nathan

Disclaimer: In case it's relevant, this question is also posted on
Stack Overflow.
