[R-SIG-Finance] Older financials?

Rex Macey rex at macey.us
Sat Nov 28 18:20:29 CET 2015


A suggestion on where to get extensive fundamental data cheaply.
This is a response to Mark's Nov 23rd message.

Consider data from the American Association of Individual Investors’ 
Stock Investor Pro (SIP) software. I’ve had a lifetime membership to the 
AAII for many years. For the additional, but more than reasonable, price 
of $198/yr, one can license SIP. What makes this source valuable is that 
it is survivorship-bias-free historical data. Subscribers have access to 
the old software and data as it was when it was distributed, going back 
to 2003. The data include balance sheet, income statement, cash flow, 
price, and many calculated fields. The list of fields 
<https://www.aaii.com/files/sipro/Stock%20Investor%20Pro%20Field%20List.pdf> runs 
to 22 pages. In 2003, over 8,500 companies were covered.
For info on SIP, check out the AAII <http://www.aaii.com> webpage and 
this presentation 
<http://www.aaii.com/files/presentations/2011/20%20Joe%20Lan%20-%20Introduction%20to%20Stock%20Investor%20Pro.pdf>. 
I downloaded about 150 install files from the AAII archives 
<http://www.aaii.com/stock-investor-pro/archives> page; access to it 
requires membership ($29) and a subscription. I installed them one 
by one, putting each into its own directory. I downloaded the month-end 
updates, though weekly data was sometimes available. I watched an entire 
season of Friends while doing this and probably lost three IQ points. 
Each install includes about 7 years of annual data and 8 quarters of 
quarterly data.
The AAII data files are in FoxPro/DBF format. Fortunately, R has the 
read.dbf 
<https://stat.ethz.ch/R-manual/R-devel/library/foreign/html/read.dbf.html> function 
in the foreign package to handle this.
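
For anyone who wants a starting point, here is a minimal sketch of 
loading one install with read.dbf. The function name read_install and 
the idea of simply reading every .dbf file under a directory are my 
assumptions for illustration; I am not describing SIP's actual folder 
layout or table names, so adjust to what you find on disk.

```r
library(foreign)  # provides read.dbf()

# Read every DBF table found under one install directory into a named list.
# `install_dir` is a hypothetical path; SIP's real layout may differ.
read_install <- function(install_dir) {
  dbf_files <- list.files(install_dir, pattern = "\\.dbf$",
                          ignore.case = TRUE, recursive = TRUE,
                          full.names = TRUE)
  # as.is = TRUE keeps character fields as character instead of factors
  tables <- lapply(dbf_files, read.dbf, as.is = TRUE)
  # Name each table after its file, e.g. "demo.dbf" becomes "demo"
  names(tables) <- tolower(tools::file_path_sans_ext(basename(dbf_files)))
  tables
}
```

Calling read_install() once per install directory gives one list of 
data frames per release, which keeps each point-in-time snapshot 
separate from the others.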

Let me emphasize that this data is (almost) free of survivorship and 
look-ahead biases.  You are getting the data as SIP released it back in 
the day.  So companies that were around in 2003 but are no longer 
around are in the data set.  Each release contains only data that was 
available at the time of the release, so there is no look-ahead 
problem.  I added "almost" to cover two caveats.  First, if you use the 
first install (end of 2002) to figure out which companies had P/Es less 
than X in 2001, you've got a survivorship bias problem, because that 
install only covers companies still in the data at the end of 2002.  
Second, the SIP data is available pretty much at month end, but you 
won't be able to trade at month-end.  If you assume that you can, you 
have a look-ahead bias.
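
To make the two caveats concrete, here is a rough sketch in R. It 
assumes you have already read the same table from several installs into 
a list named by release date; the helper names (stack_snapshots, 
first_tradable), the "snapshot" column, and the assumption that every 
release has identical columns are all mine, not anything SIP provides. 
The idea is to tag every row with the release it came from, screen only 
within a release, and treat the first trading day after the release as 
the earliest possible trade date.

```r
# Stack one table from several installs, tagging rows with the release date.
# `snapshots` is a named list of data frames; names are "YYYY-MM-DD" strings.
# Assumes (hypothetically) that every release has the same columns.
stack_snapshots <- function(snapshots) {
  tagged <- Map(function(df, d) {
    df$snapshot <- rep(as.Date(d), nrow(df))
    df
  }, snapshots, names(snapshots))
  panel <- do.call(rbind, tagged)
  rownames(panel) <- NULL
  panel
}

# Earliest date a release could be acted on: the first trading day strictly
# after the release date. Returns NA if no later trading day is supplied.
first_tradable <- function(snapshot, trading_days) {
  trading_days <- sort(trading_days)
  # findInterval() counts the trading days on or before the release date
  idx <- findInterval(as.numeric(snapshot), as.numeric(trading_days)) + 1L
  trading_days[idx]
}
```

A screen for a given month would then filter on rows where snapshot 
equals that month's release date, rather than using a later install to 
look back in time.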

Weekly data is available beginning in 2005.

I hope this helps.  If you use SIP and find data errors, I'd like to 
know about them.
