[R] string parsing

Sam Steingold sds at gnu.org
Tue Feb 15 23:20:11 CET 2011


I am trying to get stock metadata from Yahoo Finance (or maybe there is
a better source?).
Here is what I have done so far:

yahoo.url <- "http://finance.yahoo.com/d/quotes.csv?f=j1jka2&s="
stocks <- c("IBM","NOIZ","MSFT","LNN","C","BODY","F") # just some samples
socket <- url(paste(yahoo.url, paste(stocks, collapse="+"), sep=""), open="r")
data <- read.csv(socket, header = FALSE)
close(socket)
data is now:
       V1     V2     V3        V4
1  200.5B 116.00 166.25   4965150
2   19.1M   3.75   5.47      8521
3  226.6B  22.73  31.58  57127000
4  886.4M  30.80  74.54    226690
5  142.4B   3.21   5.15 541804992
6  276.4M  11.98  21.30    149656
7 55.823B   9.75  18.97  89369000
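
For anyone who wants to reproduce the table above without the network call, here is a minimal sketch that rebuilds it from inline text. The column names are an assumption about what the f=j1jka2 tags stand for (market capitalization, 52-week low, 52-week high, average daily volume); they are not stated in the post. The name `quotes` is likewise hypothetical. Passing stringsAsFactors = FALSE keeps the first column as character, which is what a later string conversion would want.

```r
# Sample rows copied from the output above; in the post they come from the URL.
csv.text <- "200.5B,116.00,166.25,4965150
19.1M,3.75,5.47,8521
226.6B,22.73,31.58,57127000
886.4M,30.80,74.54,226690
142.4B,3.21,5.15,541804992
276.4M,11.98,21.30,149656
55.823B,9.75,18.97,89369000"
stocks <- c("IBM", "NOIZ", "MSFT", "LNN", "C", "BODY", "F")

# Column names are assumed for illustration, not taken from the post.
quotes <- read.csv(text = csv.text, header = FALSE,
                   col.names = c("market.cap", "low.52w",
                                 "high.52w", "avg.volume"),
                   stringsAsFactors = FALSE)

# Label each row with its ticker so rows can be looked up by name.
rownames(quotes) <- stocks
```

With the tickers as row names, a single value can be read as quotes["F", "low.52w"].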

Now I need to do this:

--> convert 55.823B to 55.823e9 and 19.1M to 19.1e6

parse.num <- function (s) as.numeric(gsub("M$", "e6", gsub("B$", "e9", s)))
data[1] <- lapply(data[1], parse.num)

This seems awfully inefficient (two regexp substitutions);
is there a better way?
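
One possible alternative, sketched under some assumptions: split off the last character and look it up in a table of multipliers, so no regular expression is involved. The function name `parse.suffix` is hypothetical, and the K and T entries are guesses at other suffixes the feed might use; only M and B appear in the data above. This is not claimed to be faster than the gsub version, only easier to extend.

```r
# Hypothetical helper: convert strings such as "55.823B" to numbers
# by looking up the trailing letter in a multiplier table.
parse.suffix <- function(s) {
  s <- as.character(s)                      # also handles factor input
  mult <- c(K = 1e3, M = 1e6, B = 1e9, T = 1e12)  # K and T are assumed
  last <- substring(s, nchar(s))            # last character of each string
  has.suffix <- last %in% names(mult)
  # Drop the suffix where there is one, then convert the numeric part.
  num <- as.numeric(ifelse(has.suffix, substring(s, 1, nchar(s) - 1), s))
  # Scale by the matching multiplier, or by 1 when there is no suffix.
  num * ifelse(has.suffix, mult[last], 1)
}

caps <- parse.suffix(c("55.823B", "19.1M", "3.75"))
```

Because every step is vectorised, the function can be applied to a whole column at once, e.g. data$V1 <- parse.suffix(data$V1).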

--> iterate over stocks & data at the same time and put the results into
a hash table:
cache <- new.env() # assumption: cache has to exist before the loop; an environment is used here
for (i in seq_along(stocks)) cache[[stocks[i]]] <- data[i, ]

I do get the right results,
but I am wondering if I am doing it "the right R way".
E.g., the hash table value is a data frame.
A structure (record?) seems more appropriate.
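
Two ways this might be done without an explicit hash table, shown as a sketch on a three-row stand-in for the real data (the values are copied from the table above, already converted to numbers). The name `records` is hypothetical. The first option keeps a single data frame and uses the tickers as row names; the second builds a named list of plain lists, which is probably the closest thing in base R to a record.

```r
# Small stand-in for the downloaded data, with V1 already numeric.
stocks <- c("IBM", "MSFT", "F")
data <- data.frame(V1 = c(200.5e9, 226.6e9, 55.823e9),
                   V2 = c(116.00, 22.73, 9.75),
                   V3 = c(166.25, 31.58, 18.97),
                   V4 = c(4965150, 57127000, 89369000))

# Option 1: keep one data frame and look rows up by ticker.
rownames(data) <- stocks
msft.high <- data["MSFT", "V3"]

# Option 2: a named list of plain lists ("records"), one per ticker.
records <- lapply(seq_along(stocks), function(i) as.list(data[i, ]))
names(records) <- stocks
f.low <- records[["F"]]$V2
```

Option 1 avoids the loop entirely and keeps column-wise operations cheap; option 2 is closer to the per-stock lookup in the loop above, with each value being a list rather than a one-row data frame.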

thanks!

-- 
Sam Steingold (http://sds.podval.org/) on CentOS release 5.3 (Final)