[Rd] extracting tables from web pages?

Rui Barradas ruipbarradas at sapo.pt
Thu Apr 25 23:36:22 CEST 2013


Hello,

The following seems to work.

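library(XML)

# Download the page and parse its first table (which=1),
# treating the first row as the column header
url <- "http://house.gov/representatives/"
dat <- readHTMLTable(readLines(url), which=1, header=TRUE)

# Inspect the structure and the first few rows of the result
str(dat)
head(dat)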
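
If it helps to see what readHTMLTable is doing without depending on the live page, here is a minimal offline sketch. The HTML snippet, the names in it, and the variable names are made up for illustration; only htmlParse() and readHTMLTable() come from the XML package. It assumes the table's first row is made of <th> cells, which may not match the real page.

```r
library(XML)

# Hypothetical HTML standing in for a downloaded page (not real data)
html <- '<html><body>
<table>
<tr><th>Name</th><th>State</th></tr>
<tr><td>Alice Example</td><td>AK</td></tr>
<tr><td>Bob Example</td><td>AL</td></tr>
</table>
</body></html>'

# Parse the string as HTML rather than treating it as a file name
doc <- htmlParse(html, asText = TRUE)

# Without 'which', readHTMLTable returns a list with one entry per <table>
tabs <- readHTMLTable(doc, header = TRUE, stringsAsFactors = FALSE)

# Pick the first table, as which=1 would do
dat <- tabs[[1]]
str(dat)
```

Calling readHTMLTable without which first and looking at the length of the returned list is a quick way to find out which table on a page is the one you want.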


Hope this helps,

Rui Barradas

On 25-04-2013 21:00, Spencer Graves wrote:
> Hello:
>
>
>        What tools would you recommend for extracting the table of
> members of the US House of Representatives from
> "http://house.gov/representatives/" and
> "http://en.wikipedia.org/wiki/List_of_current_members_of_the_United_States_House_of_Representatives_by_age"?
>
>
>
>        I started writing something using getURL{RCurl}.  However, I'm
> getting bogged down manually selecting character sequences to search for
> and split on.
>
>
>        Thanks,
>        Spencer Graves


