[R] Extract Data from Website Tables
Jennifer Young
jennifer.a.m.young at gmail.com
Sat Mar 22 18:17:41 CET 2014
Hi Doran
I'm also trying to scrape the leaderboard data. Did you happen to figure
out how to extract each athlete's team/affiliate? I'm trying to write a bit
of code to figure out which teams will qualify when individuals are removed.
On Sunday, March 2, 2014 2:34:21 PM UTC-5, Doran, Harold wrote:
>
> This is fantastic, thank you. I've modified the code to loop through all
> the pages and grab all rows of the HTML table.
>
> Thank you, Rui.
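
For anyone following along, here is a minimal sketch of what such a loop over pages might look like. It is not Harold's actual code. It reuses the query string and cleaning steps from Rui's example further down and assumes that only the `page` parameter changes between pages and that numbering starts at 0, as in Rui's URL. The names `base_url`, `page_url`, `read_page`, `read_pages` and `n_pages` are made up for illustration.

```r
# Query string from Rui's example, with the page number left as %d.
base_url <- paste0(
  "http://games.crossfit.com/scores/leaderboard.php?stage=1&sort=0&division=1",
  "&region=0&numberperpage=60&page=%d&competition=0&frontpage=0&expanded=0",
  "&full=1&year=14&showtoggles=0&hidedropdowns=0&showathleteac=1&athletename=")

# Build the URL for one page (assumption: pages are numbered from 0).
page_url <- function(page) sprintf(base_url, page)

# Read and clean the first HTML table on one page, as in Rui's code.
# XML:: is used so the package is only needed when a page is actually read.
read_page <- function(page) {
  tab <- XML::readHTMLTable(readLines(page_url(page), warn = FALSE),
                            which = 1, header = TRUE,
                            stringsAsFactors = FALSE)
  names(tab) <- gsub(" +", "", gsub("\\n", "", names(tab)))
  tab[] <- lapply(tab, function(x) gsub("\\n", "", x))
  tab
}

# Read several pages and stack the rows into one data frame.
# `reader` can be swapped out, e.g. for a stub when testing offline.
read_pages <- function(pages, reader = read_page) {
  do.call(rbind, lapply(pages, reader))
}

# n_pages is hypothetical; set it to however many pages the site reports.
# n_pages <- 3
# scores <- read_pages(seq_len(n_pages) - 1)   # pages 0, 1, 2
```

Stacking with `rbind` assumes every page returns a table with the same column names; if the site changes its layout between pages, that step would need adjusting.
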
>
>
>
> On 3/2/14, 5:08 AM, "Rui Barradas" <ruipba... at sapo.pt>
> wrote:
>
> >Hello,
> >
> >Maybe something like the following.
> >
> >#install.packages("XML", dep = TRUE)
> >
> >library(XML)
> >
> >url <- paste0(
> >  "http://games.crossfit.com/scores/leaderboard.php?stage=1&sort=0&division=1",
> >  "&region=0&numberperpage=60&page=0&competition=0&frontpage=0&expanded=0",
> >  "&full=1&year=14&showtoggles=0&hidedropdowns=0&showathleteac=1&athletename=")
> >data <- readHTMLTable(readLines(url), which=1, header=TRUE)
> >
> >names(data) <- gsub("\\n", "", names(data))
> >names(data) <- gsub(" +", "", names(data))
> >
> >data[] <- lapply(data, function(x) gsub("\\n", "", x))
> >
> >str(data)
> >
> >
> >Hope this helps,
> >
> >Rui Barradas
> >
> >On 01-03-2014 23:47, Doran, Harold wrote:
> >> There is a website that populates a table with athlete scores during a
> >>competition. I would like to be able to extract those scores from the
> >>website and place them into a data frame if this is possible. The
> >>website is at the link below:
> >>
> >> http://games.crossfit.com/leaderboard
> >>
> >> One complication is that one must manually click through multiple pages
> >>as the table only populates a few hundred rows on one web page. In
> >>looking at the source code of the website, I think I can go to here and
> >>maybe grab scores, but I am not sure if R can somehow read them in from
> >>this and populate a data frame and subsequently grab data from every
> >>page.
> >>
> >>
> >>
> >> http://games.crossfit.com/scores/leaderboard.php?loadfromcookies=1&numberperpage=60&full=1&showathleteac=1
> >>
> >> I have not done anything like this before, and so any guidance is
> >>appreciated.
> >>
> >>
> >> Thank you
> >>
> >> Harold
> >>
> >>
> >>
> >>
> >> ______________________________________________
> >> R-h... at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >>http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
>