[R] Web Scraping
Ista Zahn
istazahn at gmail.com
Sat Oct 5 03:58:33 CEST 2013
Hi,
I have a short demo at https://gist.github.com/izahn/5785265 that
might get you started.
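In case it helps, here is a rough sketch of the general workflow with the XML package you mentioned: parse the HTML, pull out the pieces with XPath, then clean the strings into numbers. The HTML in it is made up for illustration; the tag and class names ("listing", "title", "price", "miles") and the helper names grab() and to_num() are my assumptions, not the real cars.com markup, so you would need to look at the actual page source and adjust the XPath expressions.

```r
# Sketch only: the HTML below is invented sample markup, NOT the real cars.com page.
library(XML)

html <- '
<html><body>
  <div class="listing">
    <span class="title">1999 Audi A4</span>
    <span class="price">$32,000</span>
    <span class="miles">57,987 mi.</span>
  </div>
  <div class="listing">
    <span class="title">2005 Audi A4</span>
    <span class="price">$9,500</span>
    <span class="miles">101,250 mi.</span>
  </div>
</body></html>'

# Parse the HTML text into a document that can be queried with XPath
doc <- htmlParse(html, asText = TRUE)

# Hypothetical helper: return the text of every node matching an XPath query
grab <- function(path) xpathSApply(doc, path, xmlValue)

title <- grab("//div[@class='listing']/span[@class='title']")
price <- grab("//div[@class='listing']/span[@class='price']")
miles <- grab("//div[@class='listing']/span[@class='miles']")

# Hypothetical helper: drop everything except digits, then convert to numeric
to_num <- function(x) as.numeric(gsub("[^0-9]", "", x))

# The sample titles look like "1999 Audi A4": split into year, make, model
parts <- do.call(rbind, strsplit(title, " ", fixed = TRUE))

cars <- data.frame(
  Cost    = to_num(price),
  Year    = as.integer(parts[, 1]),
  Mileage = to_num(miles),
  Make    = parts[, 2],
  Model   = parts[, 3],
  stringsAsFactors = FALSE
)
print(cars)
```

For a live page you would probably fetch the page text first (for example with getURL() from RCurl) and pass that to htmlParse() instead of the sample string.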
Best,
Ista
On Fri, Oct 4, 2013 at 12:51 PM, Mohamed Anany
<melsayed at students.kennesaw.edu> wrote:
> Hello everybody,
> I just started using R and I'm presenting a poster for R day at Kennesaw
> State University and I really need some help in terms of web scraping.
> I'm trying to extract used cars data from www.cars.com to include the
> mileage, year, model, make, price, CARFAX availability and Technology
> package availability. I've done some research, and everything points to the
> XML package and RCurl package. I also got my hands on a function that would
> capture all the text in the web page and store as a huge character vector.
> I've never done data mining before, so when I read the help documents for the
> packages I mentioned earlier, it is like reading Chinese. I would appreciate it
> if you could guide me through this process of data extraction.
> Here's an example of what the data would look like:
>
> Cost    Year  Mileage  Tech  CARFAX  Make  Model
> $32000  1999  57,987   1     FREE    Audi  A4
>
> Here's the link to the search:
> http://www.cars.com/for-sale/searchresults.action?stkTyp=U&tracktype=usedcc&mkId=20049&AmbMkId=20049&AmbMkNm=Audi&make=Audi&AmbMdNm=A4&model=A4&mdId=20596&AmbMdId=20596&rd=100&zc=30062&searchSource=QUICK_FORM&enableSeo=1
>
> I'm not expecting you to write the whole code for me, just some
> guidance on where to start and which functions would be useful in my
> situation.
> Thanks a lot anyway.
>
> Regards,
> M. Samir Anany