[R] Text Mining with Facebook Reviews (XML and FQL)

Duncan Temple Lang duncan at wald.ucdavis.edu
Tue Oct 11 17:37:55 CEST 2011


Hi Kenneth

  First off, you probably don't need to use xmlParseDoc(), but rather
  xmlParse().  (Both are fine, but xmlParseDoc() allows you to control many of
  the options in the libxml2 parser, which you don't need here.)

  xmlParse() has some ability to fetch the content of URLs itself. However,
  it cannot handle HTTPS requests, which this call to Facebook is.
  So the approach is to
     i) make the request yourself
    ii) parse the resulting string via xmlParse(txt, asText = TRUE)

 As for i), there are several ways to do this, but the RCurl
 package allows you to do it entirely within R and gives you
 more control over the request than you would ever want.

   library(RCurl)
   library(XML)

   # QUERY is the FQL string; getForm() escapes it when building the request
   txt = getForm('https://api.facebook.com/method/fql.query', query = QUERY)

   mydata.xml = xmlParse(txt, asText = TRUE)
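
To illustrate step ii) without touching the network, here is a minimal sketch
that parses a string and pulls values out with XPath. The sample text, its
element names and the namespace URI are all made up for illustration; the real
response may look different, so inspect it before writing the XPath. The point
is that the namespace passed to xpathSApply() has to match the one declared in
the document you actually get back.

```r
library(XML)

# Hypothetical response text: element names and namespace URI are assumptions.
txt <- '<?xml version="1.0" encoding="UTF-8"?>
<fql_query_response xmlns="http://example.org/fql">
  <review><review_id>1</review_id><message>Great #IBM talk</message><rating>5</rating></review>
  <review><review_id>2</review_id><message>#IBM support was slow</message><rating>2</rating></review>
</fql_query_response>'

# Parse from the string rather than from a file name or URL.
doc <- xmlParse(txt, asText = TRUE)

# Bind a prefix of our choosing to the document's default namespace.
ns <- c(f = "http://example.org/fql")

messages <- xpathSApply(doc, "//f:review/f:message", xmlValue, namespaces = ns)
ratings  <- as.integer(xpathSApply(doc, "//f:review/f:rating", xmlValue, namespaces = ns))
```

If the XPath or the namespace does not match the document, xpathSApply()
finds no nodes, so an empty result is a hint to check both.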

However, you will most likely have to log in or obtain a token
before you make this request. If you are using RCurl, you will then
want to reuse the same curl handle, with the token or cookies, etc.,
for the requests that follow.
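
A rough sketch of reusing one curl handle across requests, under stated
assumptions: the helper fetch_fql() is hypothetical, and passing the token as
a form parameter named access_token is an assumption about the API, not
something checked here. Only getCurlHandle() and getForm() are RCurl functions.

```r
library(RCurl)

# One handle for the whole session. cookiefile = "" switches on libcurl's
# in-memory cookie engine, so cookies set by one response are sent with the next.
curl <- getCurlHandle(cookiefile = "", followlocation = TRUE)

# Hypothetical helper: the 'access_token' parameter name is an assumption.
fetch_fql <- function(query, token, handle = curl) {
  getForm("https://api.facebook.com/method/fql.query",
          query = query, access_token = token, curl = handle)
}
```

With a token in hand, something like txt = fetch_fql(QUERY, my_token) would
then return the text to pass to xmlParse(txt, asText = TRUE); my_token stands
for whatever the login step gives you.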

    D.

On 10/10/11 3:52 PM, Kenneth Zhang wrote:
> Hello,
> 
> I am trying to use the XML package to download Facebook reviews in the
> following way:
> 
> require(XML)
> mydata.vectors <- character(0)
> Qword <- URLencode('#IBM')
> QUERY <- paste('SELECT review_id, message, rating from review where message
> LIKE %',Qword,'%',sep='')
> Facebook_url =  paste('https://api.facebook.com/method/fql.query?query=
> ',QUERY,sep='')
> mydata.xml <- xmlParseDoc(Facebook_url, asText=F)
> mydata.vector <- xpathSApply(mydata.xml, '//s:entry/s:title', xmlValue,
> namespaces =c('s'='http://www.w3.org/2005/Atom'))
> 
> The mydata.xml is NULL, so no further step can be executed. I am not very
> familiar with XML or FQL. Any suggestions will be appreciated. Thank you!
> 
> Best regards,
> Kenneth


