[R] RCurl : limit of downloaded Urls ?

Duncan Temple Lang duncan at wald.ucdavis.edu
Sun Jan 31 15:29:55 CET 2010



Alexis-Michel Mugabushaka wrote:
>  Dear Rexperts,
> 
> I am using R to query google.

I believe that Google would much prefer that you use their API
rather than their regular HTML form to make programmatic search queries.

> 
> I am getting different results (in size) for manual queries and queries sent
> through "getForm" of RCurl.
> 
> It seems that RCurl limits the size of the text retrieved (the maximum I
> could get is around 32 k bits).

                          _bytes_   I assume

> 

zz = getForm("http://www.google.com/search", q='google+search+api', num = 100)
nchar(zz)
[1] 109760

That is more than three times 32KB, so there is no 32KB limit.

The results will most likely be "chunked", i.e. returned in blocks,
but getForm() and the other RCurl functions will, by default, combine
the chunks and return the entire answer. If you supply your own function
via the writefunction option, that function will instead be called once
for each chunk as it arrives.
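A minimal sketch of that, assuming a network connection and the standard RCurl interface (the collector function and its name here are illustrative, not from the original post):

```r
library(RCurl)

# Accumulate the response body chunk by chunk instead of letting
# RCurl combine it for us.
chunks <- character()
collect <- function(txt) {
  chunks <<- c(chunks, txt)      # append this chunk
  nchar(txt, type = "bytes")     # report the number of bytes consumed
}

getForm("http://www.google.com/search",
        q = "google search api", num = 100,
        .opts = curlOptions(writefunction = collect))

# The full response is the chunks pasted back together.
full <- paste(chunks, collapse = "")
```

Each call to collect() sees one block of at most a few tens of kilobytes, which may be where the apparent 32K "limit" comes from if only a single chunk is being inspected.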

So to be able to figure out why things are not working for you,
we need to see the R code  you are using, and know the operating
system and versions of the RCurl package and R.

 D.


> Any idea how to get around this ?
> 
> Thanks in advance
> 
> 	[[alternative HTML version deleted]]
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


