[R] RSiteSearch for words ``as one entity''.

Henrik Bengtsson hb at stat.berkeley.edu
Thu Sep 11 00:33:31 CEST 2008


On Wed, Sep 10, 2008 at 3:09 PM, Marc Schwartz
<marc_schwartz at comcast.net> wrote:
> on 09/10/2008 04:49 PM Rolf Turner wrote:
>>
>> On 11/09/2008, at 9:29 AM, Marc Schwartz wrote:
>>
>>> on 09/10/2008 04:03 PM Rolf Turner wrote:
>>>>
>>>> I tried to search for a string of words ``as one entity'' following the
>>>> example in the help file:
>>>>
>>>>> RSiteSearch("{logistic regression}")
>>>>
>>>> and got the error message:
>>>>
>>>> 2008-09-11 08:55:41.356 open[823] No such file:
>>>> /Users/rturner/http:/search.r-project.org/cgi-bin/namazu.cgi?query={logistic+regression}&max=20&result=normal&sort=score&idxname=Rhelp02a&idxname=functions&idxname=docs
>>>>
>>>
>>> For some reason, it appears to be looking locally, rather than to the
>>> net.
>>
>>     The question is, how do I *stop it* from looking locally? :-)
>>>
>>> On my system:
>>>
>>> R version 2.7.2 Patched (2008-08-25 r46438)
>>>
>>> running on F9:
>>>
>>>> RSiteSearch("{logistic regression}")
>>> A search query has been submitted to http://search.r-project.org
>>> The results page should open in your browser shortly
>>>
>>> and then it takes me to the search engine page.
>>>
>>> Does this only happen with the above incantation using the braces, or
>>> does it happen without them as well?
>>
>>     It only happens with the incantation using the braces; without
>>     the braces, RSiteSearch() works like a dream.
>>
>>> Even if I directly use browseURL(), which is the relevant function in
>>> RSiteSearch(), I cannot get it to behave in the way you have above, even
>>> if I only use a single slash in the http:/ part of the URL. Somehow it
>>> is paste()ing your $HOME folder as a prefix.
>>>
>>> You might want to try it with R --vanilla, just to see if something is
>>> amiss in your R session.
>>
>>     Tried that just now --- same error occurred.
>>>
>>> You might also want to post to the sig-mac list in case this is Mac
>>> specific.
>>
>>     Probably is ... but one shouldn't cross-post, they say.
>>     And I thought I'd get more feed-back from the general R-help list.
>>
>> Thanks for your suggestions.
>
>
> As Prof. Ripley noted, try using debug(RSiteSearch) and then step
> through the code to see where $HOME is being added to the URL variable.
> That can help pin it down further. If the URL is not corrupted by the
> time browseURL() is called, then use debug(browseURL) to trace within
> that function to see if something is happening there.

Yes, one key to solving this is to see what is passed to browseURL().
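
One way to see what reaches browseURL() without stepping through a debugger is to hand it an R function as the 'browser' argument, so that the URL is recorded instead of opened. This is only a sketch: the helper record_url() and the variable 'seen' are names made up for illustration, and it assumes an R version in which 'browser' may be a function.

```r
# Hypothetical helper: store the URL that browseURL() would have opened.
seen <- character(0)
record_url <- function(u) {
  seen <<- c(seen, u)   # append the URL exactly as received
  invisible(u)
}

url <- "http://search.r-project.org/cgi-bin/namazu.cgi?query={logistic+regression}"

# No browser is launched; the URL is passed to record_url() instead.
browseURL(url, browser = record_url)
print(seen)
```

Since browseURL() takes its default browser from getOption("browser"), setting options(browser = record_url) before calling RSiteSearch() should capture the URL it builds in the same way.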

May I bet you that nothing is added and that the problem is that the URL
ends up with a single slash?  I believe that the browser (or whatever
tries to open the URL) 1) does not identify a <protocol>://
prefix in the URL, 2) instead concludes that the URL specifies a local
file, and 3) tries to open it as if it had a file:// protocol prefix.
If changing the working directory with setwd() changes the prefix in
the error message, I think that is what happens.
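
The test in step 1 can be approximated in R as follows. This is a rough sketch only: the helper name has_protocol() and the regular expression are my own assumptions, not what any particular browser or launcher actually does.

```r
# Hypothetical check: does the string start with "<protocol>://"?
has_protocol <- function(url) {
  grepl("^[A-Za-z][A-Za-z0-9+.-]*://", url)
}

ok  <- has_protocol("http://search.r-project.org/cgi-bin/namazu.cgi")
bad <- has_protocol("http:/search.r-project.org/cgi-bin/namazu.cgi")

print(c(double_slash = ok, single_slash = bad))
```

A URL that fails such a test would plausibly be treated as a path relative to the current directory, which would fit the $HOME prefix in the error message.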

>
> There are several places where regex based manipulations are taking
> place within those functions and the use of the braces may be causing a
> problem somewhere along the way in those manipulations on your system.
> However, why that would be limited to OS X or your system specifically
> (eg. a locale issue?) is not clear to me.

Also, from the specification of 'Uniform Resource Locators (URL)'
[http://www.ietf.org/rfc/rfc1738.txt], one can read in Section 2.2:
"... only alphanumerics, the special characters "$-_.+!*'(),", and
reserved characters used for their reserved purposes may be used
unencoded within a URL."  Note also that curly brackets are listed
under the "Unsafe:" paragraph.   Thus, one could indeed argue that it
is a bug that RSiteSearch() or browseURL() does not encode the URL
correctly.

I have a poor man's URL encoder in R.utils called toUrl().  Try to see
whether the 2nd call below works:

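# 1st call: pass the URL with the curly brackets unencoded
url <- "http://search.r-project.org/cgi-bin/namazu.cgi?query={logistic+regression}"
browseURL(url)

# 2nd call: encode the URL with toUrl() from R.utils first
library("R.utils")
browseURL(toUrl(url))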

If the 1st fails and the 2nd works, that is also a clue.
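
If R.utils is not at hand, URLencode() in the 'utils' package could be tried for the same purpose. A minimal sketch, assuming its default settings (reserved = FALSE), under which the reserved characters stay as they are and the unsafe curly brackets are percent-encoded:

```r
url <- "http://search.r-project.org/cgi-bin/namazu.cgi?query={logistic+regression}"

# "{" and "}" become %7B and %7D; ":", "/", "?", "=" and "+" are kept.
encoded <- utils::URLencode(url)
print(encoded)

# To try it for real, uncomment the next line (this opens a browser):
# browseURL(encoded)
```

Whether the encoded query is then interpreted by the search engine as intended is a separate question that I have not checked.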

That's definitely my $.02

/Henrik

>
> HTH,
>
> Marc
>


