[R] Trouble with HTML search engine

Damon Wischik djw1005 at cam.ac.uk
Thu Apr 22 15:43:12 CEST 2004


There have been a number of posts to this list by people having trouble
with the HTML search engine. Often these troubles are caused by incorrect
setups (user hasn't installed Java properly, or Java is disabled, or
Javascript is disabled). Sometimes the trouble persists even when Java is
installed properly.

I have written an alternative HTML search engine, which is based on
Javascript rather than Java. (Hopefully, this means that there is less
that can go wrong.) I haven't wrapped it up in a package yet, because I
don't know how well it works or even if there is any interest. If you
would like to try it, you can:

source("http://www.statslab.cam.ac.uk/~djw1005/Stats/Interests/search.R")
helpHTML()

(Or you can download the code and run it locally. You will also need to
download searchtemplate.html from the same location. You can then run
helpHTML(searchTemplate=localfilename) to tell it to use your local copy
of searchtemplate.html.)

ISSUES WITH MY CODE
-------------------
I've tested it on Windows XP cross {R1.8.0,R1.9.0} cross {IE,Firebird}.
I've also tested it on Debian 3.0 with R1.8.0 cross {Mozilla,Firebird}.
I'd be grateful to learn whether it works elsewhere.

Searching is a bit slow. On my newish computer it takes three seconds or
so. On an older departmental machine it takes ten seconds. Such is
(Javascript) life.

My searching algorithm is not the same as the current searching algorithm.
I wrote this for my own use, and so I've used a scoring mechanism which
reflects the way I like to search. The text at the top of the search page
explains some of the options.
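
As an illustration only, a scoring search of this general kind might look
like the sketch below. The post does not describe the actual scoring rules
or index format, so the entry fields (topic, title, keywords), the weights,
and the function names scoreEntry and search are all assumptions made for
the example.

```javascript
// Hypothetical index entries; the real index format is not given in the post.
const sampleIndex = [
  { topic: "lm", title: "Fitting Linear Models", keywords: ["regression"] },
  { topic: "glm", title: "Fitting Generalized Linear Models", keywords: ["regression"] },
  { topic: "plot", title: "Generic X-Y Plotting", keywords: ["hplot"] }
];

// Score one entry against the lower-cased query terms.
// The weights are arbitrary: an exact topic match counts most,
// then a partial topic match, then a title match, then a keyword match.
function scoreEntry(entry, terms) {
  const topic = entry.topic.toLowerCase();
  const title = entry.title.toLowerCase();
  let score = 0;
  for (const term of terms) {
    if (topic === term) {
      score += 10;
    } else if (topic.indexOf(term) !== -1) {
      score += 5;
    }
    if (title.indexOf(term) !== -1) {
      score += 3;
    }
    if (entry.keywords.some(function (k) { return k.toLowerCase() === term; })) {
      score += 1;
    }
  }
  return score;
}

// Return the topics of all matching entries, best score first.
function search(index, query) {
  const terms = query.toLowerCase().split(/\s+/).filter(function (t) {
    return t.length > 0;
  });
  return index
    .map(function (entry) { return { entry: entry, score: scoreEntry(entry, terms) }; })
    .filter(function (r) { return r.score > 0; })
    .sort(function (a, b) { return b.score - a.score; })
    .map(function (r) { return r.entry.topic; });
}
```

With these weights, search(sampleIndex, "lm") ranks the exact topic match
"lm" ahead of the partial match "glm" and omits "plot".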

There are obviously things I don't understand about the current help
setup. If anyone is sufficiently interested in this to explain them to me,
I would be grateful. (1) On my Windows XP setup, R writes an index in the
directory where it was installed. What if it doesn't have write permission? (2)
On my Debian setup, R copies all of the HTML help into a temporary
directory. Why not just refer to the files where they are, rather than
copying them all across? Because I don't understand these two points, I've
written my indexing routines to (a) create a search index in a temporary
directory, and (b) refer to the files in their install directories. My
indexing routines run the same under both Windows and Linux.


ISSUES WITH R
-------------
The R "Installation and administration" document tells us that "Sun's
Java Run-time Environment j2re 1.4.2_02 does not work under Linux". Prof
Ripley said on 1 April 2004 that "if Linux/Unix, Sun JRE 1.4.2_02/3/4 are
broken". This is news to those like myself who run Linux with Sun's JRE
1.4.2_03 and find that all their other applets work fine. (Though I'm sure
there are bugs in the JRE, as in most complex projects.)

On my computer, the trouble boiled down to this: the Javascript which
displays search results was unable to interface with the Java applet which
performed the search. As far as I am aware, there are no published
standards which govern this interface. Therefore it is necessary to rely
on vendor documentation (insofar as we can say that organizations which
distribute free software are vendors). In the case of Mozilla, this
interface is called LiveConnect; some documentation is available at
mozilla.org. Generally speaking, an object on a web page may export
certain methods, making them available to Javascript. For example, an
object which contains a Java applet typically makes available the static
methods of the classes in that applet. Again, I am not aware of any
published standards on which methods are exported, so again we have to
rely on vendor documentation. In the case of Sun's Java, the documentation
explains how to use these exported methods:
http://java.sun.com/j2se/1.4.2/docs/guide/plugin/developer_guide/js_java.html
However, the R HTML search page does not follow this documentation. I
found that if I alter the R HTML search page to conform to this
documentation, it works.
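
The post does not say which change to the search page was needed, so the
following is not that fix. It is only a sketch of how page script might call
an applet method defensively, so that a failure in the Javascript-to-Java
interface is reported rather than silently producing no results. The applet
name "SearchEngine", the method name search, and the helper callAppletSearch
are hypothetical. The document object is passed in as a parameter so the
logic can be exercised outside a browser.

```javascript
// Call a (hypothetical) search method on a named applet, reporting failure
// instead of throwing. `doc` is the page's document object, or any object
// with an `applets` collection that can be looked up by name.
function callAppletSearch(doc, appletName, query) {
  const applet = doc && doc.applets ? doc.applets[appletName] : undefined;
  if (!applet) {
    return { ok: false, reason: "applet '" + appletName + "' not found" };
  }
  // Browsers differ in what typeof reports for methods exported from Java,
  // so only check that the member exists at all.
  if (typeof applet.search === "undefined") {
    return { ok: false, reason: "applet does not export a search method" };
  }
  try {
    // Convert the returned value to a native string before using it.
    return { ok: true, result: String(applet.search(query)) };
  } catch (err) {
    return { ok: false, reason: String(err) };
  }
}
```

In a page this might be called as callAppletSearch(document, "SearchEngine",
"regression"), and the reason field shown to the user when ok is false.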

It is always going to be difficult to write portable code when there are
no published standards, only vendor-specific documentation. I have
therefore attempted, in my Javascript search, to stick to pure ECMAscript
(though undoubtedly I have failed in places).
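
One way to read "pure ECMAscript" is to keep as much logic as possible in
functions that use only core language features (strings, arrays, regular
expressions) and that take and return plain values, leaving only a thin
layer to touch the browser. The sketch below shows that idea for rendering
results. The function names escapeHtml and renderResults and the result
fields href and title are assumptions for the example, not taken from the
actual search page.

```javascript
// Escape text for safe inclusion in HTML, using only core string methods.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Turn an array of { href, title } results into an HTML string.
// Nothing here depends on the browser's document object.
function renderResults(results) {
  if (results.length === 0) {
    return "<p>No matches.</p>";
  }
  const items = results.map(function (r) {
    return '<li><a href="' + escapeHtml(r.href) + '">' +
      escapeHtml(r.title) + "</a></li>";
  });
  return "<ol>" + items.join("") + "</ol>";
}
```

The only browser-specific step left is placing the returned string into the
page, which keeps any vendor-dependent code in one small place.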

Damon.



