[Rd] R html help system [Was: How to document man/*.Rd pages with images?]

Mon May 16 20:15:36 CEST 2011

On 5/13/11 8:20 PM, "Simon Urbanek" <simon.urbanek at r-project.org> wrote:

> 
> On May 13, 2011, at 7:08 PM, Sean Robert McGuffee wrote:
> 
>> 
>> On 5/12/11 9:13 AM, "Simon Urbanek" <simon.urbanek at r-project.org> wrote:
>> 
>>> I just want to clarify the mechanics of the help system when using html.
>>> 
>>> R has a built-in HTTP server (aka Rhttpd) which transforms HTTP requests to
>>> function calls. It is not your usual web server, because it doesn't map URL
>>> paths to files, it just allows R functions to do anything with it --
>>> something like CGI except that we are talking about functions and not files.
>>> Therefore you won't find any files and there is no file structure involved.
>>> 
>>> For the help system, the function handling the requests is tools:::httpd() -
>>> you can look at what it does.
>> 
>> Awesome, will do!
>> 
>>> Basically, it generates pages according to the
>>> various paths it supports. As part of its path handling it allows certain
>>> paths to reference files, e.g.
>>> /library/myPackage/doc/randomStuffInPackagesDocDirectory.html will read the
>>> file doc/randomStuffInPackagesDocDirectory.html in myPackage.
>>> 
>>> Note that the whole point of the dynamic help *is* to generate content on
>>> the fly, because the content depends on the state of your current workspace --
>>> packages loaded, classes defined, etc. so you cannot pre-generate pages as
>>> they won't have correct links - that's why we shifted from static html pages
>>> to the dynamic ones.
>> 
>> This is interesting. It actually speaks to a constructive criticism I have of
>> R. As a user of R, I don't want to have conditional dynamic help. I want to
>> always get the same answer to a question so to say. If there is a way to do
>> something that works and is reproducible, then I don't want to maybe or maybe
>> not get that answer. Thus, I guess what I'm thinking is that there should
>> maybe be selection within help that has organization based on packages loaded
>> and classes defined, but I would hope that the state of my system doesn't
>> change the help that is displayed. At least I as one user would prefer to see
>> all of my options with all packages and not have any of it emphasized or
>> excluded by how my system is currently set up.
>> 
> 
> I don't think I follow you. Your options will be different, by definition,
> depending on which packages you have installed and loaded. As one obvious
> example, you can't refer to documentation of packages that you did not
> install. As another example (more future-directed I suspect), the generics and
> methods depend on your currently defined classes, methods etc. That is true in
> R itself, so I don't see why you would prefer documentation that doesn't match
> your R.

I guess what I mean to say is that although a user's options depend on
context, when I use help, I want to know what my options are outside of the
current context. Especially at a stage where I need help, I definitely might
not have the right context set up, so it's very important to me to have
non-context-specific help. I recognize that this is my opinion and by no
means agreed upon, but this was a huge barrier for me when I first began
using R. For example, sometimes I would load a library while following an
example in the documentation before I even knew about libraries or realized
what I was doing. Then, within that context, a help command within that
library would work. On another day, when I hadn't set up that context by
accident, using my notes of "help(whatever)" wouldn't work. That was very
confusing and frustrating for me, and I'm much more computer literate than
many of the test-tube wet lab scientists I would hope to have using my
packages. I realize that "??whatever" gets around this, but the "??" symbol
is hard to look up and learn about, especially when there is a
"help(whatever)" syntax that worked on a previous day. It took me months to
understand that "??" even existed as something separate from
"help(whatever)." In general, as a beginner, I found it very confusing that
there was more than one help command, and I'm not sure if that's a good
thing for people who need help. Those are my two cents anyway. This might be
best for most people--I tend to be a very literal and concrete thinker and
may not be very representative. On the other hand, I don't see what the
benefit is to having fewer options for help if a package isn't loaded. I
suppose it provides more focus for someone who knows very well what they are
doing and wants to know something specific to their current context.
However, I'm not sure if helping that person focus is as critical to
providing help as assisting a new user who is likely to be oblivious to
contexts as a concept. Am I making any sense?

Let me make up an extreme example, just to clarify:
Suppose there are packages A and B, both with command C.
Regardless of what packages I have loaded, I would want help( C ) to bring
up a list with two files to choose from: one corresponding to A::C and one
corresponding to B::C, both containing the information that to use C one
must first type library(A) or library(B) to create the context that makes
the help information about how to use C relevant. My understanding is that
as it stands, typing help( C ) would do one of three things, depending on
whether library(A) or library(B) has been loaded. If neither has been
loaded, help( C ) would produce an error. If library(A) had been loaded,
then help would be generated regarding A::C and the user would be otherwise
oblivious to B::C, even if that were the useful info to the user. Likewise
for the converse case with library(B) loaded. So that's to help me
explain my perspective of what I mean by non-context-specific help and why I
think it would be advantageous. Does that make sense? What I don't
understand is a case where it is advantageous to be oblivious to out of
context options. Could someone show me an example of that? I'm sure I am
missing something because I can't think of a case like that on my own and
have been considering it for days.
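
For what it's worth, help() does seem to have a try.all.packages argument
that comes close to what I'm describing. Here is a small sketch, assuming the
topic Rd2HTML (documented in the tools package, which is normally not
attached) stands in for my made-up command C. As far as I can tell it only
searches the other installed packages when nothing is found in the loaded
ones, so it is not quite the "always list everything" behavior I'm asking for:

```r
# Sketch: look up a topic even if its package has not been attached.
# "Rd2HTML" is only a stand-in for the hypothetical command C above.
hits <- help("Rd2HTML", try.all.packages = TRUE)

# The result is a vector of help-file paths (possibly empty) with some
# attributes; printing it at the prompt displays the page or a list of packages.
class(hits)
length(hits)

# The same behavior can be made the session default:
# options(help.try.all.packages = TRUE)
```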

> 
> 
>> In the status quo, I can see how the choice of which pages to look at is
>> dynamic if more than one comes up on a search, but it seems inefficient to me
>> to have the page itself be dynamic. I think it would be a good idea if
>> package authors could at least have an option to have their help pages
>> produced as files either way.
> 
> That decision is left to the user - you can use --html to generate html pages.

This is appealing to me, but I can't seem to find any info about it. Maybe
having the "--" part of "--html" is throwing off my searches. I couldn't
find "--html" when I searched R's help. If my package is named MyPackage,
how would a user generate the html pages from it with "--html"?
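
My tentative guess, which I have not confirmed, is that this refers to the
--html option of R CMD INSTALL. If so, something like the following sketch
might be what a user would run from within R. The tarball name is
hypothetical and both helper functions are made up for illustration:

```r
# Sketch only: install a source package and ask INSTALL to build static HTML help.
# 'tarball' would be something like "MyPackage_1.0.tar.gz" (hypothetical name).
install_with_static_html <- function(tarball) {
  # INSTALL_opts is handed on to 'R CMD INSTALL'
  install.packages(tarball, repos = NULL, type = "source",
                   INSTALL_opts = "--html")
}

# Where static pages, if any were built, should end up for an installed package.
# system.file() returns "" when the directory does not exist.
static_html_dir <- function(pkg) {
  system.file("html", package = pkg)
}
```

Usage would then be install_with_static_html("MyPackage_1.0.tar.gz") followed
by static_html_dir("MyPackage"), assuming I have understood the option
correctly.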

> 
> 
>> I mean, when my package will be loaded, I certainly won't want options and do
>> want to be able to point my users to an unconditional file location to point
>> their browser to.
>> 
> 
> You can do that with dynamically generated pages (you can't do that with
> static pages in fact) - the paths are well defined (unlike in your file
> system). Even better in fact, because the dynamic help is smart enough to find
> packages in different libraries, for example.

I think this is a very good thing. Having the dynamic help is probably the
very best way to go from that perspective. I personally find file-system
paths to be annoyingly less well defined, so I'm completely sold by what you
are saying. It seems from what you said below that the performance issue I
have is from a bug. Am I right about that? I could be mistaking the context
of what you said below because I was a bit ambiguous about more than one
issue below. Since I can control what my package generates for help my
context-specific issue is not a problem for me, so this is a distinct
issue I have with dynamic help and what appeared to me to be a performance
problem. My machine tends to lag for quite a while on many packages when
using the dynamic help system. Almost enough to discourage use. I found my
case with a large image to be a particularly bad anecdotal example of a
specific help problem, and maybe it is relevant to that more general issue
if there is a general bug. If this is caused by a bug, I would see no
problem with the concept of dynamic help at all--especially if it can react
instantaneously. However, I'm not completely convinced of that yet. It seems
very efficient to me to have a dynamic system point to pre-generated files.
However, generating those files dynamically, especially if many of them are
relevant, seems potentially inefficient. I don't claim to understand how it
is working yet, although looking into it has been fascinating so far.
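
To make the pre-generated idea concrete, here is a sketch of rendering a
single Rd page to a static HTML file once with tools::Rd2HTML(). The page
name and image path are placeholders taken from my earlier example, and I am
not claiming this is what the help server does internally:

```r
# Sketch: convert one Rd file to a static HTML file, outside the help server.
rd_file   <- tempfile(fileext = ".Rd")    # stand-in for man/myHelpPage.Rd
html_file <- tempfile(fileext = ".html")  # where the static page is written

# A minimal Rd page that embeds an image only in HTML output
writeLines(c(
  "\\name{myHelpPage}",
  "\\alias{myHelpPage}",
  "\\title{A page with an image}",
  "\\description{",
  "  Some text.",
  "  \\if{html}{\\out{<img src=\"../doc/myPic.png\" alt=\"myPic.png should be here\"/>}}",
  "}"
), rd_file)

# Render once; the result is an ordinary file a browser can open
tools::Rd2HTML(rd_file, out = html_file)

html <- readLines(html_file)
has_img <- any(grepl("<img", html, fixed = TRUE))
```

Rendering this way fixes the links at conversion time, which I gather is the
drawback of static pages that was described above.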

> 
> 
>>> I'm curious about "If it's a large picture this process nearly crashes my
>>> machine when trying to access the file via help" - do you have an example
>>> package that would illustrate the problem?
>> 
>> I've tried to recreate the problem with a small fake package, and although
>> it passes the check it doesn't seem to work quite right on my system. I
>> might have some compiler issues or configuration issues though, so it might
>> work as is on your system. If not, I think you could quickly find the
>> relevant parts though and add them to a package of your own to see the bug
>> if this doesn't work as is on your machine. I'm not really sure why this
>> doesn't work on my machine. I did almost exactly the same thing as in the
>> huge package that I can't fit on my file transfer site. However, it is set
>> up to only install in 64-bit and I couldn't remember how I set that up. So
>> it might be the 32-bit part that is messing things up on my system. I think
>> there should be a simple way to declare an architecture in a package
>> DESCRIPTION or something. I can't remember. Anyway, that's beside the point.
>> Here is an example of a syntax and image file that makes my help go
>> extremely slow and not show images:
>> 
>> http://ftsext.mskcc.org/FileExchange/FileList.aspx?id=9afb4fe1-ce1c-406d-b1a1-c9360493137c
>> 
>> Please let me know if you see why this isn't working for me--both as to if
>> this works as is on your system and as to if this causes the bug. At the
>> moment this tells me " Error in gzfile(file, "rb") : cannot open the
>> connection" even though it passes all the build check install tests on my
>> machine.
>> 
> 
> There seem to be two bugs AFAICS:
> 
> a) The path generated from the URL is either wrong or something is not in
> sync - it says
> /Library/Frameworks/R.framework/Versions/2.13/Resources/library/Meta/Rd.rdsBug/help
> but the meta file is really in
> /Library/Frameworks/R.framework/Versions/2.13/Resources/library/helpBug/Meta/Rd.rds
> It seems like some strange permutation issue - but I didn't look at httpd()
> yet (I'm a bit puzzled as to why it doesn't affect other packages - maybe it's
> some regexp thing ...).
> 
> b) there seems to be an issue with WebKit and Rhttpd interaction in that
> Rhttpd gets blocked by WebKit not fetching the data. If you look at the page
> from an external browser, all is well. This will be a bit tricky to address
> and will need modifications to R...

Should I look into making these modifications to R? Or would this type of
thing be addressed by more official R personnel?

> 
> Cheers,
> Simon
> 
> 
>>> 
>>> Thanks,
>>> Simon
>>> 
>>> 
>>> On May 11, 2011, at 7:14 PM, Sean Robert McGuffee wrote:
>>> 
>>>> Thanks everyone for your help,
>>>> 
>>>> To summarize a resolution to my issue, it turns out that an image can be
>>>> included in a documentation file via html by putting an image file in the
>>>> inst/doc directory, for example inst/doc/myPic.png, and then pointing to it
>>>> in the man/myHelpPage.Rd file, for example as follows:
>>>> 
>>>> \if{html}{
>>>> \out{<img src="../doc/myPic.png" alt="image ../doc/myPic.png should be here"/>}
>>>> }\ifelse{latex}{}{}
>>>> 
>>>> Note, this doesn't mean that R's help browser will view those images inside
>>>> the properly generated html help files.
>>>> Also, note that without the \out{} part, the text of the <img .../> line
>>>> would show up instead of the html commands.
>>>> 
>>>> I have some concerns in case anyone on the list is interested. If it's a
>>>> large picture this process nearly crashes my machine when trying to access
>>>> the file via help--and I'm sure there must be some bug in that. I should
>>>> note that the picture won't actually display within R's help console (at
>>>> least on my machine--I'm on a mac with a binary version of R). To see that
>>>> the html files are created properly, I have to copy a link to the help file
>>>> and then point an actual browser such as firefox to the help file to see
>>>> the page with the image. I'm not sure how R is running httpd or how that
>>>> interacts with help. I'm not even sure about the basics of help. Is there a
>>>> way to configure R to use an actual web browser by default instead of its
>>>> slow one that doesn't show images? It would also be nice if there were an
>>>> address bar on R's help browser. I mean, until I put a link to my help file
>>>> inside another help file, there was no way for me to even get its address
>>>> to copy and paste into firefox. It would also be nice if it didn't almost
>>>> crash and let me more easily get the link, but ideally it would be best not
>>>> to have a semi-functional help browser. Furthermore, this brings up the
>>>> point that I can't find the files I'm browsing with the link. In this case,
>>>> I get a link such as:
>>>> http://127.0.0.1:23269/library/MyPackage/html/MyPackage.html
>>>> But I can't find the MyPackage.html file anywhere on my computer. It's
>>>> there in the web browser, but seems to be only in existence via R's httpd
>>>> without actually existing on my file system.
>>>> Is it there and I can't find it or is it encoded in R somehow? If it is
>>>> there, where would it be? If I close R, I no longer have access to the page
>>>> that R's httpd is serving. It seems to me that it's being created every
>>>> time I use help--and I think that is extremely inefficient. I think firefox
>>>> can handle file-type urls, so if there is a way to get R to both generate
>>>> these files and use firefox to browse them for help, I would very much like
>>>> to know more about it. It would be much faster and more useful than the
>>>> status quo on my machine if this file were generated once at installation
>>>> and remained as a file--and using help simply pointed a web-browser to the
>>>> file.
>>>> Anyway, I suppose this is a tangent. The main point is that there is a way
>>>> to provide help documentation with images--but even though it tries to view
>>>> them correctly via help--R's help browser displays broken images so I have
>>>> the awkward need to copy and paste links into other web browsers.
>>>> 
>>>> Regarding some feedback I've gotten about some users' interests in help
>>>> formatted as text, I think there are two things in this process that keep a
>>>> text help user on track: (1) the conditional html part and (2) even if
>>>> using a textual html browser, <img ... alt="alternate text"/> takes care of
>>>> displaying images as text. I think though that the other way around, the
>>>> users who require images in their help files are having less functionality
>>>> via help in R. At least in this case, the best I could do was get R to
>>>> generate the proper help pages in html, but R's default html help browser
>>>> (at least on my machine) doesn't display the images (although they are
>>>> there and can be displayed by the same link in firefox).
>>>> Sometimes it's true what they say about a picture being worth a thousand
>>>> words--I think in general this is true for complex things that need computer
>>>> power to deal with, so I hope R can eventually support images in help files
>>>> due to the usefulness of doing so in some cases.
>>>> 
>>>> Thanks again,
>>>> 
>>>> Sean
>>> 
>> 
>> 
>> 
>