[R] two questions for R beginner
Paul Hiemstra
p.hiemstra at geo.uu.nl
Tue Mar 2 15:01:05 CET 2010
Brandon Zicha wrote:
> Hey Paul,
Hey Brandon, (adding R-help in the cc)
I agree with you that the documentation of R could be better, especially
with more examples in code showing not only the common cases, but also
more esoteric cases. It would be great if everyone invested a lot of
time to write awesome documentation, but this is not the case. I just
objected to the tone (I thought :)) I spotted. Some more comments are inline:
>
> Accepting the main point of my post - that the often VERY incomplete
> help files appended to packages can be a major stumbling block for
> getting up and running in R - I take your point. I probably went a
> bit too far with my language there.
>
> I would point out though that a great many parts of research (like
> writing a bibliography - or searching for citations of any kind
> usually) aren't much fun, but are an important part of research
> related work. Likewise, complete documentation (by which I hardly
> mean a paper - looking at STATA help files as a minimum would be a
> good start) is part of programming. I agree that one needs to employ
> some level of judgement, otherwise you will get a help file that says
> "First turn on the computer... then click the 'R' Icon...." But, I
> have myself created one or two STATA functions that I have put up for
> public use - so I know how not fun, but necessary complete
> documentation is. Further, I didn't say that writing documentation
> doesn't take time. Everything takes time. My point was that relative
> to actually creating the application - writing more complete
> documentation takes very little time. If one invests the time to do
> the 'fun' stuff of writing a new package for R, it seems reasonable
> that taking the (proportionately) little time to write a nicer help
> file would be the most 'professional' thing to do. But, this could be
> my illusion that all researchers see themselves as professionals -
> rather than an anarchic egoistic enclave of independent
> self-interested paper producers.
Producing papers is what scientists get judged upon, not how much
software they publish or how good their documentation is. Furthermore,
it is quite hard for a hardcore R programmer to judge what people find
hard about their software.
> I am notorious for assuming greater standards as an acceptable 'norm'
> than my community at large :-) Furthermore, you are absolutely right
> that my standards are apparently even too high for many commercial
> applications! R help is sometimes downright good!
>
> So, if I accept that I am a demanding S.O.B. and tone down my thoughts
> of proper documentation and professionalism and adopt the (probably
> more) reasonable perspective you do at the end of "well, this is the
> world we live in... and come on it's free" I totally agree that I
> probably went too far! But, better yet, I think that this observation
> you make suggests a solution: Perhaps R could use a more integrated
> and organized open source help system. I can think of a few
> possibilities - the easiest being a wiki version of R help. This way
> users could add useful information to help files - such as more
> examples, tricks, tips, and known problems. This would take advantage
> of the open source, free, user-community centered aspects of R, and
> permit those with an interest in helping beginners to post notes for
> beginners - on the help files. I know that if such a wiki existed I
> would have posted my recent example of constrained optimization that I
> just did. It wouldn't be too difficult to add a function
> wikihelp(X) that would open the wiki help page rather than the
> standard help documentation. Currently, help on any given command is
> scattered across help fora all over the web. A central, indexed,
> and easily referenced help system might be a solution. Heck, such a
> system could go a step further and link R-help listserv archives by
> command thus centralizing and integrating the open-source user-built
> information resource of the listserv into help(). How many e-mails to
> this listserv begin with 'I just spent a few hours cruising the help
> forums related to 'X' and couldn't find an answer'?
Sounds like a good addition, allowing people to add to the documentation
as they see fit. There is of course the R wiki, but it is not widely
used and not firmly embedded into R itself. But how would we keep the
system you propose manageable and prevent it from becoming an enormous
mess? Maybe some kind of moderation?
>
> I note that STATA has all their help files for the latest version of
> stata available on the web (http://www.stata.com/help.cgi?contents).
> How difficult would a similar system - only with R, editable and with
> links to supplementary information - be to set up? I can't imagine it
> would be horribly expensive in terms of set up costs.
A problem is that there is no company that markets R and could set this
up; the community is much looser, much more open source. The R core team
would probably be the closest thing we have.
>
> What do you think?
>
> Best,
>
> Brandon Z
>
>
> On Mar 2, 2010, at 1:16 PM, Paul Hiemstra wrote:
>
>> Brandon Zicha wrote:
>>>>>> What were your biggest misconceptions or
>>>>>> stumbling blocks to getting up and running
>>>>>> with R?
>>>
>>> Easy. In terms of materials I have been unable to find good books
>>> that introduce users to R from the perspective of someone familiar
>>> only with packages like SPSS or STATA, or not familiar with
>>> statistics packages at all. Even introduction texts use jargon
>>> without introducing it.
>>>
>>> I think that R-help files should be more thorough than they are, and
>>> contain more examples. I thought that STATA help files were
>>> sparse! The notion that 'R is a user community and thus they do
>>> this in their spare time' is no excuse for those creating new tools
>>> for R not developing complete help files. It doesn't take that much
>>> time relative to actually creating the new function.
>> Hi Brandon,
>>
>> I would disagree with your point that documentation doesn't take much
>> time. Writing documentation that is suitable for both the advanced
>> user (being a reference, and thus preferably short) and the beginning
>> user (being sort of a tutorial, and thus preferably longer) is
>> quite a challenge, comparable to writing a good paper. Apart from the
>> fact that it takes quite a while, it is also not much fun. Often
>> people develop packages for their own research and put the software
>> online so others can benefit; they don't need the documentation
>> themselves and don't get paid to write the documentation.
>>
>> So saying 'it's no excuse' really goes too far in my view. R is free,
>> you did not pay several thousands of euros giving you the right to
>> good support. Even the support is free through the mailing list. You
>> can get a paid version of R at Revolution Computing. Then you can
>> call them if there are problems. I'm not meaning to offend anybody,
>> but I didn't agree with "is no excuse for those creating new tools
>> for R not developing complete help files". Partly the strength of R
>> is in the open source, but sometimes, as with documentation, this can
>> bite you. But I think the R docs aren't that bad, I've seen
>> proprietary software that does a worse job than R.
>>
>> my 2euro on the subject :),
>>
>> Cheers,
>> Paul
>>>
>>> In terms of actual R use - creating, using, and manipulating data
>>> are the biggest frustration for those of the 'spreadsheet
>>> generation'. I get the impression that one needs to not merely
>>> understand, but be fully fluent in the jargon of matrix mathematics
>>> to even know what is going on half the time. I find myself - even
>>> now - using 'rules of thumb' that 'seemed to work' rather than fully
>>> understanding what I am doing. It is particularly discouraging when
>>> many of those 'intro books' suggest using something besides R for
>>> data manipulation - how clumsy is that!?
>>>
>>> I find the actual programming syntax itself is the easiest part to
>>> master. It is certainly more flexible - but without a particularly
>>> sufficient increase in complexity - than trying to write script in
>>> SPSS and STATA.
>>>
>>> Brandon Zicha
>>>
>>
>>
>> --
>> Drs. Paul Hiemstra
>> Department of Physical Geography
>> Faculty of Geosciences
>> University of Utrecht
>> Heidelberglaan 2
>> P.O. Box 80.115
>> 3508 TC Utrecht
>> Phone: +3130 274 3113 Mon-Tue
>> Phone: +3130 253 5773 Wed-Fri
>> http://intamap.geo.uu.nl/~paul
>>
>
--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone: +3130 274 3113 Mon-Tue
Phone: +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul