[Rd] Suggestions for packages / help / index (long mail)
Eric Lecoutre
lecoutre at stat.ucl.ac.be
Wed Nov 24 13:40:24 CET 2004
Hi R-users and developers,
This month may have seen one of the biggest threads ever seen on R-related
mailing lists, the one about "GPL software" and "hidden costs" (as of
today, the thread is still open, and active!).
Lots of mails in this thread are not really relevant to the original mail,
sent by Philippe Grosjean.
Nevertheless, most of the mails are of interest, and one of my conclusions
was that there is a real need for "help/index relating" stuff.
I have spent some time thinking about it. Like everybody, I end up with:
"this is not an easy problem at all" and "what we have *is* still very
great". Indeed!
What you will find now is a sketch of thoughts/proposals. I tend to think
some of those proposals are "low-cost" and could improve the life of R
beginners.
First, I have to say I will put myself in the situation of a real
beginner (say a first-year student):
A user who has practiced for some years will find it easier to crawl
through all the rich available material. His experience will help him
easily find the package and the function relevant to his problem; he has
learned to use help.search() and so on. And he will wisely use R-help,
following the guidelines.
On the contrary, a beginneR will have more and more difficulty entering the
R world, as it is constantly growing (leading to the famous supposed
"hidden costs"). Appropriating this poweR is not easy, especially if your
daily task is specialized: you will have difficulty digging through all the
material to find those nuggets that will help you (and thanks to the
community, there are so many nuggets... it may be hard to choose between
gold and platinum).
What we have for now is a document listing keywords. Advanced users will
know those keywords are to be used by package maintainers, feeding the help
system build chain.
This keyword database is very pertinent. Its content, which has been
inherited in part from S, has been carefully worked out. And it works well
(try help.search("graphs"): it will provide you with very interesting
stuff, provided you have some packages installed...). I think that this
keyword list could have even more uses.
1. As the R community grows, it may be time to add some terms to this
keyword list. Think about the SciViews bundle on which Philippe is working.
Most packages in it are linked to GUI stuff. Wouldn't the keyword GUI be
useful? It could be worth offering the community, for one month, the
ability to suggest new entries (I am also thinking about econometrics
stuff). Then, the R core team would decide whether candidates are eligible
or not.
2. DESCRIPTION files for packages may have a new field: keywords, allowing
the author to add keywords to the package (at least one).
Here is the kind of thing we could end up with:
package     keyword(s)
---------------------------------------------
abind       Basics, manip, array
accuracy    Statistics
acepack     Statistics, regression
adapt       Mathematics
ade4        multivariate
...
3. Package keywords could be used to propose "automatic" bundles and/or
lists of packages (consider keywords as categories for that purpose). Thus,
CRAN sites could have a listing of all packages, but also a listing of all
packages related to Mathematics, to multivariate (statistics) and so on.
And one could propose to install a whole bunch of packages at one time.
Thus (and provided adequate keywords exist), the beginner interested in
multivariate statistics could easily set up his R with adequate starting
packages. Same for econometrics, geostatistics, and any other field of
application.
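To make point 3 a bit more concrete, here is a minimal sketch (in Python,
only for illustration) of how per-package keywords could be turned into
category listings. The package/keyword pairs are the example ones from
point 2; the function name packages_by_keyword is hypothetical and nothing
like this exists in R or on CRAN today.

```python
from collections import defaultdict

# Example package -> keywords pairs, taken from the illustrative table in point 2.
PACKAGE_KEYWORDS = {
    "abind": ["Basics", "manip", "array"],
    "accuracy": ["Statistics"],
    "acepack": ["Statistics", "regression"],
    "adapt": ["Mathematics"],
    "ade4": ["multivariate"],
}


def packages_by_keyword(package_keywords):
    """Invert a package -> keywords mapping into keyword -> sorted package list.

    Keywords are lower-cased so that "Statistics" and "statistics" end up
    in the same category (an assumption, not part of the proposal).
    """
    groups = defaultdict(list)
    for package, keywords in package_keywords.items():
        for keyword in keywords:
            groups[keyword.lower()].append(package)
    return {keyword: sorted(packages) for keyword, packages in groups.items()}


groups = packages_by_keyword(PACKAGE_KEYWORDS)
# Every package in one category could then be offered as an "automatic" bundle.
print(groups["statistics"])
```

A CRAN mirror could publish one such listing per keyword, and an installer
could take the list for a chosen category as its set of packages.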
4. What would really be useful then (I think) is a sort of PACKAGES_INDEX
that would come with R. Explanation: the index entry for one package would
be its keywords (with a high weight) plus all its functions and the
keywords associated with those functions (lower weights). When downloading
and installing the newest R, there would be a flat text file containing
that (not so so ...so big). We could also add a function that would refresh
this file.
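As a rough illustration of point 4, the sketch below (Python, for
illustration only) builds a weighted index entry for one package and
writes it as flat text lines. Everything specific is an assumption: the
weights 3 and 1, the tab-separated layout, the helper names, and the
sample function/keyword pair used for ade4.

```python
PACKAGE_WEIGHT = 3.0   # assumed "high" weight for package-level keywords
FUNCTION_WEIGHT = 1.0  # assumed "lower" weight for functions and their keywords


def index_entry(package_keywords, function_keywords):
    """Return a {term: weight} mapping for a single package.

    package_keywords:  list of keywords declared for the package itself
    function_keywords: mapping of function name -> list of its keywords
    Weights are summed when the same term appears several times.
    """
    weights = {}
    for keyword in package_keywords:
        term = keyword.lower()
        weights[term] = weights.get(term, 0.0) + PACKAGE_WEIGHT
    for function, keywords in function_keywords.items():
        for term in [function] + list(keywords):
            term = term.lower()
            weights[term] = weights.get(term, 0.0) + FUNCTION_WEIGHT
    return weights


def to_flat_lines(package, weights):
    """Serialize one entry as 'package<TAB>term<TAB>weight' lines, sorted by term."""
    return [f"{package}\t{term}\t{weight:g}" for term, weight in sorted(weights.items())]


# Illustrative input only: one package keyword, one function with one keyword.
entry = index_entry(["multivariate"], {"dudi.pca": ["multivariate"]})
lines = to_flat_lines("ade4", entry)
print("\n".join(lines))
```

Refreshing the file would then amount to recomputing these lines for every
package listed on CRAN and overwriting the local copy.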
5. Then, we could update "help.search", which would begin by listing
information on "installed packages" PLUS potentially suggest other packages
available on CRAN.
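The behaviour suggested in point 5 could look like the following sketch
(Python, for illustration only; this is not how help.search works today).
The index content, the weights and the function name search_index are all
hypothetical.

```python
# Hypothetical index: package -> {term: weight}, as a PACKAGES_INDEX could provide.
INDEX = {
    "ade4": {"multivariate": 4.0, "dudi.pca": 1.0},
    "acepack": {"statistics": 3.0, "regression": 3.0},
    "abind": {"basics": 3.0, "manip": 3.0, "array": 3.0},
}


def search_index(term, index, installed):
    """Return (installed_hits, suggested_hits) for a search term.

    Hits are ordered by decreasing weight, then by package name.
    Packages not in `installed` are only suggestions from CRAN.
    """
    term = term.lower()
    hits = [(package, weights[term])
            for package, weights in index.items()
            if term in weights]
    hits.sort(key=lambda hit: (-hit[1], hit[0]))
    installed_hits = [package for package, _ in hits if package in installed]
    suggested_hits = [package for package, _ in hits if package not in installed]
    return installed_hits, suggested_hits


local, suggested = search_index("regression", INDEX, installed={"abind"})
print("installed:", local)
print("available on CRAN:", suggested)
```

The user would first see what is already available locally, then a short
list of packages worth installing for that topic.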
6. The final point has already been discussed in the past. It is about misc
packages and pieces of code. I propose the creation of 5 packages:
- miscGraphics (keywords: misc, Graphics)
- miscStatistics (keywords: misc, Statistics)
- miscMathematics (keywords: misc, Mathematics)
- miscBasics (keywords: misc, Basics)
- miscProgramming (keywords: misc, Programming)
With what I proposed before, they would be accessible as a bunch by
selecting packages in the category "misc", and each would also be listed in
its own category ("Graphics",...).
Each of those packages would have a maintainer, and a new mailing list (say
R-misc) could be set up to talk about pieces of code that could enter one
package or another. Yes, I volunteer to maintain one of those.
There is some work here for all 6 points, but not so much. What is great is
that we already have most of the necessary stuff. And we only use the
KEYWORDS file...
Please let me know what you think about those suggestions. If there is
interest, I may ask for other volunteers to set up one or more of those
suggestions.
Eric
Eric Lecoutre
UCL / Institut de Statistique
Voie du Roman Pays, 20
1348 Louvain-la-Neuve
Belgium
tel: (+32)(0)10473050
lecoutre at stat.ucl.ac.be
http://www.stat.ucl.ac.be/ISpersonnel/lecoutre
If the statistics are boring, then you've got the wrong numbers. -Edward
Tufte