[Rd] Suggestions for packages / help / index (long mail)

Eric Lecoutre lecoutre at stat.ucl.ac.be
Wed Nov 24 13:40:24 CET 2004


Hi R-users and developers,

This month may have seen one of the biggest threads ever seen on R-related 
mailing lists, the one about "GPL software" and "hidden costs" (as of 
today, the thread is still open - and active!).
Lots of mails in this thread are not really relevant to the original mail, 
sent by Philippe Grosjean.
Nevertheless, most of the mails are of interest, and one of my conclusions 
was that there is a real need for "help/index"-related work.
I have spent some time thinking about it. Like everybody, I ended up with: 
"this is not an easy problem at all" and "what we have *is* still very 
great". Indeed!
What you will find below is a sketch of thoughts/proposals. I tend to think 
some of those proposals are "low-cost" and could improve the life of R 
beginners.

First, I have to say I will put myself in the situation of a real 
beginner (say, a student taking his first classes):
A user who has practiced for some years will find it easier to crawl all 
the rich available material. His experience will help him easily find the 
package relevant to his problem and the right function; he has learned to 
use help.search() and so on. And he will wisely use R-help, following the 
guidelines.
On the contrary, a beginneR will have more and more difficulty entering the 
R world, as it is constantly growing (leading to the famous supposed 
"hidden costs"). Appropriating the poweR is not easy, especially if your 
daily task is specialized: you will have difficulty digging through all the 
material to find those nuggets that will help you (and thanks to the 
community, there are so many nuggets... it may be hard to choose between 
gold and platinum).

What we have for now is a document listing keywords. Advanced users will 
know those keywords are to be used by package maintainers, feeding the help 
system build chain.
This keyword database is very pertinent. Its content, which has been 
inherited in part from S, has been carefully worked out. And 
it works well (trying help.search("graphs") will provide you with very 
interesting stuff - provided you have some packages installed...). I think 
that this keyword list may have even more uses.
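
As a small illustration of how the existing keywords can already be used, 
the sketch below reads the KEYWORDS file and searches help pages by keyword 
instead of free text. It assumes a standard R installation where the file 
sits under R.home("doc"); "hplot" is one of the entries in that file, and 
the search is restricted to the graphics package only to keep it quick.

```r
# Locate the KEYWORDS file shipped with R (location assumed: standard install).
kw_file  <- file.path(R.home("doc"), "KEYWORDS")
kw_lines <- if (file.exists(kw_file)) readLines(kw_file, warn = FALSE) else character(0)
head(kw_lines)

# Search help pages by keyword rather than by free text.
# "hplot" is the KEYWORDS entry for high-level plots.
res <- help.search(keyword = "hplot", package = "graphics")
nrow(res$matches)   # number of matching help entries found in 'graphics'
```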

1. As the R community grows, it may be time to add some terms to this 
keyword list. Think about the SciViews bundle on which Philippe is working. 
Most packages in it are linked to GUI stuff. Wouldn't the keyword GUI be 
useful? It could be worth offering the community, for one month, the 
ability to suggest new entries (I am also thinking about econometrics stuff). 
Then, the R core team would decide whether candidates are eligible or not.

2. DESCRIPTION files for packages could have a new field, keywords, allowing 
the author to add keywords to the package (minimum one).
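
To make this concrete, here is a minimal sketch of how such a field could be 
read with the existing read.dcf() function. The "Keywords" field name and 
its comma-separated format are assumptions for illustration; no such field 
is defined today.

```r
# A DESCRIPTION-like record with a hypothetical "Keywords" field.
# The field name and its comma-separated layout are assumptions.
desc_text <- c(
  "Package: abind",
  "Keywords: Basics, manip, array"
)

con  <- textConnection(desc_text)
desc <- read.dcf(con)   # character matrix, one row per record
close(con)

# Split the comma-separated field into a clean character vector.
pkg_keywords <- trimws(strsplit(desc[1, "Keywords"], ",")[[1]])

# Enforce the "minimum one keyword" rule from the proposal.
stopifnot(length(pkg_keywords) >= 1)
print(pkg_keywords)
```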


Here is the kind of thing we could end up with:

package     keyword(s)
---------------------------------------------
abind       Basics, manip, array
accuracy    Statistics
acepack     Statistics, regression
adapt       Mathematics
ade4        multivariate
...


3. Package keywords could be used to propose "automatic" bundles and/or 
lists of packages (consider keywords as categories for that purpose). Thus, 
CRAN sites could have a listing of all packages, but also a listing of all 
packages related to Mathematics, to multivariate (statistics) and so on. 
And one could propose to install a whole bunch of packages at one time. 
Thus (and provided adequate keywords exist), the beginner 
interested in multivariate statistics would easily set up his R with 
an adequate set of starting packages. Same for econometrics, geostatistics, 
and any other field of application.
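
A rough sketch of the idea, using only the five example packages and 
keywords listed earlier in this mail. The helper name bundle_for() is 
hypothetical, and nothing is actually installed here.

```r
# Package keywords as listed in the example table of this mail.
pkg_keywords <- list(
  abind    = c("Basics", "manip", "array"),
  accuracy = "Statistics",
  acepack  = c("Statistics", "regression"),
  adapt    = "Mathematics",
  ade4     = "multivariate"
)

# Invert the mapping: one entry per keyword, listing the packages that carry it.
keywords   <- unlist(pkg_keywords, use.names = FALSE)
packages   <- rep(names(pkg_keywords), lengths(pkg_keywords))
by_keyword <- split(packages, keywords)

# Hypothetical helper: the "bundle" for a category is the set of
# packages tagged with that keyword (empty if the keyword is unknown).
bundle_for <- function(keyword) {
  pkgs <- by_keyword[[keyword]]
  if (is.null(pkgs)) character(0) else pkgs
}

print(bundle_for("Statistics"))
```

A front end could then pass the result to install.packages() to install the 
whole category at once.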

4. What would really be useful then (I think) is a sort of PACKAGES_INDEX 
that would come with R. Explanation: the index for one package would be its 
keywords (with a high weight) plus all its functions and their associated 
keywords (lower weights). When downloading and installing the 
newest R, there would be a flat text file containing that (not so very 
big). We could also add a function that refreshes this file.
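
One possible shape for such an index, as a sketch only. The numeric weights 
(3 and 1), the tab-separated layout, and the function-level keyword entries 
are all assumptions made for illustration; they are not read from real help 
pages.

```r
# Weights are assumptions; the proposal only says "high" and "lower".
w_package  <- 3
w_function <- 1

# Package-level keywords, taken from the example table in this mail.
pkg_keywords <- list(
  abind   = c("Basics", "manip", "array"),
  acepack = c("Statistics", "regression")
)

# Function-level keywords: hypothetical entries for illustration.
fun_keywords <- data.frame(
  package = c("abind", "acepack", "acepack"),
  fun     = c("abind", "ace", "avas"),
  keyword = c("array", "regression", "regression"),
  stringsAsFactors = FALSE
)

# One row per (package, term) occurrence, with its weight.
pkg_rows <- data.frame(
  package = rep(names(pkg_keywords), lengths(pkg_keywords)),
  term    = unlist(pkg_keywords, use.names = FALSE),
  weight  = w_package,
  stringsAsFactors = FALSE
)
fun_rows <- data.frame(
  package = rep(fun_keywords$package, 2),
  term    = c(fun_keywords$fun, fun_keywords$keyword),
  weight  = w_function,
  stringsAsFactors = FALSE
)

# Sum the weights so each (package, term) pair appears once.
index <- aggregate(weight ~ package + term,
                   data = rbind(pkg_rows, fun_rows), FUN = sum)

# Store the index as a flat text file and read it back.
index_file <- file.path(tempdir(), "PACKAGES_INDEX")
write.table(index, index_file, sep = "\t", quote = FALSE, row.names = FALSE)
reloaded <- read.delim(index_file, stringsAsFactors = FALSE)
print(reloaded)
```

Refreshing the file would amount to rebuilding `index` from current package 
metadata and writing it out again.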

5. Then, we could update "help.search" so that it begins by listing 
information on "installed packages" PLUS potentially suggests other packages 
available on CRAN.
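
A sketch of what such a search could look like. The function search_index() 
is hypothetical and not part of R, and the small index below is a stand-in 
with illustrative weights. The list of installed packages is passed in 
explicitly so the example gives the same result on any machine.

```r
# Tiny stand-in for the proposed index; weights are illustrative assumptions.
index <- data.frame(
  package = c("accuracy", "acepack", "acepack", "ade4"),
  term    = c("Statistics", "Statistics", "regression", "multivariate"),
  weight  = c(3, 3, 5, 3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: search the index and flag which hits are installed.
search_index <- function(pattern, index,
                         installed = rownames(installed.packages())) {
  hits <- index[grepl(pattern, index$term, ignore.case = TRUE), , drop = FALSE]
  hits <- hits[order(-hits$weight), , drop = FALSE]   # best matches first
  hits$installed <- hits$package %in% installed
  hits
}

# Pretend only "accuracy" is installed.
res <- search_index("statistics", index, installed = "accuracy")
print(res)
```

Rows with installed equal to FALSE are the candidates the search could 
suggest fetching from CRAN.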

6. The final point has already been discussed in the past. It is about misc 
packages and pieces of code. I propose the creation of 5 packages:
	- miscGraphics (keywords: misc, Graphics)
	- miscStatistics (keywords: misc, Statistics)
	- miscMathematics (keywords: misc, Mathematics)
	- miscBasics (keywords: misc, Basics)
	- miscProgramming (keywords: misc, Programming)
With what I proposed before, they would be accessible as a bunch by selecting 
packages for category "misc", and each would also be listed in its category 
("Graphics",...).
Each of those packages would have a maintainer, and a new mailing list (say 
R-misc) could be set up to talk about pieces of code that could enter one 
package or another. Yes, I volunteer to maintain one of those.



There is some work here for all 6 points, but not so much. What is great is 
that we already have most of the necessary stuff. And we only use the 
KEYWORDS file...
Please let me know what you think about those suggestions. If there is 
interest, I may ask for other volunteers to set up one or more of those 
suggestions.

Eric

Eric Lecoutre
UCL /  Institut de Statistique
Voie du Roman Pays, 20
1348 Louvain-la-Neuve
Belgium

tel: (+32)(0)10473050
lecoutre at stat.ucl.ac.be
http://www.stat.ucl.ac.be/ISpersonnel/lecoutre

If the statistics are boring, then you've got the wrong numbers. -Edward 
Tufte


