From maechler at stat.math.ethz.ch Tue Apr 1 11:48:02 1997
From: maechler at stat.math.ethz.ch (Martin Maechler)
Date: Tue, 1 Apr 97 11:48:02 +0200
Subject: "R-announce", "R-help", "R-devel" : 3 mailing lists for R
Message-ID: <9704010948.AA00412@>

Upon proposal by Robert Gentleman, and given the ``immediate'' release
of R 0.50 beta (instead of alpha), I have created three mailing lists
concerned with R. The second one, "R-help", replaces the current
"R-testers". For a while, "r-testers" will be kept as a synonym for
"r-help".

The 3 mailing lists are

 1) R-announce : Only announcements of new versions / important patches
 2) R-help     : Questions / Answers about ("released versions of") R
 3) R-devel    : The alpha-/ pre-testers list of "R-hackers".
                 People who get the newest release, try out patches,....
                 (maybe not much more than the R 0.50 prerelease testers)

where 1) is gatewayed to 2), i.e. everything sent to "R-announce" is
forwarded to "R-help".

The intent is that "R-announce" would have less than one message per
day, typically only a few messages per month.

Note that I am sending this to "R-announce", and everyone who has been
on the "R-testers" list is getting this e-mail, since everything that
is sent to "r-announce" is automatically forwarded to "r-help".

-------
AGAIN: For most of you nothing changes, you'll get all the announcements
       and everything from "R-help".

However, if you are only interested in important announcements and new
releases, you should
        - unsubscribe from "R-help"
        - subscribe   to   "R-announce"
 ((by sending
        unsubscribe  to  R-help-request at stat.math.ethz.ch
        subscribe    to  R-announce-request at stat.math.ethz.ch
   in the message "body", not as "subject" ..))

PS.  Yes, this is the first post to any of these lists, so there may be
     problems, though I hope not.
PPS. In spite of the date, this is no "April fool's" joke....

Martin Maechler <><
Seminar fuer Statistik, SOL F5
ETH (Federal Inst.
Technology) 8092 Zurich SWITZERLAND
phone: x-41-1-632-3408    fax: ...-1086
http://www.stat.math.ethz.ch/~maechler/

From Kurt.Hornik at ci.tuwien.ac.at Wed Apr 23 08:40:28 1997
From: Kurt.Hornik at ci.tuwien.ac.at (Kurt Hornik)
Date: Wed, 23 Apr 1997 08:40:28 +0200
Subject: ANNOUNCE: CRAN
Message-ID: <199704230640.IAA03837@aragorn.ci.tuwien.ac.at>

This is the first announcement of the

        Comprehensive R Archive Network (CRAN)

CRAN is a collection of sites which carry identical material, consisting
of the R&R R distribution(s), the contributed extensions, documentation
for R, and binaries.

The CRAN master site can be found at the URL

        ftp://ftp.ci.tuwien.ac.at/pub/R         (Austria)

and is currently being mirrored daily at

        http://lib.stat.cmu.edu/R/CRAN          (U.S.A.)
        ftp://franz.stat.wisc.edu/pub/R         (U.S.A.)
        ftp://ftp.stat.math.ethz.ch/R-CRAN      (Switzerland)

This list should grow soon.  If you want to become an official CRAN
mirror, please send me a note (Kurt.Hornik at ci.tuwien.ac.at).

Please use the CRAN site closest to you to reduce network load.

The structure of the CRAN tree is as follows.

        src/base        # Source distribution
        src/contrib     # Source for extensions
        doc/            # Documentation
        bin/            # Binaries

`src/base' contains the official R source distribution as provided by
Ross Ihaka and Robert Gentleman.

`src/contrib' contains code for extension packages.  Currently, there
are acepack, bootstrap, ctest, date, e1071, fracdiff, gee, jpn, oz,
snns, splines, and survival4.  Look at the INDEX file in this directory
for more specific information.  More packages are expected in the near
future.

`bin' is for prebuilt R binaries (the base distribution and extensions),
grouped according to platforms.  Currently, there are only experimental
packages for Debian GNU/Linux.  I hope that `.tar.gz' files with
contents relative to an installation tree (e.g. `bin/', `lib/R/', and
`man/man1/R.1') can be made available soon for all major supported Unix
platforms.
`doc' is for additional documentation and information on R.

In the short run, the process of `submitting' to CRAN is very simple:
upload to ftp://ftp.ci.tuwien.ac.at/incoming and drop me a note
(Kurt.Hornik at ci.tuwien.ac.at).  Please indicate the copyright
situation (GPL, ...) in your submission.

In the long run, there will be a form to fill in, and some requirement
of authentication (PGPish, ...), and submission could maybe be done via
WWW.  I am open to suggestions here.

*****************************************************************************
* Kurt Hornik                  *                                            *
* Dept of Statistics TU Wien   * tel: +43 (1) 58801-4542                    *
* Wiedner Hauptstr 8-10/1071   * fax: +43 (1) 504-1498                      *
* A-1040 Wien                  * email: Kurt.Hornik at ci.tuwien.ac.at      *
* Austria                      * WWW: http://www.ci.tuwien.ac.at/~hornik    *
*****************************************************************************
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
r-announce mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-announce-request at stat.math.ethz.ch
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

From ihaka at stat.auckland.ac.nz Wed Apr 23 11:25:23 1997
From: ihaka at stat.auckland.ac.nz (Ross Ihaka)
Date: Wed, 23 Apr 1997 21:25:23 +1200 (NZST)
Subject: Version 0.49 Released
Message-ID: <199704230925.VAA24453@stat1.stat.auckland.ac.nz>

The newest version of R for Unix (version 0.49) is now available
(or soon will be) from the following sites.

    NORTH AMERICA:
        http://lib.stat.cmu.edu/R/Alpha

    EUROPE:
        ftp://ftp.stat.math.ethz.ch/R/
        ftp://statlab.uni-heidelberg.de/pub/mirrors/auckland/R/

    JAPAN:
        ftp://ftp.u-aizu.ac.jp/pub/lang/R/

    NEW ZEALAND:
        ftp://stat.auckland.ac.nz/pub/R/

Please obtain a copy from a site close to you.  Note that New Zealand is
not close to anywhere other than itself :-).
This version adds considerable functionality and (we hope) stability to
R.  Most notably, the object system is now very close to that of S and
we now have a fairly full implementation of complex arithmetic.

The jump in version number (from 0.16) reflects the fact that we feel
that this version of R represents quite a jump toward what we hope will
be in our eventual 1.0 release.  However the version is numbered 0.49
rather than the 0.50 we touted because it falls a little short of what
we really want for 0.50.

Immediate development of R will focus on creating a coherent way of
loading and unloading "libraries" and creating a good framework for
documentation (can you say "SGML" - I knew you could).

A (partial) list of changes from 0.16.1 follows.  A list of known
problems is kept in the file "TASKS" in the distribution and a list of
problems we think are solved is in "TASKS.OLD".

        R + R

-------------------------------------------------------------------

                CHANGES IN R VERSION 0.49 ALPHA

NEW FEATURES

    o   The ``object'' system has been changed substantially.  The
        behavior of both "UseMethod" and "NextMethod" should match
        that in S.  Group methods for "Math", "Ops" and "Summary" are
        available.

    o   Complex arithmetic is now implemented.  Many mathematical
        functions are now defined for complex arguments (e.g. sqrt,
        exp, log, sin, cos, tan, asin, acos, atan).  There is no
        complex gamma function or log gamma function yet.

        The summary functions "mean", "sum", "prod", "cumsum" and
        "cumprod" work correctly when (some of) their arguments are
        complex.

        Other functions such as "solve" are not "complex aware" yet,
        but do print warning messages about coercion of complex values
        to real by the dropping of imaginary parts.

        S and R do not return identical results in all cases:

                S> atan(tan(1i))
                [1] 0-1i
                R> atan(tan(1i))
                [1] 0+1i

        [ Is this just a difference on the branch cut boundary? ]

    o   The full set of S graphics symbols is now available with pch=0:18.
        In addition, there is a special set of R plotting symbols which
        can be obtained with pch=19:25.

                pch=19  solid circle
                pch=20  bullet
                pch=21  circle
                pch=22  square
                pch=23  diamond
                pch=24  triangle point-up
                pch=25  triangle point-down

        The symbols 21:25 can be colored and filled with different
        colors.  For example, the expression

                points(x, y, pch=21, col="red", bg="yellow")

        will plot the points using a symbol consisting of a red circle
        with a yellow interior.

    o   There is a new "family" argument to the postscript graphics
        driver which can be set to any of "AvantGarde", "Bookman",
        "Courier", "Helvetica", "Helvetica-Narrow",
        "NewCenturySchoolbook", "Palatino" or "Times".  In addition,
        setting font=5 will cause the "Symbol" family to be used.
        This is still experimental and it is hard to see it being
        useful without some sort of math capability.
        [ Such a facility is "on the drawing board". ]

    o   The graphics parameter "las" is now implemented and can be
        used to rotate axis labels.  E.g.  plot(1:10, las=1).

    o   The hyperbolic and inverse hyperbolic functions cosh, sinh,
        tanh, acosh, asinh and atanh are now implemented for both
        real and complex arguments.  (Q: are the underlying functions
        available on all platforms, or do we need compatibility fixes?)

    o   "log" has changed so that it will accept an optional "base"
        argument.  "log2" and "log10" are implemented this way.

    o   "atan" can now be invoked either as atan(x) or atan(x,y).

    o   The behavior of "fft" has been modified to match that of S
        (i.e. it returns a complex value).  There is also a function
        "mvfft" which performs a "vector transform" when passed a
        matrix (i.e. it applies the fft to each column, rather than
        doing a 2d spatial transform).

    o   A new function "polyroot" can be used to find the roots of
        polynomials with (real or) complex coefficients.

    o   Vectors and lists are now "stretchy".  This means that the
        following is legal

                x <- 1:10
                x[20] <- 12

        [ Note that there is a bug in S.
        When you try this kind of extension - the "dim" and "dimnames"
        attributes are not dropped, leading to "invisible" elements in
        the result. ]

    o   Symbolic differentiation is now available using the functions
        "D" and "deriv".  The results are slightly different in
        appearance from those of S (which tends to put in a few too
        many parentheses), but should provide identical semantics.
        To see the nature of the difference, try the expression

                D(expression(tan(x)/x^2),"x")

        in both systems.

        These functions are implemented as internal code in R (what's
        the point in having a nice little underlying lisp if you don't
        use it for obvious list processing applications?).

    o   A new function "grep" has been implemented.  It performs
        regular expression matching based on POSIX 1003.2.  The
        function uses the "regex" library written by Henry Spencer
        (the same one that Perl uses).  Grep is now used to provide a
        pattern matching facility in "objects" and "ls".

        In addition, there are functions "sub" and "gsub" which operate
        the same way as those in "nawk".  Note that "\" must be escaped
        to get it into a string, so if you want a literal "\" you must
        type "\\\\" :-(.

    o   A new "methods" function written by Martin Maechler replaces
        the older, less sophisticated one.  This has also been
        converted to use the new "grep" function.

    o   A new version of model.frame from Thomas Lumley is included.

    o   Factors and ordered factors are now "objects" with class
        attributes which match those in S.  This change is primarily
        so that applications written for S will work in R.  The
        underlying implementation for factors and ordered factors
        still uses special underlying types.

    o   R will now do conditioning plots as described in the S "Models"
        book.  Some thought is going into "doing trellis".

    o   Thomas Lumley's "require" and "provide" functions for library
        organization have been added.

    o   There is a new graphical parameter "gamma" which is designed
        to let users apply a ``gamma'' correction for their graphics
        displays.
        Most monitors produce a color intensity which is related to
        voltage by the equation

                intensity = voltage ^ gamma

        with gamma about 2.5 for most PC monitors.  A typical symptom
        of this non-linearity is that a colorwheel produced by

                piechart(rep(1,48), col=48)

        shows a marked over-representation of the red, green and blue
        primaries.  If this is the case try

                par(gamma=1/2.5)

        and redraw the color wheel.  Vary gamma till you have a "nice"
        spectrum.  This is experimental and feedback would be welcome.
        (If this is useful we will do the same for postscript).

BUG FIXES & ENHANCEMENTS

    o   The internals of the postscript device drivers have been
        rewritten in preparation for "doing equations".  The Adobe
        font metrics are no longer processed during the build process.
        The original files are now processed directly by the driver.
        The postscript emitted by the drivers should conform to the
        Adobe 3.0 standard.

    o   Typos and logic errors in "crossprod" pointed out by Arne Kovac
        have been fixed.

    o   "round" now works using IEEE rounding on platforms which
        support it.  E.g.

                > round(.5 + -3:7)
                [1] -2 -2  0  0  2  2  4  4  6  6  8

    o   A change to the PostScript graphics device driver should stop
        plots from rotating unexpectedly when viewed with ghostview
        and other postscript viewers.

    o   Patches from Kurt Hornik for "structure" and "help" have been
        applied.

    o   "names" applied to 1-d arrays now does the "right thing", i.e.
        it returns the first component of the "dimnames" attribute
        rather than NULL.  Subsetting a 1-d array as a vector will
        produce a "names" attribute on the result.

    o   "atan" will now accept either 1 or 2 arguments ("atan2" still
        exists).

    o   "nchar" and "format" preserve "dim" and "dimnames" attributes
        where possible.

    o   There were problems with arguments in "switch".  These have
        been fixed.  There appears to be a bug in S with the expression

                z <- switch(1, a=, b=10, 20)
                mode(z)

        They appear to return a "missing argument".  We die with an
        error message.  What is the right thing to do here?
    o   Problems deparsing arguments of the form "a"= and a=, have
        been fixed.

    o   The default graphics line width is now 1 rather than 0.

    o   "as.is" problem in "read.table" pointed out by Peter Dalgaard
        fixed.

    o   paste(list(character(0),""), 1:2) no longer causes a segfault.
        "as.character" applied to a list now returns the vector
        obtained by deparsing each element of the list.

    o   "attach" now attaches objects at pos=2 when directed.

    o   Memory usage was measured in 1000 byte chunks with the -v flag.
        This has been changed to 1024 byte chunks.

    o   Dependency on the "OSF sprintf bug" has been removed.

From ihaka at stat.auckland.ac.nz Wed Apr 23 11:36:33 1997
From: ihaka at stat.auckland.ac.nz (Ross Ihaka)
Date: Wed, 23 Apr 1997 21:36:33 +1200 (NZST)
Subject: Version 0.49 Addendum
Message-ID: <199704230936.VAA24493@stat1.stat.auckland.ac.nz>

I should mention that this version of R has been verified to configure
and compile on the following platforms:

        PLATFORM                        COMPILER

        alpha-dec-osf3.2                cc
        hppa1.1-hp-hpux9.07             gcc or c89
        i386-unknown-freebsd2.1.5       gcc
        i686-unknown-linux              gcc
        mips-sgi-irix6.2                cc
        sparc-sun-solaris2.5.1          gcc
        sparc-sun-sunos4.1.4            gcc

We'd be interested in any other successes.

        Ross

From Kurt.Hornik at ci.tuwien.ac.at Wed Apr 23 17:13:22 1997
From: Kurt.Hornik at ci.tuwien.ac.at (Kurt Hornik)
Date: Wed, 23 Apr 1997 17:13:22 +0200
Subject: R-FAQ v0.1-0
Message-ID: <199704231513.RAA14629@aragorn.ci.tuwien.ac.at>

A much updated R FAQ is now available at the URL

        http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html

The plain text version is appended below.

-k

*****************************************************************************
* Kurt Hornik                  *                                            *
* Dept of Statistics TU Wien   * tel: +43 (1) 58801-4542                    *
* Wiedner Hauptstr 8-10/1071   * fax: +43 (1) 504-1498                      *
* A-1040 Wien                  * email: Kurt.Hornik at ci.tuwien.ac.at      *
* Austria                      * WWW: http://www.ci.tuwien.ac.at/~hornik    *
*****************************************************************************

*****************************************************************************

  R FAQ
  Kurt Hornik
  v0.1-0, 1997/04/23

  This document contains answers to some of the most frequently asked
  questions about R.  Feedback is welcome.
  ______________________________________________________________________

  Table of Contents:

  1.      Introduction
  1.1     Legalese
  1.2     Obtaining this Document
  1.3     Notation
  1.4     Feedback

  2.      R Basics
  2.1     What Is R?
  2.2     What Machines Does R Run on?
  2.3     What Is the Current Version of R?
  2.4     How Can R Be Obtained?
  2.5     How Can R Be Installed?
  2.5.1   How Can R Be Installed (Unix)
  2.5.2   How Can R Be Installed (Windows)
  2.5.3   How Can R Be Installed (Macintosh)
  2.6     Are there Unix Binaries for R?
  2.7     Which Documentation Exists for R?
  2.8     Which Mailing Lists Exist for R?
  2.9     What is CRAN?

  3.      R and S
  3.1     What Is S?
  3.2     What Is S-PLUS?
  3.3     What Are the Differences between R and S?

  4.      R Add-On Packages
  4.1     Which Add-on Packages Exist for R?
  4.2     How Can Add-on Packages Be Installed?
  4.3     How Can Add-on Packages Be Used?
  4.4     How Can I Contribute to R?

  5.      R and Emacs
  5.1     Is there Emacs Support for R?
  5.2     Should I Run R from Within Emacs?

  6.      R Miscellanea
  6.1     How Can I Read a Large Data Set into R?
  6.2     Why Can't R Source a `Correct' File?
  6.3     How Can I Set Components of a List to NULL?
  6.4     How Can I Save My Workspace?
  6.5     How Can I Clean Up My Workspace?

  7.      Acknowledgments
  ______________________________________________________________________

  1.  Introduction

  This document contains answers to some of the most frequently asked
  questions about R.

  1.1.  Legalese

  This document is free software; you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation; either version 2, or (at your option)
  any later version.

  This program is distributed in the hope that it will be useful, but
  WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
  General Public License for more details.

  If you do not have a copy of the GNU General Public License, write to
  the Free Software Foundation, 59 Temple Place - Suite 330, Boston, MA
  02111-1307, USA.

  1.2.  Obtaining this Document

  The latest version of this document is always available from
  http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html.

  From there, you can also obtain versions converted to plain ASCII
  text, GNU info, DVI, and PostScript, as well as the SGML source used
  for creating all these formats using the SGML-Tools (formerly
  Linuxdoc-SGML) system.  (Note that right now cross-references don't
  work in the translation to plain ASCII.)

  1.3.  Notation

  Everything should be pretty standard.  `R>' is used for the R prompt,
  and a `$' for the shell prompt (where applicable).

  1.4.  Feedback

  Feedback is of course most welcome.

  In particular, note that I do not have access to Windows or Mac
  systems.  If you have information on these systems that you think
  should be added to this document, please let me know.

  2.  R Basics

  2.1.  What Is R?

  R is a system for statistical computation and graphics.
  It consists of a language plus a run-time environment with graphics,
  a debugger, access to certain system functions, and the ability to
  run programs stored in script files.

  The design of R has been heavily influenced by two existing
  languages: Becker, Chambers & Wilks' S (see question ``What is S?'')
  and Sussman's Scheme.  Whereas the resulting language is very similar
  in appearance to S, the underlying implementation and semantics are
  derived from Scheme.  See question ``What Are the Differences between
  R and S?'' for a discussion of the differences between R and S.

  R is being developed by Ross Ihaka and Robert Gentleman, who are
  Senior Lecturers at the Department of Statistics of the University of
  Auckland in Auckland, New Zealand.

  R is free software distributed under a GNU-style copyleft.

  2.2.  What Machines Does R Run on?

  R is being developed for the Unix, Windows and Mac platforms.

  R will configure and build under a number of common Unix platforms
  including dec-alpha-osf, freebsd, hpux, linux-elf, sgi-irix, solaris,
  and sunos.

  If you know about other platforms, please drop me a note.

  2.3.  What Is the Current Version of R?

  The current Unix version is 0.49, the previous one was 0.16.1.  The
  jump in the version number reflects the fact that this version
  represents quite a jump towards what is hoped to be in the eventual
  1.0 release.  (However, it fell a little short of what was wanted for
  0.50.)  Version 0.49 added group methods and complex numbers, and
  hence more or less provides a full implementation of S as described
  in ``The New S Language''.

  The versions for Windows and Mac are pre-alpha.  With some good luck,
  the Windows version will soon catch up with the Unix version.

  2.4.  How Can R Be Obtained?

  Sources, binaries and documentation for R can be obtained via CRAN,
  the ``Comprehensive R Archive Network'' (see question ``What is
  CRAN?'').

  2.5.  How Can R Be Installed?

  2.5.1.  How Can R Be Installed (Unix)
  If binaries are available for your platform (see question ``Are there
  Unix Binaries for R?''), you can use these, following the
  instructions that come with them.

  Otherwise, you can compile and install R yourself, which can be done
  very easily under a number of common Unix platforms (see question
  ``What Machines Does R Run on?'').  The file INSTALL that comes with
  the R distribution contains instructions.

  Choose a place to install the R tree (R is not just a binary, but has
  additional data sets, help files, font metrics etc).  Let's call this
  place RHOME (given appropriate permissions, a natural choice would be
  `/usr/local/lib/R').  Untar the source code, and issue the following
  commands (at the shell prompt):

       $ ./configure
       $ make
       $ make install-help

  You can also build a LaTeX version of the manual entries with

       $ make install-latex

  and an HTML version of the manual with

       $ make install-html

  If these commands execute successfully, the R binary will be copied
  to the `$RHOME/bin' directory.  In addition, a shell script front-end
  called `R' will be created and copied to the same directory.  You can
  copy this script to a place where users can invoke it, for example to
  `/usr/local/bin'.  You could also copy the man page `R.1' to a place
  where your man reader finds it, such as `/usr/local/man/man1'.

  2.5.2.  How Can R Be Installed (Windows)

  Get the file `Rexe.zip' from the `bin/ms-windows' directory of a CRAN
  site.  This archive contains a binary Windows 3.xx distribution for R
  and installation instructions.

  Robert Gentleman has recently made an updated pre-alpha Windows
  executable file available for ftp at
  ftp://stat.auckland.ac.nz/pub/research/rgentlem/rbeta.zip.  This
  binary should be more compatible with Windows 95 than the other (he
  does not know about 3.1).  You still need all the other extra files
  from the previous Windows distribution; this file is only an
  executable.

  2.5.3.  How Can R Be Installed (Macintosh)
  CRAN sites have a directory `bin/macintosh' which contains
  `R.sea.hqx', a binhexed self-extracting archive, and installation
  instructions in `README.MACINTOSH'.

  2.6.  Are there Unix Binaries for R?

  Experimental `.deb' packages for installation under Debian GNU/Linux
  can be found in `bin/ix86-linux'.

  No other binary distributions for Unix systems have thus far been
  made publicly available.

  2.7.  Which Documentation Exists for R?

  Currently, there is no R manual.

  Online documentation for most of the functions and variables in R
  exists, and can be printed on-screen by typing help(name) (or ?name)
  at the R prompt, where name is the name of the R object help is
  sought for.  (In the case of unary and binary operators and
  control-flow special forms, the name may need to be quoted.)

  This documentation can also be made available as HTML, and as
  hardcopy via LaTeX, see question ``How Can R Be Installed?''.  An
  up-to-date HTML version is always available for web browsing at

       http://www.stat.math.ethz.ch/R-manual

  In the absence of a systematic introduction to R, one can mostly get
  along with introductions to S or S-PLUS, such as ``Notes on S-PLUS: A
  Programming Environment for Data Analysis and Graphics'' by Bill
  Venables and David Smith.  This document talks mostly about plain S
  features, and does not concentrate on features specific to S-PLUS,
  and is available from the Statlib S repository at
  http://lib.stat.cmu.edu/S/SplusNotes/ (LaTeX source and PostScript).
  An introduction to R based on it will soon be available.

  Last, but not least, Ross' and Robert's experience in designing and
  implementing R is described in:

       @Article{,
         author  = {Ross Ihaka and Robert Gentleman},
         title   = {R: A Language for Data Analysis and Graphics},
         journal = {Journal of Computational and Graphical Statistics},
         year    = 1996,
         volume  = 5,
         number  = 3,
         pages   = {299--314}
       }

  This is also the reference for R to use in publications.
  2.8.  Which Mailing Lists Exist for R?

  Thanks to Martin Maechler, there are three mailing lists devoted to R.

  r-announce
     This list is for announcements about the development of R and the
     availability of new code.

  r-devel
     This list is for discussions about the future of R and pre-testing
     of new versions.  It is meant for those who maintain an active
     position in the development of R.

  r-help
     The `main' R mailing list, for announcements about the development
     of R and the availability of new code, questions and answers about
     problems and solutions using R, enhancements and patches to the
     source code and documentation of R, comparison and compatibility
     with S and S-plus, and for the posting of nice examples and
     benchmarks.

  To send a message to everyone on the r-help mailing list, send email
  to r-help at stat.math.ethz.ch.

  To subscribe (or unsubscribe) to this list send subscribe (or
  unsubscribe) in the BODY of the message (not in the subject!) to
  r-help-request at stat.math.ethz.ch.  Information about the list can
  be obtained by sending an email with info as its contents to
  r-help-request at stat.math.ethz.ch.

  Subscription and posting to the other lists is done analogously, with
  `r-help' replaced by `r-announce' and `r-devel', respectively.

  Note that the r-announce list is gatewayed into r-help, so you don't
  need to subscribe to both of them.

  It is recommended that you send mail to r-help rather than only to
  the R developers (who are also subscribed to the list, of course).
  This may save them precious time they can use for constantly
  improving R, and will typically also result in much quicker feedback
  for yourself.

  Of course, in the case of bug reports it would be very helpful to
  have code which reliably reproduces the problem.

  2.9.  What is CRAN?

  The ``Comprehensive R Archive Network'' (CRAN) is a collection of
  sites which carry identical material, consisting of the R
  distribution(s), the contributed extensions, documentation for R, and
  binaries.
  The CRAN master site can be found at the URL

       ftp://ftp.ci.tuwien.ac.at/pub/R/

  and is currently being mirrored daily at

       http://lib.stat.cmu.edu/R/CRAN/
       ftp://franz.stat.wisc.edu/pub/R/
       ftp://ftp.stat.math.ethz.ch/R-CRAN/

  Please use the CRAN site closest to you to reduce network load.

  The structure of the CRAN tree is as follows.

  `src/base'
     contains the official R distribution as provided by Ross Ihaka and
     Robert Gentleman.

  `src/contrib'
     contains code for extension packages.

  `doc'
     is for additional documentation and information on R.

  `bin'
     is for prebuilt R binaries (the base distribution and extensions),
     grouped according to platforms.  Currently, there are only
     experimental packages for Debian GNU/Linux.  I hope that `.tar.gz'
     files with contents relative to an installation tree (e.g. `bin',
     `lib/R/', and `man/man1/R.1') can be made available soon for all
     major supported Unix platforms.

  The process of ``submitting'' to CRAN currently is very simple:
  upload to ftp://ftp.ci.tuwien.ac.at/incoming and send email to Kurt
  Hornik.  Please indicate the copyright situation (GPL, ...) in your
  submission.

  3.  R and S

  3.1.  What Is S?

  S is a very high level language and an environment for data analysis
  and graphics.  S was written by Richard A. Becker, John M. Chambers,
  and Allan R. Wilks of AT&T Bell Laboratories Statistics Research
  Department.

  The primary references for S are two books by the creators of S.

  o  Richard A. Becker, John M. Chambers and Allan R. Wilks (1988),
     ``The New S Language,'' Chapman & Hall, London.

     This book is often called the ``Blue Book''.

  o  John M. Chambers and Trevor J. Hastie (1992), ``Statistical Models
     in S,'' Chapman & Hall, London.

     This is also called the ``White Book''.

  There is a huge amount of user-contributed code for S, available at
  the S Repository at CMU.

  See the ``Frequently Asked Questions about S''
  (http://lib.stat.cmu.edu/S/faq) for further information about S.

  3.2.  What Is S-PLUS?
  S-PLUS is a value-added version of S sold by Statistical Sciences,
  Inc. (now a division of Mathsoft, Inc.)  S is a subset of S-PLUS, and
  hence anything which may be done in S may be done in S-PLUS.  In
  addition S-PLUS has extended functionality in a wide variety of
  areas, including robust regression, modern nonparametric regression,
  time series, survival analysis, multivariate analysis, classical
  statistical tests, quality control, and graphics drivers.  Add-on
  modules add additional capabilities for wavelet analysis, spatial
  statistics, and design of experiments.

  See the MathSoft S-PLUS page (http://www.mathsoft.com/splus.html) for
  further information.

  3.3.  What Are the Differences between R and S?

  Whereas the developers of R have tried to stick to the S language as
  defined in ``The New S Language'' (Blue Book, see question ``What is
  S?''), they have adopted the evaluation model of Scheme.

  This difference becomes manifest when free variables occur in a
  function.  Free variables are those which are neither formal
  parameters (occurring in the argument list of the function) nor local
  variables (created by assigning to them in the body of the function).
  Whereas S (like C) by default uses static scoping, R (like Scheme)
  has adopted lexical scoping.  This means the values of free variables
  are determined by a set of global variables in S, but in R by the
  bindings that were in effect at the time the function was created.

  Consider the following function:

       cube <- function(n) {
         sq <- function() n * n
         n * sq()
       }

  Under S, sq() does not ``know'' about the variable n unless it is
  defined globally:

       S> cube(2)
       Error in sq():  Object "n" not found
       Dumped
       S> n <- 3
       S> cube(2)
       [1] 18

  In R, the ``environment'' created when cube() was invoked is also
  looked in:

       R> cube(2)
       [1] 8

  Lexical scoping allows using function closures and maintaining local
  state.
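
  As a rough illustration of closures and local state, here is a
  minimal sketch.  The names make_counter and count are hypothetical
  (they are not part of R or of its demos), and the `<<-' operator is
  assumed to behave as it does in a current R.

```r
# Hypothetical helper: returns a function that remembers how often it ran.
make_counter <- function(start = 0) {
  count <- start            # local to this particular call of make_counter()
  function() {
    count <<- count + 1     # updates the enclosing binding, not a global one
    count
  }
}

a <- make_counter()
b <- make_counter(10)
a()   # 1
a()   # 2
b()   # 11; b keeps its own count, independent of a
```

  Each call to make_counter() creates a fresh environment, so the two
  counters do not interfere with each other, and no global variable
  named count is needed.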
  A simple example (taken from Abelson and Sussman) can be found in the
  `demos/language' subdirectory of the R distribution.  Further
  information is provided in the standard R reference ``R: A Language
  for Data Analysis and Graphics'' (see question ``Which Documentation
  Exists for R?'') and a paper on ``Lexical Scope and Statistical
  Computing'' by Robert Gentleman and Ross Ihaka which can be obtained
  from the `doc/misc' directory of a CRAN site.

  Lexical scoping also implies a further major difference.  Whereas S
  stores all objects as separate files in a directory somewhere
  (usually `.Data' under the current directory), R does not.  All
  objects in R are stored internally.  When R is started up it grabs a
  very large piece of memory and uses it to store the objects.  R
  performs its own memory management of this piece of memory.  Having
  everything in memory is necessary because it is not really possible
  to externally maintain all relevant ``environments'' of symbol/value
  pairs.  This difference also seems to make R much faster than S.

  The down side is that if R crashes you will lose all the work for the
  current session.  Saving and restoring the memory ``images'' (the
  functions and data stored in R's internal memory at any time) can be
  a bit slow, especially if they are big.  In S this does not happen,
  because everything is saved in disk files and if you crash nothing is
  likely to happen to them.  R is still in an alpha stage, and does
  crash from time to time.  Hence, for important work you should
  consider saving often, see question ``How Can I Save My Workspace?''
  (other possibilities are logging your sessions, or having your R
  commands stored in text files which can be read in using source()).
  (Note that if you run R from within Emacs (see question ``R and
  Emacs''), you can save the contents of the interaction buffer to a
  file and conveniently manipulate it using S-transcript-mode, as well
  as save source copies of all functions and data used.)
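
  The advice about saving often and keeping work in text files can be
  sketched roughly as follows.  This assumes the dump(), source(),
  save() and load() functions of a current R; whether all of them
  behave this way in 0.49 has not been checked here.  The objects x and
  square and the temporary file names are made up for the illustration.

```r
# Objects standing in for a session's work (illustrative values only).
x <- c(1.5, 2.5, 4)
square <- function(v) v * v

# Text backup: dump() writes R source code that source() can read back.
script_file <- tempfile(fileext = ".R")
dump(c("x", "square"), file = script_file)

# Binary backup: save() writes an image of the named objects for load().
image_file <- tempfile(fileext = ".RData")
save(x, square, file = image_file)

# Simulate losing the session, then restore from the text backup.
rm(x, square)
source(script_file)
restored_from_text <- square(x)

# Simulate another loss, then restore from the binary image.
rm(x, square)
load(image_file)
restored_from_image <- square(x)
```

  The text backup has the advantage of being readable and editable by
  hand, while the binary image is usually the more convenient way to
  restore many objects at once.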
Apart from lexical scoping and its implications, R follows the S language definition in the Blue Book as much as possible, and hence really is an ``implementation'' of S. There are some intentional differences where the behavior of S is considered ``not clean''. In general, the rationale is that R should help you detect programming errors, while at the same time being as compatible as possible with S.

Some known differences are the following.

  o In R, if x is a list, then x[sub] <- NULL and x[[sub]] <- NULL remove the specified elements from x. The first of these is incompatible with S, where it is a no-op.

  o In S, the functions named .First and .Last in the `.Data' directory can be used for customizing, as they are executed at the very beginning and end of a session, respectively. R looks for files called `.Rprofile' in the user's home directory and the current directory, and sources these. (It also loads a saved image from `.RData' in case there is one.) If a .First function exists, it is then executed. The .Last mechanism is not supported yet.

  o Attaching library sections works differently. In S, library(name) adds the data directory for the library section name to the search list. If a function object named `.First.lib' exists in the directory, it is executed; this is typically used to dynamically load compiled code required by the functions in the section. In R, library(name) currently simply sources the file $RHOME/library/name, and compiled code can be loaded by calling library.dynam() in this file. The .First.lib mechanism is not really supported. (Note that a library file is only loaded once, so that any code in the library that is not in a function is executed the first time the library is loaded.)

  o R does not try as hard as S to preserve dimnames attributes (examples are apply, rbind, and cbind, but also arithmetic ops).

  o R presently does not support IEEE Inf and NaN.
  o In R, attach currently only works for lists and data frames (not for directories).

  o Categories do not exist in R, and never will as they are deprecated now in S. Use factors instead.

  o In R, For() loops are not necessary and hence not supported.

  o In R, assign() uses the argument envir= rather than where= as in S.

  o The random number generators are different, and the seeds have different length.

  o The glm family objects are implemented differently in R and S. The same functionality is available but the components have different names.

  o terms objects are stored differently. In S a terms object is an expression with attributes, in R it is a formula with attributes. The attributes have the same names but are mostly stored differently. The major difference in functionality is that a terms object is subscriptable in S but not in R. If you can't imagine why this would matter then you don't need to know.

There are also differences which are not intentional, and result from missing or incorrect code in R. The developers would appreciate hearing about any deficiencies you may find (in a written report fully documenting the difference as you see it). Of course, it would be useful if you were to implement the change yourself and make sure it works.

4.

4.1.

The R distribution comes with the following extra libraries:

eda
     Exploratory Data Analysis. Currently only contains functions for robust line fitting, and median polish and smoothing.

mva
     Multivariate Analysis. Currently contains code for principal components (prcomp), canonical correlations (cancor), hierarchical clustering (hclust), and metric multidimensional scaling (cmdscale). More functions for clustering and scaling, biplots, profile and star plots, and code for ``real'' discriminant analysis will be added soon.

The following packages are available from the CRAN `src/contrib' area.
acepack
     ace (Alternating Conditional Expectations) and avas (Additivity and VAriance Stabilization for regression) for selecting regression transformations.

bootstrap
     Software (bootstrap, cross-validation, jackknife), data and errata for the book ``An Introduction to the Bootstrap'' by B. Efron and R. Tibshirani, 1993, Chapman and Hall.

ctest
     A library of classical tests, including the Bartlett, Fisher, Kruskal-Wallis, Kolmogorov-Smirnov, and Wilcoxon tests.

date
     Functions for dealing with dates. The most useful of them accepts a vector of input dates in any of the forms 8/30/53, 30Aug53, 30 August 1953, ..., August 30 53, or any mixture of these.

e1071
     Miscellaneous functions used at the Department of Statistics at TU Wien (E1071).

fracdiff
     Maximum likelihood estimation of the parameters of a fractionally differenced ARIMA(p,d,q) model (Haslett and Raftery, Applied Statistics, 1989).

gee
     An implementation of the Liang/Zeger generalized estimating equation approach to GLMs for dependent data.

jpn
     A function to plot Japan's coast-line and prefecture boundaries.

oz
     Functions for plotting Australia's coastline and state boundaries.

snns
     An R interface to the Stuttgart Neural Networks Simulator (SNNS).

splines
     Regression spline functions.

survival4
     Functions for survival analysis (requires splines).

See CRAN `src/contrib/INDEX' for more information.

Paul Gilbert has written a multivariate time series library for S called time.series that is mostly converted to run in R. He will make this port generally available when complex numbers are implemented (see question ``What is the current version of R?''). According to Paul, the PADI interface from the Bank of Canada also works with minor changes. PADI can be used to access Fame time series data bases and potentially other databases, even remotely over the Internet. For further information see http://www.bank-banque-canada.ca/pgilbert.
According to Arne Kovac, Guy Nason's WaveThresh package for S worked with only minor modifications under R version 0.12.

More code has been posted to the r-help mailing list, and can be obtained from the mailing list archive.

4.2.

(Unix only.) Untar the add-on packages in $RHOME/src/library/ and type

     $ make libs
     $ cd ../..
     $ ./etc/install-libhelp

at the shell prompt.

4.3.

To find out which add-ons have already been installed, type

     R> library()

at the R prompt. This produces something like

     NAME         DESCRIPTION
     acepack      ace() and avas() for selecting regression transformations
     bootstrap    Functions for the book "An Introduction to the Bootstrap"
     ctest        Classical Tests
     date         Functions for handling dates
     eda          Exploratory Data Analysis
     fracdiff     Fractionally differenced ARIMA (p,d,q) models
     gee          Generalized Estimating Equation models
     mva          Classical Multivariate Analysis
     splines      Regression spline functions
     survival4    Survival analysis [needs library(splines)]

You can ``load'' an add-on with name name by

     R> library(name)

You can then find out which functions it provides by typing

     R> help(library = name)

4.4.

R is currently still in alpha (or pre-alpha) state, so simply using it and communicating problems is certainly of great value.

One place where functionality is still missing is the modeling software as described in ``Statistical Models in S'' (see question ``What is S?''). The functions

     add1      kappa
     alias     labels
     drop1     proj

are missing; many of these are interpreted functions, so if anyone is bored and wants to have a go at implementing them, it would be appreciated. In addition, only linear and generalized linear models are currently available; aov, gam, loess, tree, and the nonlinear modelling code are not there yet.

Many of the packages available at the Statlib S Repository might be worth porting to R.

If you are interested in working on any of these projects, please notify Kurt Hornik.

5. R and Emacs

5.1. Is there Emacs Support for R?

There is an Emacs-Lisp interface to S/S-PLUS called S-mode. Its current version is 4.8 and can be obtained at http://www.maths.lancs.ac.uk:2080/~maa036/elisp/S-mode/. The earlier versions which can be found at the Statlib S repository (gnuemacs3 and gnuemacs4) are outdated. It contains code for interacting with an inferior S process from within Emacs including an interface to the help system, editing S source code, and transcript manipulation, and comes with detailed instructions for installation.

Martin Maechler and Tony Rossini have integrated support for R into this package. The current version is at ftp://ftp.math.sc.edu/rossini/S-mode-4.8.MM6.XE2.tar.gz and runs under both GNU Emacs and XEmacs.

To install, put the byte-compiled `.el' files into a place where Emacs can find them, and add

     (if (not (assoc "\\.R$" auto-mode-alist))
         (add-to-list 'auto-mode-alist (cons "\\.R$" 'R-mode)))
     (autoload 'R "S" "Run an inferior R process" t)
     (autoload 'R-mode "S" "Mode for editing R source" t)
     (autoload 'r-mode "S" "Mode for editing R source" t)

to one of your Emacs startup files, typically `~/.emacs'. You can then fire up R from within Emacs by typing `M-x R' (note however that many interface functions will not work), and if you use the extension `.R' for your files with R code, Emacs will automagically turn on R edit mode whenever you visit such a file.

Tony Rossini, Martin Maechler and Kurt Hornik have officially taken over the development of S-mode. Version 4.9 (based on the current 4.8.MM series) will be released shortly; version 5.0 (codenamed ``istat'') should be out by the end of 1997.

5.2.

Yes. Inferior R mode provides a readline/history mechanism, object name completion, and syntax-based highlighting of the interaction buffer using Font Lock mode, as well as a very convenient interface to the R help system.

Of course, it also integrates nicely with the mechanisms for editing R source using Emacs.
One can write code in one Emacs buffer and send whole or parts of it for execution to R; this is helpful for both data analysis and programming. One can also seamlessly integrate with a revision control system, in order to maintain a log of changes in your programs and data, as well as to allow for the retrieval of past versions of the code.

In addition, it allows you to keep a record of your session, which can also be used for error recovery through the use of the transcript mode.

6.

6.1.

R (currently) uses a static memory model. This means that when it starts up, it asks the operating system to reserve a fixed amount of memory for it. The size of this chunk cannot be changed subsequently. Hence, it can happen that not enough memory was allocated.

In these cases, you should restart R with more memory available, using the command line options -n and -v. To understand these options, one needs to know that R maintains separate areas for fixed and variable sized objects. The first of these is allocated as an array of SEXPRECs assembled in a list using ``cons cells'' (ordered pairs each containing an element of the list and a pointer to the next cell), and the second as an array of VECRECs. The -n option can be used to specify the number of cons cells (each occupying 16 bytes) which R is to use (the default is 200000), and the -v option to specify the size of the vector heap in megabytes (the default is 2). Only integers are allowed for both options.

E.g., to read in a table of 5000 observations on 40 numeric variables,

     R -v 6

should do.

Note that the information where to find vectors and strings on the heap is stored using cons cells. Thus, it may also be necessary to allocate more space for cons cells in order to perform computations with very ``large'' variable-size objects.

You can find out the current memory consumption by typing gc() at the R prompt.

6.2.

R sometimes has problems parsing a file which does not end in a newline.
This can happen for example when Emacs is used for editing the file and next-line-add-newlines is set to nil. To avoid the problem, either set require-final-newline to a non-nil value in one of your Emacs startup files, or make sure R-mode (see question ``Is there Emacs Support for R?'') is used for editing R source files (which locally ensures this setting).

Earlier R versions had a similar problem when reading in data files, but this should have been taken care of now.

6.3.

You can use

     x[i] <- list(NULL)

to set component i of the list x to NULL, similarly for named components. Do not set x[i] or x[[i]] to NULL, because this will remove the corresponding component from the list.

For dropping the row names of a matrix x, it may be easier to use rownames(x) <- NULL, similarly for column names.

6.4. How Can I Save My Workspace?

The expression

     save(list = ls(), file = ".RData")

saves the objects in the currently active environment (typically the user's .GlobalEnv) to the file `.RData' in the R startup directory.

6.5.

To remove all objects in the currently active environment (typically the user's .GlobalEnv), you can do

     rm(list = ls())

7.

Of course, many many thanks to Robert and Ross for the R system, and to the package writers and porters for adding to it.

Special thanks go to Peter Dalgaard, Paul Gilbert, Martin Maechler, and Anthony Rossini for their comments which helped me improve this FAQ.

More to come soon ...

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
r-announce mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe" (in the "body", not the subject !)
To: r-announce-request at stat.math.ethz.ch
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

From rgentlem at stat.auckland.ac.nz Sun Apr 27 05:21:15 1997
From: rgentlem at stat.auckland.ac.nz (Robert Gentleman)
Date: Sun, 27 Apr 1997 15:21:15 +1200 (NZST)
Subject: R for Windows95
Message-ID: <199704270321.PAA22468@stat13.stat.auckland.ac.nz>

There is a zip (RApril.zip) which should now be at the mirrors in R/Windows. Please have a try and let me know of any problems. I haven't been able to test it on very many machines here so I really don't know about 3.11 or NT.

robert

From A.Kovac at Bristol.ac.uk Tue Apr 29 16:06:19 1997
From: A.Kovac at Bristol.ac.uk (Arne Kovac)
Date: Tue, 29 Apr 1997 15:06:19 +0100 (BST)
Subject: WaveThresh
Message-ID:

WaveThresh is now available from the CRAN.

WaveThresh is a wavelet package which is used by many authors in the wavelet literature. A description of the functions can be found in

Nason, G. P. and Silverman, B. W. (1994), "The discrete wavelet transform in S", Journal of Computational and Graphical Statistics, Vol. 3, pp 163-191.

Another source of information is the home page for WaveThresh: http://www.stats.bris.ac.uk/pub/software/wavethresh/WaveThresh.html.

The original S-Plus version is developed by Guy Nason and available from the Statlib archive. The R version supports nearly all the functionality except some restrictions with the use of the two-dimensional wavelet transform.

A new version will be released soon and (hopefully) ported to R as well.
Have fun

Arne Kovac

--
Arne Kovac
School of Mathematics                     Phone: +44 (0117) 942 7551
University of Bristol                     A.Kovac at bristol.ac.uk
University Walk, Bristol, BS8 1TW, U.K.   http://www.stats.bris.ac.uk/~maak

From hfe at math.uio.no Wed Apr 30 00:01:17 1997
From: hfe at math.uio.no (Harald Fekjaer)
Date: Wed, 30 Apr 1997 00:01:17 +0200
Subject: New R package for survival analysis
Message-ID: <33666FAD.181869A4@ulrik.uio.no>

Dear people at r-announce

At "http://www.med.uio.no/imb/stat/addreg/", you can now find the new package addreg (version 1.0). It performs additive hazards regression, which is an alternative (or supplement) to the Cox model. It results in plots that are informative regarding the effect of covariates on survival.

I wrote it as an S-plus package, but have made some small adjustments to make it work on R. This can now be downloaded in a separate file.

In version 1.1 of addreg (last quarter 1997?), I plan to give a full R version with the same functionality as the S-plus version.

Harald Fekjaer, Section of Medical Statistics, University of Oslo

From Friedrich.Leisch at ci.tuwien.ac.at Mon Jun 9 18:21:45 1997
From: Friedrich.Leisch at ci.tuwien.ac.at (Friedrich Leisch)
Date: Mon, 9 Jun 1997 18:21:45 +0200
Subject: mlbench-0.1 --- machine learning benchmark problems
Message-ID: <199706091621.SAA16397@galadriel.ci.tuwien.ac.at>

I've made a package from some benchmark datasets for use with R and uploaded it to CRAN. Here's the Index entry:

mlbench-0.1.tar.gz:
     A collection of artificial and real-world machine learning benchmark problems, including, e.g., the Boston housing data from the UCI repository.
     Written/packaged by Fritz Leisch
     Original data sets from various sources.
     [1997/06/09]

This (naturally) includes some data files ... hence you have to do some parts of the installation manually, see the INSTALL file for details (IMHO R has no clear concept for data: one flat directory is certainly not sufficient).

R&R: Any news on this front?

Best,
Fritz

--
-------------------------------------------------------------------
                        Friedrich Leisch
Institut für Statistik                      Tel: (+43 1) 58801 4541
Technische Universität Wien                 Fax: (+43 1) 504 14 98
Wiedner Hauptstraße 8-10/1071      Friedrich.Leisch at ci.tuwien.ac.at
A-1040 Wien, Austria             http://www.ci.tuwien.ac.at/~leisch
     PGP public key http://www.ci.tuwien.ac.at/~leisch/pgp.key
-------------------------------------------------------------------

From thomas at biostat.washington.edu Tue Jul 8 18:41:22 1997
From: thomas at biostat.washington.edu (Thomas Lumley)
Date: Tue, 8 Jul 1997 09:41:22 -0700 (PDT)
Subject: R-beta: integration, subset selection
In-Reply-To: <199707030527.HAA05571@aragorn.ci.tuwien.ac.at>
Message-ID:

Two new libraries:

leaps: replaces and improves on S leaps() function for regression subset selection.

integrate: Integration by adaptive quadrature in 1-20 dimensions.

Both available from http://www.biostat.washington.edu/~thomas/R.html and soon from CRAN

Thomas Lumley
------------------------------------------------------+------
Biostatistics          : "Never attribute to malice what  :
Uni of Washington      : can be adequately explained by   :
Box 357232             : incompetence" - Hanlon's Razor   :
Seattle WA 98195-7232  :                                  :
------------------------------------------------------------

From Friedrich.Leisch at ci.tuwien.ac.at Thu Jul 10 15:10:35 1997
From: Friedrich.Leisch at ci.tuwien.ac.at (Friedrich Leisch)
Date: Thu, 10 Jul 1997 15:10:35 +0200
Subject: New Packages in CRAN
Message-ID: <199707101310.PAA03881@galadriel.ci.tuwien.ac.at>

The following packages have been contributed to CRAN by Thomas Lumley:

integrate-1.0.tar.gz:
     S function and supporting C and Fortran code for adaptive quadrature. The underlying Fortran code is purported to work in from 2 to 20 dimensions.
     S original by Michael Meyer (mikem at andrew.cmu.edu).
     R port by Thomas Lumley.
     [1997/07/10]

leaps-1.0.tar.gz:
     This library performs an exhaustive search for the best subsets of a given set of potential regressors, using a branch-and-bound algorithm, and also performs searches using a number of less time-consuming techniques. It is designed to replace the "leaps()" command in S.
     Packaged for R by Thomas Lumley and based on FORTRAN77 code by Alan Miller.
     [1997/07/10]

The tar files are available from the CRAN main site at

     ftp://ftp.ci.tuwien.ac.at/pub/R/

in the directory src/contrib and from the mirror sites (tomorrow):

     http://lib.stat.cmu.edu/R/CRAN/                     (U.S./Pennsylvania)
     ftp://franz.stat.wisc.edu/pub/R/                    (U.S./Wisconsin)
     ftp://ftp.stat.math.ethz.ch/R-CRAN/                 (Switzerland)
     http://www.stat.unipg.it/pub/stat/statlib/R/CRAN/   (Italy)

If you want your server to be added to the list of mirrors, please let us know.

Best,
Fritz

--
-------------------------------------------------------------------
                        Friedrich Leisch
Institut für Statistik                      Tel: (+43 1) 58801 4541
Technische Universität Wien                 Fax: (+43 1) 504 14 98
Wiedner Hauptstraße 8-10/1071      Friedrich.Leisch at ci.tuwien.ac.at
A-1040 Wien, Austria             http://www.ci.tuwien.ac.at/~leisch
     PGP public key http://www.ci.tuwien.ac.at/~leisch/pgp.key
-------------------------------------------------------------------

From takafumi at u-aizu.ac.jp Thu Jul 10 16:30:15 1997
From: takafumi at u-aizu.ac.jp (Takafumi Hayashi)
Date: Thu, 10 Jul 1997 23:30:15 +0900
Subject: R-beta: New Packages in CRAN
In-Reply-To: Your message of "Thu, 10 Jul 1997 15:10:35 +0200."
 <199707101310.PAA03881@galadriel.ci.tuwien.ac.at>
Message-ID: <199707101430.XAA04737@tansei.u-aizu.ac.jp>

Dear Sirs,

> and from the mirror sites (tomorrow):
>
> http://lib.stat.cmu.edu/R/CRAN/                     (U.S./Pennsylvania)
> ftp://franz.stat.wisc.edu/pub/R/                    (U.S./Wisconsin)
> ftp://ftp.stat.math.ethz.ch/R-CRAN/                 (Switzerland)
> http://www.stat.unipg.it/pub/stat/statlib/R/CRAN/   (Italy)

We are mirroring CRAN daily to our ftp-server.

     ftp://ftp.u-aizu.ac.jp/pub/lang/R/CRAN              (Japan)

Best Regards,

---
Takafumi Hayashi         takafumi at u-aizu.ac.jp
The University of Aizu   phone : +81-242-37-2614
FCS Lab.                 fax   : +81-242-37-2734

From Friedrich.Leisch at ci.tuwien.ac.at Thu Jul 17 10:43:22 1997
From: Friedrich.Leisch at ci.tuwien.ac.at (Friedrich Leisch)
Date: Thu, 17 Jul 1997 10:43:22 +0200
Subject: new version 0.1 of e1071 in CRAN
Message-ID: <199707170843.KAA09296@galadriel.ci.tuwien.ac.at>

The following functions have been added to the e1071 package

     skewness
     kurtosis
     hamming.distance

The tar files are available from the CRAN main site at

     ftp://ftp.ci.tuwien.ac.at/pub/R/

in the directory src/contrib and from the mirror sites (tomorrow):

     http://lib.stat.cmu.edu/R/CRAN/                     (U.S./Pennsylvania)
     ftp://franz.stat.wisc.edu/pub/R/                    (U.S./Wisconsin)
     ftp://ftp.stat.math.ethz.ch/R-CRAN/                 (Switzerland)
     http://www.stat.unipg.it/pub/stat/statlib/R/CRAN/   (Italy)
     ftp://ftp.u-aizu.ac.jp/pub/lang/R/CRAN/             (Japan)

If you want your server to be added to the list of mirrors, please let us know.
Best,
Fritz

--
-------------------------------------------------------------------
                        Friedrich Leisch
Institut für Statistik                      Tel: (+43 1) 58801 4541
Technische Universität Wien                 Fax: (+43 1) 504 14 98
Wiedner Hauptstraße 8-10/1071      Friedrich.Leisch at ci.tuwien.ac.at
A-1040 Wien, Austria             http://www.ci.tuwien.ac.at/~leisch
     PGP public key http://www.ci.tuwien.ac.at/~leisch/pgp.key
-------------------------------------------------------------------

From plummer at iarc.fr Tue Jul 29 17:30:11 1997
From: plummer at iarc.fr (Martyn Plummer)
Date: Tue, 29 Jul 1997 17:30:11 +0200
Subject: RPM package for R
Message-ID: <3.0.1.16.19970729173011.31175978@droopy.iarc.fr>

A precompiled "RPM" binary of R-0.50.a1 is now available on CRAN for installation on Linux systems which use the RedHat Package Manager. The package also includes all contributed libraries currently in the CRAN archive.

Source code is not included in the package, but the source RPM file is available on CRAN. The package is not PGP signed as PGP is illegal in France.

This is the first public release of the RPM package and may have a few rough edges. Feedback is welcome.

Martyn

Return-Path: leisch at ci.tuwien.ac.at
Received: from fangorn.ci.tuwien.ac.at (root at fangorn.ci.tuwien.ac.at [128.130.170.24]) by hypatia.math.ethz.ch (8.6.12/Main-STAT-mailer) with ESMTP id SAA13229 for ; Wed, 27 Aug 1997 18:38:48 +0200
Received: from galadriel.ci.tuwien.ac.at (leisch at galadriel.ci.tuwien.ac.at [128.130.170.37]) by fangorn.ci.tuwien.ac.at (8.8.5/8.8.4) with ESMTP id SAA18524 for ; Wed, 27 Aug 1997 18:38:19 +0200
Received: (from leisch at localhost) by galadriel.ci.tuwien.ac.at (8.8.5/8.7.3) id SAA00468; Wed, 27 Aug 1997 18:38:47 +0200
Date: Wed, 27 Aug 1997 18:38:47 +0200
Message-Id: <199708271638.SAA00468 at galadriel.ci.tuwien.ac.at>
From: Friedrich Leisch
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
To: r-announce at stat.math.ethz.ch
Subject: CRAN goes HTML
X-Mailer: VM 6.33 under Emacs 19.34.1
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from 8bit to quoted-printable by fangorn.ci.tuwien.ac.at id SAA18524

Hi all,

thanks to help from Thomas Lumley (who also provided us with a beautiful logo!) I've finally made a start on an HTML frontend for CRAN. It should be available on all mirror sites tomorrow.
The master site is at

     http://www.ci.tuwien.ac.at/R

and the usual mirror sites are

     http://www.stat.unipg.it/pub/stat/statlib/R/CRAN/   (IASC archive, Italy)
     ftp://ftp.u-aizu.ac.jp/pub/lang/R/CRAN/             (University of Aizu, Japan)
     ftp://ftp.stat.math.ethz.ch/R-CRAN/                 (ETH Zürich, Switzerland)
     http://lib.stat.cmu.edu/R/CRAN/                     (Statlib, CMU, USA)
     ftp://ftp.biostat.washington.edu/mirrors/R/CRAN/    (University of Washington, USA)
     ftp://franz.stat.wisc.edu/pub/R/                    (University of Wisconsin, USA)

Best,
Fritz

--
-------------------------------------------------------------------
                        Friedrich Leisch
Institut für Statistik                      Tel: (+43 1) 58801 4541
Technische Universität Wien                 Fax: (+43 1) 504 14 98
Wiedner Hauptstraße 8-10/1071      Friedrich.Leisch at ci.tuwien.ac.at
A-1040 Wien, Austria             http://www.ci.tuwien.ac.at/~leisch
     PGP public key http://www.ci.tuwien.ac.at/~leisch/pgp.key
-------------------------------------------------------------------

From ihaka at stat.auckland.ac.nz Fri Dec 5 09:30:56 1997
From: ihaka at stat.auckland.ac.nz (Ross Ihaka)
Date: Fri, 5 Dec 1997 21:30:56 +1300 (NZDT)
Subject: New R Version for Unix
Message-ID: <199712050830.VAA18123@stat1.stat.auckland.ac.nz>

Version 0.60 of R for Unix is now available. Release of this version has been delayed for a variety of reasons, but we hope that it will provide a good deal more stability and functionality than previous versions. However it should be regarded as of "alpha" quality for a short shakedown period.

We would like user feedback to help us improve the quality of R. You can do this by subscribing to the r-devel mailing list by sending a message containing

     subscribe

(in the "body", not the subject!) to:

     r-devel-request at stat.math.ethz.ch

Version 0.60 of R may be obtained from the following sites:

     TU Wien, Austria
          ftp://ftp.ci.tuwien.ac.at/pub/R
     IASC archive, Italy
          http://www.stat.unipg.it/pub/stat/statlib/R/CRAN/contents.html
     University of Aizu, Japan
          ftp://ftp.u-aizu.ac.jp/pub/lang/R/CRAN/contents.html
     ETH Zurich, Switzerland
          ftp://ftp.stat.math.ethz.ch/R-CRAN/contents.html
     Statlib, Carnegie Mellon University, USA
          http://lib.stat.cmu.edu/R/CRAN/contents.html
     University of Washington, USA
          ftp://ftp.biostat.washington.edu/mirrors/R/CRAN/contents.html
     University of Wisconsin, USA
          http://franz.stat.wisc.edu/pub/R/contents.html

There are some large changes to R with this version (additional large changes are planned for further versions). In particular, the documentation format has changed and the general standard of documentation improved a good deal. Perl version 5 is now required to process the documentation, but pre-formatted versions will be available.

R is now being developed by a larger group than its original authors. In particular, this release represents the direct efforts of the following group: Peter Dalgaard, Robert Gentleman, Kurt Hornik, Ross Ihaka, Thomas Lumley, Friedrich Leisch, Martin Maechler, Paul Murrell, Heiner Schwarte and Luke Tierney. In addition, hundreds of contributions, in the form of code, patches and bug reports, have been made by a much larger group of individuals.

R is now a GNU project and will be making changes to meet GNU coding and installation standards.

A detailed list of changes follows.

NEW FEATURES

  o There has been a major change in directory structure masterminded by Kurt Hornik. library(.) now attaches ``package''s which are better integrated, see "?library". Packages may be available from outside the RHOME path via the .lib.loc variable.

  o The documentation format (of files in src/library//man/ ) has changed to a more easily parsable LaTeX like format. The doc files now all end in `.Rd'.
    etc/Rman2Rd can be used to translate old-style documentation to the new one. The translation to *roff, LaTeX and HTML is now done using etc/Rdconv, written in Perl by Fritz Leisch. The HTML online help produced now has links which work. The manual (in doc/manual/) now includes a section on the documentation format and on mathematical text in graphs. etc/ further contains `Sd2Rd' for (partial) translation of S `.d' documentation to Rd, and `Rd2txt' and `Rd2dvi' for easy previewing of single Rd files.

  o The use of "names" on one dimensional arrays will now produce sensible results. This means that for most purposes, one dimensional arrays can be treated like vectors.

  o We have applied a patch from mward at wolf.hip.berkeley.edu which should substantially improve the speed of (vector) arithmetic.

  o The modeling formula handler has been expanded so that it accepts y ~ 0 + x as a "through the origin" specification. Models with no parameters are now acceptable.

  o "cov", "cor" and "var" now produce a matrix result if either of their x or y arguments is a matrix. Dimnames are propagated in a sensible fashion.

  o New chisq.test(.) and prop.test() from Kurt Hornik.

  o New read.fwf(.) for reading fixed width format (KH).

  o New str(.) [alternative to summary(.) for programmers] (MM).

  o New example data sets "esoph", "infert" and "anscombe" (TL), "iris3" (KH) and "stackloss" (MM).

  o source(.) has several new arguments, notably ``echo = FALSE''. This is applied in the new function demo(.) which runs all the code in demos/ (but dynload).

  o strheight(.) is new, accompanying strwidth(.). Both now work for mathematical expressions (Paul Murrell).

  o The LaTeX version of the manual (-> doc/manual/) now has an index.

  o EVERY *.Rd file in src/library/base/man/ has now at least one \keyword

  o New package (`library(.)') "stepfun" for step functions, incl. empirical distributions.

BUG FIXES

  o Regular expression matching is now done with system versions of the regexp library.
This should fix compilation problems on some platforms.

o "approx" and "approxfun" have had some minor adjustments which fix the interpretation of the rule= argument. The code for the piecewise constant case is now internal C code rather than interpreted. This should boost performance in this case.

o There has been a minor fixup of "model.frame" to ensure that subsets, weights, etc are handled properly.

o Model fitting of the form

    lm(y~., data=df)
    glm(y~., data=df)

will now work. The RHS of the model will consist of an additive model containing all (non-response) variables in the given data frame.

o The following type of assignment to data frame subsets

    z <- data.frame(x=rnorm(10),y=rnorm(10),z=rnorm(10))
    z[,1:2] <- matrix(1:20,nc=2)

was producing incorrect results. The solution was to wrap an implicit "as.data.frame" around the RHS.

o "[.data.frame" no longer has a default drop=TRUE argument. This means that subsetting a data frame with "[" will always yield a data frame.

o There was a swap of coordinates internally in "mtext" which meant that labels were coming out in the wrong place. Fixed.

o Syntax errors in parse(text="...") would cause R to terminate with a segmentation violation. This no longer happens, although the result is still not perfect (the parse() returns). This will be fixed by a future parse rewrite.

o rainbow, topo.colors, etc., now also work with n in {1,2}; don't return duplicate neighbor colors anymore.

o legend has new `text.width' argument and now also works with mathematical expressions as text.

o hist() now works better, has a `plot = TRUE' argument, and returns something useful.

o barplot() improved for `names', now returns vector of midpoints.

o lm(), lm.fit, lm.wfit (was `lm.w.fit'): Made more compatible. Dealing with (close to) collinear situations is still not flexible enough.

o internal postscript() improved (missing lines in boxplot(.)).

o Improvement to many (even most ?) documentation (.Rd) files.
o Numerous other fixes of minor things ...

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-announce mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe" (in the "body", not the subject !)
To: r-announce-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

From ihaka at stat.auckland.ac.nz Sun Dec 7 20:13:41 1997
From: ihaka at stat.auckland.ac.nz (Ross Ihaka)
Date: Mon, 8 Dec 1997 08:13:41 +1300 (NZDT)
Subject: First Patch to R-0.60 for Unix
Message-ID: <199712071913.IAA27288@stat1.stat.auckland.ac.nz>

You can find a first patch to R-0.60 at the same archive sites as you obtained your R sources. The patch is called R-0.60-patch1.gz. Alternatively, you can pick a patched distribution which is in the file R-0.60.1.tgz

The patch fixes a multiple file closing problem which causes core dumps on some versions of Unix.

Ross

From Kurt.Hornik at ci.tuwien.ac.at Tue Dec 9 18:48:20 1997
From: Kurt.Hornik at ci.tuwien.ac.at (Kurt Hornik)
Date: Tue, 9 Dec 1997 18:48:20 +0100
Subject: R FAQ v0.60
Message-ID: <199712091748.SAA29807@aragorn.ci.tuwien.ac.at>

An updated version of the R FAQ to accompany the new 0.60 release is now available at the usual site, http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html

A plain text version of the FAQ is appended below.

-kh

****** snip snip snip **************************************************

R FAQ
Kurt Hornik
v0.60-6, 1997/12/08

This document contains answers to some of the most frequently asked questions about R. Feedback is welcome.
______________________________________________________________________

Table of Contents:

1. Introduction
   1.1 Legalese
   1.2 Obtaining this Document
   1.3 Notation
   1.4 Feedback

2. R Basics
   2.1 What Is R?
   2.2 What Machines Does R Run on?
   2.3 What Is the Current Version of R?
   2.4 How Can R Be Obtained?
   2.5 How Can R Be Installed?
       2.5.1 How Can R Be Installed (Unix)
       2.5.2 How Can R Be Installed (Windows)
       2.5.3 How Can R Be Installed (Macintosh)
   2.6 Are there Unix Binaries for R?
   2.7 Which Documentation Exists for R?
   2.8 Which Mailing Lists Exist for R?
   2.9 What is CRAN?

3. R and S
   3.1 What Is S?
   3.2 What Is S-PLUS?
   3.3 What Are the Differences between R and S?
       3.3.1 Lexical Scoping
       3.3.2 Models
       3.3.3 Others

4. R Add-On Packages
   4.1 Which Add-on Packages Exist for R?
   4.2 How Can Add-on Packages Be Installed?
   4.3 How Can Add-on Packages Be Used?
   4.4 How Can Add-on Packages Be Removed?
   4.5 How Can I Create an R Package?
   4.6 How Can I Contribute to R?

5. R and Emacs
   5.1 Is there Emacs Support for R?
   5.2 Should I Run R from Within Emacs?

6. R Miscellania
   6.1 How Can I Read a Large Data Set into R?
   6.2 Why Can't R Source a `Correct' File?
   6.3 How Can I Set Components of a List to NULL?
   6.4 How Can I Save My Workspace?
   6.5 How Can I Clean Up My Workspace?
   6.6 How Can I Get `eval' and `D' to Work?
   6.7 Why Do My Matrices Lose Dimensions?
   6.8 How Does Autoloading Work?
   6.9 How Should I Set Options?

7. Acknowledgments

______________________________________________________________________

1. Introduction

This document contains answers to some of the most frequently asked questions about R.

1.1. Legalese

This document is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version.
This document is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. A copy of the GNU General Public License is available via WWW at http://www.gnu.org/copyleft/gpl.html. You can also obtain it by writing to the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 1.2. Obtaining this Document The latest version of this document is always available from http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html From there, you can also obtain versions converted to plain ASCII text, GNU info, DVI, and PostScript, as well as the SGML source used for creating all these formats using the SGML-Tools (formerly Linuxdoc-SGML) system. 1.3. Notation Everything should be pretty standard. `R>' is used for the R prompt, and a `$' for the shell prompt (where applicable). 1.4. Feedback Feedback is of course most welcome. In particular, note that I do not have access to Windows or Mac systems. If you have information on these systems that you think should be added to this document, please let me know. 2. R Basics 2.1. What Is R? R is a system for statistical computation and graphics. It consists of a language plus a run-time environment with graphics, a debugger, access to certain system functions, and the ability to run programs stored in script files. The design of R has been heavily influenced by two existing languages: Becker, Chambers & Wilks' S (see question ``What is S?'') and Sussman's Scheme. Whereas the resulting language is very similar in appearance to S, the underlying implementation and semantics are derived from Scheme. See question ``What Are the Differences between R and S?'' for a discussion of the differences between R and S. 
R was initially written by Robert Gentleman and Ross Ihaka, who are Senior Lecturers at the Department of Statistics of the University of Auckland in Auckland, New Zealand. In addition, a large group of individuals has contributed to R by sending code and bug reports. Since mid-1997 there has been a core group who can modify the R source code CVS archive. The group currently consists of Peter Dalgaard, Robert Gentleman, Kurt Hornik, Ross Ihaka, Thomas Lumley, Martin Maechler, Paul Murrell, Heiner Schwarte, and Luke Tierney. R has a home page at http://stat.auckland.ac.nz/r/r.html. It is free software distributed under a GNU-style copyleft, and an official part of the GNU project (``GNU S''). 2.2. What Machines Does R Run on? R is being developed for the Unix, Windows and Mac platforms. R will configure and build under a number of common Unix platforms including dec-alpha-osf, freebsd, hpux, i386-linux (ELF), sgi-irix, solaris, and sunos, and according to Jim Lindsey also on Mac, Amiga and Atari under m68k-linux. If you know about other platforms, please drop me a note. 2.3. What Is the Current Version of R? The current Unix version is 0.60, the previous version was 0.50. The ``jump'' is due to both a major reorganization of the directory structure and the conversion to a new, TeX-like documentation format. See the file `CHANGES' in the R distribution for more information. With some good luck, the Windows version will soon catch up with the Unix version. The version for the Mac is pre-alpha. 2.4. How Can R Be Obtained? Sources, binaries and documentation for R can be obtained via CRAN, the ``Comprehensive R Archive Network'' (see question ``What is CRAN?''). 2.5. How Can R Be Installed? 2.5.1. How Can R Be Installed (Unix) If binaries are available for your platform (see question ``Are there Unix Binaries for R?''), you can use these, following the instructions that come with them. 
Otherwise, you can compile and install R yourself, which can be done very easily under a number of common Unix platforms (see question ``What Machines Does R Run on?''). The file INSTALL that comes with the R distribution contains instructions.

Choose a place to install the R tree (R is not just a binary, but has additional data sets, help files, font metrics etc). Let's call this place RHOME (given appropriate permissions, a natural choice would be `/usr/local/lib/R'). Untar the source code, and issue the following commands (at the shell prompt):

    $ ./configure
    $ make

If these commands execute successfully, the R binary will be copied to the `$RHOME/bin' directory. In addition, a shell script front-end called `R' will be created and copied to the same directory. You can copy this script to a place where users can invoke it, for example to `/usr/local/bin'. You could also copy the man page `R.1' to a place where your man reader finds it, such as `/usr/local/man/man1'.

Using

    $ make docs

will build preformatted plain text help pages as well as HTML and LaTeX versions of the documentation (the three kinds can also be generated separately using make help, make html and make latex). Note that as of R version 0.60, you need Perl version 5 to build the documentation. If this is not available on your system, you can obtain precompiled documentation files via CRAN.

If everything (including docs) built properly (and you do not want to apply patches in the future), you can safely do rm -rf src to free disk space.

2.5.2. How Can R Be Installed (Windows)

The file `rsept.zip' from the `bin/ms-windows' directory of a CRAN site contains a binary Windows 95 distribution for R which should be about a 0.50a4 release (plus a few features from 0.60). This version is quite limited in Windows-specific features, although it has been reported to work rather nicely.
The file `rseptbeta.zip' contains the same version with a few bugs fixed and some experimental code for dynamic loading of DLL files. The survival4 package is included but it currently does not work. These versions also work on NT4.0, both server and workstation.

The file `rsept31.zip' contains a version compiled for Windows 3.11. There have been mixed reports regarding this one: some get it going with a few inconsequential error messages on startup, others seem to be getting absolutely nowhere with it. It will definitely not run without a version of Win32s installed, available free of charge from Microsoft (ftp://ftp.microsoft.com/Softlib/MSLFILES/pw1118.exe). For reasons related to the lack of long filenames, the HTML help files cannot work and are not included.

Note that when uncompressing the zip files, the pkunzip program needs to be invoked with the -D flag to create subdirectories. Also, be aware that some decompression programs do not preserve long file names properly.

2.5.3. How Can R Be Installed (Macintosh)

The CRAN `bin/macintosh' directory contains `R.sea.hqx', a binhexed self-extracting archive, and installation instructions in `README.MACINTOSH'. Note that the version in it is nowhere near the quality of the current Unix version. The Power Macintosh port is temporarily on hold.

2.6. Are there Unix Binaries for R?

Packages ready for installation under the i386 versions of Debian GNU/Linux and Red Hat Linux, respectively, can be found at CRAN in `bin/i386-linux'. There are also `tar' distributions for NEXTSTEP on the i386 and m68k platforms in `bin/i386-nextstep' and `bin/m68k-nextstep'.

No other binary distributions have thus far been made publicly available.

2.7. Which Documentation Exists for R?

Online documentation for most of the functions and variables in R exists, and can be printed on-screen by typing help(name) (or ?name) at the R prompt, where name is the name of the topic help is sought for.
(In the case of unary and binary operators and control-flow special forms, the name may need to be quoted.)

This documentation can also be made available as HTML, and as hardcopy via LaTeX, see question ``How Can R Be Installed?''. An up-to-date HTML version is always available for web browsing at http://www.stat.math.ethz.ch/R/manual/

An R manual (``Notes on R: A Programming Environment for Data Analysis and Graphics'') is currently being written, based on the ``Notes on S-PLUS'' by Bill Venables and David Smith. The current version can be obtained as `Rnotes.tgz' (LaTeX source) in a CRAN `doc' directory. Note that the ``conversion'' from S(-PLUS) to R is not complete yet.

Last, but not least, Ross' and Robert's experience in designing and implementing R is described in:

    @Article{,
      author  = {Ross Ihaka and Robert Gentleman},
      title   = {R: A Language for Data Analysis and Graphics},
      journal = {Journal of Computational and Graphical Statistics},
      year    = 1996,
      volume  = 5,
      number  = 3,
      pages   = {299--314}
    }

This is also the reference for R to use in publications.

2.8. Which Mailing Lists Exist for R?

Thanks to Martin Maechler, there are three mailing lists devoted to R.

r-announce
     This list is for announcements about the development of R and the availability of new code.

r-devel
     This list is for discussions about the future of R and pre-testing of new versions. It is meant for those who maintain an active position in the development of R.

r-help
     The `main' R mailing list, for announcements about the development of R and the availability of new code, questions and answers about problems and solutions using R, enhancements and patches to the source code and documentation of R, comparison and compatibility with S and S-plus, and for the posting of nice examples and benchmarks.

Note that the r-announce list is gatewayed into r-help, so you don't need to subscribe to both of them.
To send a message to everyone on the r-help mailing list, send email to r-help at stat.math.ethz.ch. To subscribe (or unsubscribe) to this list send subscribe (or unsubscribe) in the BODY of the message (not in the subject!) to r-help-request at stat.math.ethz.ch. Information about the list can be obtained by sending an email with info as its contents to r-help-request at stat.math.ethz.ch.

Subscription and posting to the other lists is done analogously, with `r-help' replaced by `r-announce' and `r-devel', respectively.

It is recommended that you send mail to r-help rather than only to the R developers (who are also subscribed to the list, of course). This may save them precious time they can use for constantly improving R, and will typically also result in much quicker feedback for yourself.

Of course, in the case of bug reports it would be very helpful to have code which reliably reproduces the problem. Also, make sure that you include information on the system and version of R being used.

Archives of the above three mailing lists are made available on the net on a monthly schedule at ftp://ftp.stat.math.ethz.ch/Mail-archives/ (which is a directory of mail archive files). Archives of the r-help mailing list (including the previous r-testers lists back to March 1996) are also available in HTML format at http://www.ens.gu.edu.au/robertk/rhelp/about.htm.

The developers of R can be reached for comments and reports at R at stat.auckland.ac.nz.

2.9. What is CRAN?

The ``Comprehensive R Archive Network'' (CRAN) is a collection of sites which carry identical material, consisting of the R distribution(s), the contributed extensions, documentation for R, and binaries.
The CRAN master site can be found at the URL http://www.ci.tuwien.ac.at/R/ (Austria) and is currently being mirrored daily at http://www.stat.unipg.it/pub/stat/statlib/R/CRAN/ (Italy) ftp://ftp.u-aizu.ac.jp/pub/lang/R/CRAN/ (Japan) ftp://ftp.stat.math.ethz.ch/R-CRAN/ (Switzerland) http://lib.stat.cmu.edu/R/CRAN/ (USA/Pennsylvania) ftp://ftp.biostat.washington.edu/mirrors/R/CRAN/ (USA/Washington) ftp://franz.stat.wisc.edu/pub/R/ (USA/Wisconsin) Please use the CRAN site closest to you to reduce network load. The structure of the CRAN tree is as follows. `src/base' contains the official R distribution as provided by Ross Ihaka and Robert Gentleman. `src/contrib' contains code for extension packages. `doc' is for additional documentation and information on R. `bin' is for prebuilt R binaries (the base distribution and extensions), grouped according to platforms. Currently, there are experimental `.deb' and `.rpm' packages for i386-linux, and tar files for i386-nextstep and m68k-nextstep. I hope that `.tar.gz' files with contents relative to an installation tree (e.g. `bin', `lib/R/', and `man/man1/R.1') can be made available soon for all major supported Unix platforms. To ``submit'' something to CRAN, simply upload it to ftp://ftp.ci.tuwien.ac.at/incoming and send an email to . Please indicate the copyright situation (GPL, ...) in your submission. 3. R and S 3.1. What Is S? S is a very high level language and an environment for data analysis and graphics. S was written by Richard A. Becker, John M. Chambers, and Allan R. Wilks of AT&T Bell Laboratories Statistics Research Department. The primary references for S are two books by the creators of S. o Richard A. Becker, John M. Chambers and Allan R. Wilks (1988), ``The New S Language,'' Chapman & Hall, London. This book is often called the ``Blue Book''. o John M. Chambers and Trevor J. Hastie (1992), ``Statistical Models in S,'' Chapman & Hall, London. This is also called the ``White Book''. 
There is a huge amount of user-contributed code for S, available at the S Repository at CMU.

See the ``Frequently Asked Questions about S'' (http://lib.stat.cmu.edu/S/faq) for further information about S.

3.2. What Is S-PLUS?

S-PLUS is a value-added version of S sold by Statistical Sciences, Inc. (now a division of Mathsoft, Inc.). S is a subset of S-PLUS, and hence anything which may be done in S may be done in S-PLUS. In addition S-PLUS has extended functionality in a wide variety of areas, including robust regression, modern nonparametric regression, time series, survival analysis, multivariate analysis, classical statistical tests, quality control, and graphics drivers. Add-on modules add additional capabilities for wavelet analysis, spatial statistics, and design of experiments.

See the MathSoft S-PLUS page (http://www.mathsoft.com/splus.html) for further information.

3.3. What Are the Differences between R and S?

3.3.1. Lexical Scoping

Whereas the developers of R have tried to stick to the S language as defined in ``The New S Language'' (Blue Book, see question ``What is S?''), they have adopted the evaluation model of Scheme.

This difference becomes manifest when free variables occur in a function. Free variables are those which are neither formal parameters (occurring in the argument list of the function) nor local variables (created by assigning to them in the body of the function). Whereas S (like C) has only local and global scope, R (like Scheme) has adopted lexical scoping. This means the values of free variables are determined by a set of global variables in S, but in R by the bindings that were in effect at the time the function was created.
Consider the following function:

    cube <- function(n) {
      sq <- function() n * n
      n * sq()
    }

Under S, sq() does not ``know'' about the variable n unless it is defined globally:

    S> cube(2)
    Error in sq(): Object "n" not found
    Dumped
    S> n <- 3
    S> cube(2)
    [1] 18

In R, the ``environment'' created when cube() was invoked is also searched:

    R> cube(2)
    [1] 8

The following more `realistic' example illustrating the differences in scoping is due to Thomas Lumley. The function

    jackknife.lm <- function(lmobj) {
      n <- length(resid(lmobj))
      jval <- t(apply(as.matrix(1:n), 1,
                      function(i) coef(update(lmobj, subset = -i))))
      (n - 1) * (n - 1) * var(jval) / n
    }

does something useful in R, but does not work in S. In order to make it work in S you need to explicitly pass the linear model object into the function nested in apply(). If you don't and you are lucky you will get ``Error: Object "lmobj" not found''. If you are unlucky enough to have a linear model called lmobj in your global environment you will get the wrong answer with no warning. The following version works in S.

    jackknife.S.lm <- function(lmobj) {
      n <- length(resid(lmobj))
      jval <- t(apply(as.matrix(1:n), 1,
                      function(i, lmobj) coef(update(lmobj, subset = -i)),
                      lmobj = lmobj))
      (n - 1) * (n - 1) * var(jval) / n
    }

(The S version was written independently by Thomas and at least three of his fellow students over the past couple of years, causing literally hours of confusion on each occasion.)

Similarly, most optimization (or zero-finding) routines need some arguments to be optimized over and have other parameters that depend on the data but are fixed with respect to optimization. With R scoping rules, this is a trivial problem; simply make up the function with the required definitions in the same environment and scoping takes care of it.
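As a minimal sketch of the optimization case just described, written against present-day R: make_sse is a hypothetical helper name, the data are made up, and optimize() is assumed to be available (whether R 0.60 ships it is not stated here). The data stay fixed inside the closure, so the optimizer only ever sees a function of the one parameter being optimized.

```r
# Hypothetical helper: build a sum-of-squares objective for fixed data.
make_sse <- function(x, y) {
  # x and y are free variables of the returned function; lexical scoping
  # binds them to the values supplied in this call.
  function(b) sum((y - b * x)^2)
}

# Made-up data for a regression through the origin.
x <- c(1, 2, 3, 4)
y <- c(2.1, 3.9, 6.2, 7.8)
sse <- make_sse(x, y)

# optimize() (assumed available) only needs a function of one argument;
# no extra "data" arguments have to be threaded through the optimizer.
fit <- optimize(sse, interval = c(0, 10))
b_hat <- fit$minimum

# Closed-form least-squares slope, for comparison.
b_exact <- sum(x * y) / sum(x^2)
```

The numerical minimum b_hat should agree with b_exact up to the optimizer's tolerance. Under global-only lookup of free variables, sse() would instead depend on whatever x and y happened to be defined globally.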
With S, one solution is to add an extra parameter to the function and to the optimizer to pass in these extras, which however can only work if the optimizer supports this (and typically, the builtin ones do not). Lexical scoping allows using function closures and maintaining local state. A simple example (taken from Abelson and Sussman) can be found in the `demos/language' subdirectory of the R distribution. Further information is provided in the standard R reference ``R: A Language for Data Analysis and Graphics'' (see question ``Which Documentation Exists for R?'') and a paper on ``Lexical Scope and Statistical Computing'' by Robert Gentleman and Ross Ihaka which can be obtained from the `doc/misc' directory of a CRAN site. Lexical scoping also implies a further major difference. Whereas S stores all objects as separate files in a directory somewhere (usually `.Data' under the current directory), R does not. All objects in R are stored internally. When R is started up it grabs a very large piece of memory and uses it to store the objects. R performs its own memory management of this piece of memory. Having everything in memory is necessary because it is not really possible to externally maintain all relevant ``environments'' of symbol/value pairs. This difference also seems to make R much faster than S. The down side is that if R crashes you will lose all the work for the current session. Saving and restoring the memory ``images'' (the functions and data stored in R's internal memory at any time) can be a bit slow, especially if they are big. In S this does not happen, because everything is saved in disk files and if you crash nothing is likely to happen to them. R is still in a beta stage, and may crash from time to time. Hence, for important work you should consider saving often, see question ``How Can I Save My Workspace?'' (other possibilities are logging your sessions, or have your R commands stored in text files which can be read in using source()). 
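The advice to save often and to keep commands in text files can be sketched as follows. This assumes the save(), load() and source() functions of present-day R; the file names and objects are made up for illustration, and the exact interface in R 0.60 may differ.

```r
# Hypothetical file locations, placed in a temporary directory.
workdir <- tempdir()
image_file <- file.path(workdir, "session.RData")
script_file <- file.path(workdir, "commands.R")

# Some work worth keeping.
x <- c(1, 2, 3)
total <- sum(x)

# Write the objects to disk so that a crash does not lose them.
save(x, total, file = image_file)

# Simulate losing the session, then restore the objects from the file.
rm(x, total)
load(image_file)

# Alternatively, keep the commands themselves in a text file and
# replay them with source().
writeLines(c("y <- 1:10", "y_mean <- mean(y)"), script_file)
source(script_file)
```

After load(), x and total are back with their saved values; after source(), y and y_mean exist as if the two commands had been typed at the prompt.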
(Note that if you run R from within Emacs (see question ``R and Emacs''), you can save the contents of the interaction buffer to a file and conveniently manipulate it using ess-transcript-mode, as well as save source copies of all functions and data used.)

3.3.2. Models

There are some differences in the modeling code, such as

o Whereas in S, you would use lm(y ~ x^3) to regress y on x^3 and lm(y ~ poly(x, 3)) to perform ``cubic'' regression, in R, you have to insulate powers of numeric vectors (using I()), i.e., you have to use lm(y ~ I(x^3)) and lm(y ~ x + I(x^2) + I(x^3)), respectively.

o The glm family objects are implemented differently in R and S. The same functionality is available but the components have different names.

o terms objects are stored differently. In S a terms object is an expression with attributes, in R it is a formula with attributes. The attributes have the same names but are mostly stored differently. The major difference in functionality is that a terms object is subscriptable in S but not in R. If you can't imagine why this would matter then you don't need to know.

Also, attr(terms(y~x), "response") gives 1 in S and TRUE in R. In S the attribute indicates which column of the model frame will contain the response. In R this is always column 1.

Finally, in R y~x+0 is an alternative to y~x-1 for specifying a model with no intercept. Models with no parameters at all can be specified by y~0.

3.3.3. Others

Apart from lexical scoping and its implications, R follows the S language definition in the Blue Book as much as possible, and hence really is an ``implementation'' of S. There are some intentional differences where the behavior of S is considered ``not clean''. In general, the rationale is that R should help you detect programming errors, while at the same time being as compatible as possible with S.

Some known differences are the following.

o In R, if x is a list, then x[sub] <- NULL and x[[sub]] <- NULL remove the specified elements from x.
The first of these is incompatible with S, where it is a no-op.

o In S, the functions named .First and .Last in the `.Data' directory can be used for customizing, as they are executed at the very beginning and end of a session, respectively. R looks for files called `.Rprofile' in the user's home directory and the current directory, and sources these. It also loads a saved image from `.RData' in case there is one. If a .First() function exists, it is then executed. The .Last mechanism is not supported yet.

o In R, the .First.lib mechanism when loading add-on packages using library() is not yet supported.

o In R, dyn.load() can only load shared libraries, as created for example by `R SHLIB'.

o R presently does not support IEEE Inf and NaN.

o Whereas in S, abs(z) is the same as Mod(z) for complex z, in R you must use Mod(z), since abs() is a function of real numbers only.

o In R, attach() currently only works for lists and data frames (not for directories). Also, you cannot attach at position 1.

o Categories do not exist in R, and never will as they are deprecated now in S. Use factors instead.

o In R, For() loops are not necessary and hence not supported.

o In R, assign() uses the argument envir= rather than where= as in S.

o The random number generators are different, and the seeds have different length.

o R uses only double precision and so can only pass numeric arguments to C/FORTRAN subroutines as double * or DOUBLE PRECISION, respectively.

o R does not allow indexing beyond the end of an array. E.g., if x is a vector of length 5, both x[6] and x[-6] return an error (``subscript out of bounds''). This is a feature, as the R developers feel that indexing beyond array bounds causes bugs in code that are hard to find and in lots of cases only subtly wrong, and typically manifest themselves when least needed.

As another example, suppose that DF is a data frame and you want to add a new variable VAR named x to it. In S, you can do DF[["x"]] <- VAR.
In R, this is not possible; you can use DF$"x" <- VAR or DF <- cbind(DF, x = VAR).

o R currently does not allow recycling when subscripting with logicals. E.g., x <- 1:5; x[c(F, T)] currently gives an error. This is a bug and will be fixed soon.

There are also differences which are not intentional, and result from missing or incorrect code in R. The developers would appreciate hearing about any deficiencies you may find (in a written report fully documenting the difference as you see it). Of course, it would be useful if you were to implement the change yourself and make sure it works.

4. R Add-On Packages

4.1. Which Add-on Packages Exist for R?

The R distribution comes with the following extra packages:

eda
     Exploratory Data Analysis. Currently only contains functions for robust line fitting, and median polish and smoothing.

mva
     Multivariate Analysis. Currently contains code for principal components (prcomp), canonical correlations (cancor), hierarchical clustering (hclust), and metric multidimensional scaling (cmdscale). More functions for clustering and scaling, biplots, profile and star plots, and code for ``real'' discriminant analysis will be added soon.

The following packages are available from the CRAN `src/contrib' area. Note that R 0.60 has brought a change in both organization of package sources and documentation format, and that some of the packages below may not yet have been updated accordingly.

acepack
     ace (Alternating Conditional Expectations) and avas (Additivity and VAriance Stabilization for regression) for selecting regression transformations.

bootstrap
     Software (bootstrap, cross-validation, jackknife), data and errata for the book ``An Introduction to the Bootstrap'' by B. Efron and R. Tibshirani, 1993, Chapman and Hall.

class
     Functions for classification (k-nearest neighbor and LVQ).

clus
     Functions for cluster analysis.
ctest
     A collection of classical tests, including the Bartlett, Fisher, Kruskal-Wallis, Kolmogorov-Smirnov, and Wilcoxon tests.

date
     Functions for dealing with dates. The most useful of them accepts a vector of input dates in any of the forms 8/30/53, 30Aug53, 30 August 1953, ..., August 30 53, or any mixture of these.

e1071
     Miscellaneous functions used at the Department of Statistics at TU Wien (E1071).

fracdiff
     Maximum likelihood estimation of the parameters of a fractionally differenced ARIMA(p,d,q) model (Haslett and Raftery, Applied Statistics, 1989).

gee
     An implementation of the Liang/Zeger generalized estimating equation approach to GLMs for dependent data.

integrate
     Code for adaptive quadrature.

jpn
     A function to plot Japan's coast-line and prefecture boundaries.

leaps
     A package which performs an exhaustive search for the best subsets of a given set of potential regressors, using a branch-and-bound algorithm, and also performs searches using a number of less time-consuming techniques.

mlbench
     A collection of artificial and real-world machine learning benchmark problems, including the Boston housing data.

nnet
     Software for feed-forward neural networks with a single hidden layer and for multinomial log-linear models.

oz
     Functions for plotting Australia's coastline and state boundaries.

polynom
     A collection of functions to implement a class for univariate polynomial manipulations.

ratetables
     US national and state mortality data (requires survival4 and date).

rational
     A few small functions to find numerical rational approximations using a continued fraction method.

snns
     An R interface to the Stuttgart Neural Networks Simulator (SNNS).

splines
     Regression spline functions.

survival4
     Functions for survival analysis (requires splines).

wavethresh
     Code for doing wavelet transforms and thresholding in 1 and 2D.

xgobi
     Interface to the XGobi program for graphical data analysis.

See CRAN `src/contrib/INDEX' for more information.

Paul Gilbert will make an R version of his package DSE (Dynamic
Systems Estimation) shortly after the 0.60 release. The package
provides state-space models and the Kalman filter, VARMA and
cointegration models, and numerical differentiation. Also, it can do
various rational expectation models via an interface to run Troll (a
commercially available product) from R. According to Paul, the PADI
interface from the Bank of Canada also works with minor changes. PADI
can be used to access Fame time series data bases and potentially
other databases, even remotely over the Internet. For further
information see http://www.bank-banque-canada.ca/pgilbert.

Harald Fekjaer has written addreg, a package for additive hazards
regression, which can be obtained from
http://www.med.uio.no/imb/stat/addreg/.

More code has been posted to the r-help mailing list, and can be
obtained from the mailing list archive.

4.2. How Can Add-on Packages Be Installed?

(Unix only.) The add-on packages on CRAN come as gzipped tar files.
``Unpack'' the package (in a directory that you may write to). If you
have GNU tar, you can use tar zxf name, otherwise you can do something
like gunzip -c name | tar xf -.

Let pkg be the name of the directory thus created. To install the
package to the default R directory tree (the `library' subdirectory of
`RHOME'), type

    $ R INSTALL pkg

at the shell prompt. To install to another tree (e.g., your private
one), use

    $ R INSTALL pkg lib

where lib gives the path to the library tree to install to.

You can use several library trees of add-on packages. The easiest way
to tell R to use these is via the environment variable RLIBS which
should be a colon-separated list of directories at which R library
trees are rooted. You do not have to specify the default tree in
RLIBS. E.g., to use a private tree in `$HOME/lib/R' and a public
site-wide tree in `/usr/local/lib/R/site', put

    RLIBS="$HOME/lib/R:/usr/local/lib/R/site"; export RLIBS

into your (Bourne) shell profile.

4.3. How Can Add-on Packages Be Used?

To find out which additional packages are available on your system,
type

    library()

at the R prompt. This produces something like

    Packages in `/home/me/lib/R':

    mystuff     My own R functions, nicely packaged and not documented

    Packages in `/usr/local/lib/R/library':

    acepack     ace() and avas() for selecting regression transformations
    bootstrap   Functions for the book "An Introduction to the Bootstrap"
    ctest       Classical Tests
    date        Functions for handling dates
    eda         Exploratory Data Analysis
    fracdiff    Fractionally differenced ARIMA(p,d,q) models
    gee         Generalized Estimating Equation models
    mva         Classical Multivariate Analysis
    splines     Regression spline functions
    survival4   Survival analysis (needs `splines')

You can ``load'' the installed package name by

    library(name)

You can then find out which functions it provides by typing one of

    help(package = name)
    library(help = name)

You can unload the loaded package name by

    detach("package:name")

4.4. How Can Add-on Packages Be Removed?

To remove the package pkg from the default library or the library lib,
do

    $ R REMOVE pkg

or

    $ R REMOVE pkg lib

respectively.

4.5. How Can I Create an R Package?

A package consists of a subdirectory containing a `TITLE' and `INDEX'
file, and subdirectories `R', `man' and optionally `src', `src-c', and
`data'.

The `TITLE' file contains a line giving the name of the package and a
brief description. `INDEX' contains a line for each sufficiently
interesting object in the package, giving its name and a description
(functions such as print methods not usually called explicitly might
not be included).

The `R' subdirectory contains R code files with names beginning with
lowercase letters. One of these should use library.dynam() to load
any necessary compiled code.

The `man' subdirectory should contain R documentation files for the
objects in the package.

Source and a Makefile for the compiled code are in `src', and a pure C
version of the source should be in `src-c'.
In the common case when all the source is in C it may be convenient to
make one of these directories a symbolic link to the other. The
`Makefile' will be passed various machine-dependent compile and link
flags, examples of which can be seen in the `eda' package.

Finally, the `data' subdirectory is for additional data files the
package makes available for loading using data(). Note that (at least
currently) all such files are in fact R code files, and must have the
extension `.R'.

See the documentation for library() for more information.

The web page http://www.biostat.washington.edu/~thomas/Rlib.html
maintained by Thomas Lumley provides information on porting S packages
to R.

4.6. How Can I Contribute to R?

R is currently still in alpha (or pre-alpha) state, so simply using it
and communicating problems is certainly of great value.

One place where functionality is still missing is the modeling
software as described in ``Statistical Models in S'' (see question
``What is S?''). The functions

    add1        kappa
    alias       labels
    drop1       proj

are missing; many of these are interpreted functions, so if anyone is
bored and wants to have a go at implementing them, it would be
appreciated. In addition, only linear and generalized linear models
are currently available; aov, gam, loess, tree, and the nonlinear
modelling code are not there yet.

See also the `PROJECTS' file in the top level R source directory.

Many of the packages available at the Statlib S Repository might be
worth porting to R.

If you are interested in working on any of these projects, please
notify Kurt Hornik.


5. R and Emacs

5.1. Is there Emacs Support for R?

There is an Emacs-Lisp interface for interactive statistical
programming and data analysis called ESS (``Emacs Speaks
Statistics''). Languages supported include: S dialects (S 3/4, S-PLUS
3.x, and R), LispStat dialects (XLispStat, ViSta), and SAS. Stata and
SPSS dialect (SPSS, Fiasco) support is being examined for possible
future implementation (a preliminary Stata mode is distributed).

ESS grew out of the desire for bug fixes and extensions to S-mode-4.8
(which was a GNU Emacs interface to S/S-PLUS version 3 only). In
particular, XEmacs support as well as extensions to incorporate R were
desired. In addition, with new modes being developed for R, Stata,
and SAS, it was felt that providing for a unifying framework would
eliminate differences in the user interface, as well as provide for
faster development of production tools and statistical analysis.

ESS 5.0 has, for its guts, the basic framework from S-mode. However,
it has been cleaned, streamlined, brought closer to conformance as a
standard GNU Emacs package, and redesigned for modularity and reuse.

R support contains code for editing R source code (syntactic
indentation and highlighting of source code, partial evaluations of
code, loading and error-checking of code, and source code revision
maintenance) and documentation (including sending examples to a
running R process and previewing), interacting with an inferior R
process from within Emacs (command-line editing, searchable command
history, command-line completion of R object and file names, quick
access to object and search lists, transcript recording, and an
interface to the help system), and transcript manipulation (in
particular for re-evaluating commands from transcript files).

The latest versions of ESS are always available by WWW from
http://franz.stat.wisc.edu/pub/ESS/ or
ftp://franz.stat.wisc.edu/pub/ESS/, or via CRAN. The HTML version of
the documentation can be found at http://www.stat.math.ethz.ch/ESS/.

ESS comes with detailed installation instructions.

5.2. Should I Run R from Within Emacs?

Yes, definitely. Inferior R mode provides a readline/history
mechanism, object name completion, and syntax-based highlighting of
the interaction buffer using Font Lock mode, as well as a very
convenient interface to the R help system.

Of course, it also integrates nicely with the mechanisms for editing R
source using Emacs. One can write code in one Emacs buffer and send
all or parts of it for execution to R; this is helpful for both data
analysis and programming. One can also seamlessly integrate with a
revision control system, in order to maintain a log of changes in your
programs and data, as well as to allow for the retrieval of past
versions of the code.

In addition, it allows you to keep a record of your session, which can
also be used for error recovery through the use of the transcript
mode.


6. R Miscellanea

6.1. How Can I Read a Large Data Set into R?

R (currently) uses a static memory model. This means that when it
starts up, it asks the operating system to reserve a fixed amount of
memory for it. The size of this chunk cannot be changed subsequently.
Hence, it can happen that not enough memory was allocated. In these
cases, you should restart R with more memory available, using the
command line options -n and -v.

To understand these options, one needs to know that R maintains
separate areas for fixed and variable sized objects. The first of
these is allocated as an array of ``cons cells'' (Lisp programmers
will know what they are, others may think of them as the building
blocks of the language itself, parse trees, etc.), and the second are
thrown on a ``heap''. The -n option can be used to specify the number
of cons cells (each occupying 16 bytes) which R is to use (the default
is 200000), and the -v option to specify the size of the vector heap
in megabytes (the default is 2). Only integers are allowed for both
options.

E.g., to read in a table of 5000 observations on 40 numeric variables,
R -v 6 should do.
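
As a rough check on the figures above, the following back-of-the-envelope
sketch estimates how much space the example table needs. It is not taken
from the R sources: it assumes that each numeric value is stored as an
8-byte double, and the variable names are made up for illustration.

```r
# Rough memory estimate for the example table (illustrative only).
# Assumption: each numeric value occupies 8 bytes (double precision).
n.obs  <- 5000                      # observations
n.vars <- 40                        # numeric variables
bytes.per.value <- 8

# Size of the raw numbers in megabytes.
data.mb <- n.obs * n.vars * bytes.per.value / 1024^2
data.mb

# Default cons cell area, from the figures quoted above:
# 200000 cells at 16 bytes each.
cons.mb <- 200000 * 16 / 1024^2
cons.mb
```

Under this assumption the numbers alone come to roughly 1.5 MB, which
leaves little headroom in the default 2 MB vector heap once temporary
copies are made; this is consistent with the suggestion to ask for a
larger heap.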

Note that the information on where to find vectors and strings on the
heap is stored using cons cells. Thus, it may also be necessary to
allocate more space for cons cells in order to perform computations
with very ``large'' variable-size objects.

You can find out the current memory consumption (the proportion of
heap and cons cells used) by typing gc() at the R prompt. This may
help you in finding out whether to increase -v or -n. Note that
following gcinfo(TRUE), automatic garbage collection always prints
memory use statistics.

When using read.table(), the memory requirements are in fact higher
than anticipated, because the file is first read in as one long string
which is then split again. Use scan() if possible in case you run out
of memory when reading in a large table.

6.2. Why Can't R Source a `Correct' File?

R sometimes has problems parsing a file which does not end in a
newline. This can happen for example when Emacs is used for editing
the file and next-line-add-newlines is set to nil. To avoid the
problem, either set require-final-newline to a non-nil value in one of
your Emacs startup files, or make sure R-mode (see question ``Is there
Emacs Support for R?'') is used for editing R source files (which
locally ensures this setting).

Earlier R versions had a similar problem when reading in data files,
but this should have been taken care of now.

6.3. How Can I Set Components of a List to NULL?

You can use

    x[i] <- list(NULL)

to set component i of the list x to NULL, similarly for named
components. Do not set x[i] or x[[i]] to NULL, because this will
remove the corresponding component from the list.

For dropping the row names of a matrix x, it may be easier to use
rownames(x) <- NULL, similarly for column names.

6.4. How Can I Save My Workspace?

The expression

    save(list = ls(), file = ".RData")

saves the objects in the currently active environment (typically the
user's .GlobalEnv) to the file `.RData' in the R startup directory.

6.5. How Can I Clean Up My Workspace?

To remove all objects in the currently active environment (typically
the user's .GlobalEnv), you can do

    rm(list = ls())

6.6. How Can I Get `eval' and `D' to Work?

Strange things will happen if you use eval(print(x), envir = e) or
D(x^2, "x"). The first one will either tell you that "x" is not
found, or print the value of the wrong x. The other one will likely
return zero if x exists, and an error otherwise.

This is because in both cases, the first argument is evaluated in the
calling environment first. The result (which should be an object of
mode `expression' or `call') is then evaluated or differentiated.
What you (most likely) really want is obtained by ``quoting'' the
first argument upon surrounding it with expression(). For example,

    R> D(expression(x^2), "x")
    2 * x

Although this behavior may initially seem to be rather strange, it is
perfectly logical. The ``intuitive'' behaviour could easily be
implemented, but problems would arise whenever the expression is
contained in a variable, passed as a parameter, or is the result of a
function call. Consider for instance the semantics in cases like

    D2 <- function(e, n) D(D(e, n), n)

or

    g <- function(y) eval(substitute(y), sys.frame(sys.parent(n = 2)))
    g(a * b)

See the help pages for more examples.

6.7. Why Do My Matrices Lose Dimensions?

When a matrix with a single row or column is created by a subscripting
operation, e.g., row <- mat[2, ], it is by default turned into a
vector. In a similar way if an array with dimension, say, 2x3x1x4 is
created by subscripting it will be coerced into a 2x3x4 array, losing
the unnecessary dimension. After much discussion this has been
determined to be a feature.

To prevent this happening, add the option `drop = FALSE' to the
subscripting.
For example,

    rowmatrix <- mat[2, , drop = F]  # creates a row matrix
    colmatrix <- mat[, 2, drop = F]  # creates a column matrix
    a <- b[1, 1, 1, drop = F]        # creates a 1x1x1 array

The `drop = F' option should be used defensively when programming.
For example, the statement

    somerows <- mat[index, ]

will return a vector rather than a matrix if index happens to have
length 1, causing errors later in the code. It should probably be
rewritten as

    somerows <- mat[index, , drop = F]

6.8. How Does Autoloading Work?

R has a special environment called `.AutoloadEnv'. Using
autoload(name, pkg), where name and pkg are strings giving the names
of an object and the package containing it, stores some information in
this environment. When R tries to evaluate name, it loads the
corresponding package pkg and reevaluates name in the new package's
environment.

Using this mechanism makes R behave as if the package was loaded, but
does not occupy memory (yet).

See the help page for autoload() for a very nice example.

6.9. How Should I Set Options?

The function options() allows setting and examining a variety of
global ``options'' which affect the way in which R computes and
displays its results. The variable .Options holds the current values
of these options, but should never directly be assigned to unless you
want to drive yourself crazy---simply pretend that it is a
``read-only'' variable.

For example, given

    test1 <- function(x = pi, dig = 3) {
      oo <- options(digits = dig); on.exit(options(oo));
      cat(.Options$digits, x, "\n")
    }
    test2 <- function(x = pi, dig = 3) {
      .Options$digits <- dig
      cat(.Options$digits, x, "\n")
    }

we obtain:

    R> test1()
    3 3.14
    R> test2()
    3 3.141593

What is really used is the global value of .Options, and using
options(OPT = VAL) correctly updates it. Local copies of .Options,
either in .GlobalEnv or in a function environment (frame), are just
silently disregarded.


7. Acknowledgments

Of course, many many thanks to Robert and Ross for the R system, and
to the package writers and porters for adding to it.

Special thanks go to Peter Dalgaard, Paul Gilbert, Fritz Leisch, Jim
Lindsey, Thomas Lumley, Martin Maechler, Anthony Rossini, and Andreas
Weingessel for their comments which helped me improve this FAQ.

More to come soon ...

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-announce mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-announce-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

From Friedrich.Leisch at ci.tuwien.ac.at Tue Dec 9 19:21:16 1997
From: Friedrich.Leisch at ci.tuwien.ac.at (Friedrich Leisch)
Date: Tue, 9 Dec 1997 19:21:16 +0100
Subject: R-beta: R FAQ v0.60
In-Reply-To:
References: <199712091748.SAA29807@aragorn.ci.tuwien.ac.at>
Message-ID: <199712091821.TAA08749@galadriel.ci.tuwien.ac.at>

>>>>> On 09 Dec 1997 19:11:28 +0100,
>>>>> Peter Dalgaard BSA (PDB) wrote:

PDB> Kurt Hornik writes:
>> code CVS archive. The group currently consists of Peter Dalgaard,
>> Robert Gentleman, Kurt Hornik, Ross Ihaka, Thomas Lumley, Martin
>> Maechler, Paul Murrell, Heiner Schwarte, and Luke Tierney.

PDB> aaaand Beetlebum... (old Spike Jones number)

PDB> It seem Friedrich Leisch fell off the list - again!

Well, I'm just sitting next door to Kurt, so I'm easy to forget ... :-)

fritz

From alchemy at inconnect.com Thu Dec 11 08:53:27 1997
From: alchemy at inconnect.com (Anthony Chavez)
Date: Thu, 11 Dec 1997 00:53:27 -0700 (MST)
Subject: R-beta: time series structures
In-Reply-To: <348F139A.8032984A@stat.unipg.it>
Message-ID:

Please edit headers and keep non-announce topic off the R-announce
list. Some of us are subscribed to R-announce for a *reason*.

--
-------------------------------------------------------------------------------
Anthony Chavez
alchemy at inconnect.com
Salt Lake City, Utah

From p.dalgaard at biostat.ku.dk Sun Dec 21 21:00:08 1997
From: p.dalgaard at biostat.ku.dk (Peter Dalgaard BSA)
Date: 21 Dec 1997 21:00:08 +0100
Subject: New version available
Message-ID:

A new version is available from

    ftp.stat.auckland.ac.nz:/pub/R/R-0.61.tgz

and soon from CRAN mirrors everywhere.

As usual, there is at least one unfixed bug in the release, but this
time, we hope that we have set things up so that future bugfixes can
be released more or less immediately.

Merry Christmas!

The R core team.

**********************************************************************

Here's the top of the CHANGES file:

CHANGES IN VERSION R VERSION 0.61

We try to make development more flexible by creating a "CVS branch".
This should make it easier to produce patches for obvious bugs in the
releases, without having to wait for changes in other areas to
stabilize.

NEW FEATURES

o New functions "all.vars" and "all.names" added.

o There has been a small change in the include file structure. All
  include files now live in RHOME/src/include and are copied to
  RHOME/include when needed.

o The "noquote" functions are now documented.

o A new `language' demo, "is.things", is provided.

o symnum(.) function

o The files in R/library/base/data have had a .R suffix added.

BUG_FIXES

o A nasty bug which showed when attempt was made to create a zero
  length call has been fixed.

o model.matrix(.) now allows a contrasts argument.

o barplot(.) now also works for barplot(table(rpois(100,3))).

o make clean ; make now should work; ./Makefile.in eliminated

o format(.) is now generic; the default method has a `digits'
  argument.

--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907