[R] Newbie syntax question

Peter Dalgaard p.dalgaard at biostat.ku.dk
Sun Jan 13 15:54:29 CET 2008

jim holtman wrote:
> ?cor.test
> and the help page says:
> formula:   a formula of the form ~ u + v, where each of u and v are
> numeric variables giving the data values for one sample. The samples
> must be of the same length.
Yes, but does that answer the question? Seems to me that Joe knows what 
it does, just not why.

The logic behind this sort of model formula is slightly warped, but has 
been with us at least since S-PLUS's crosstabs(). R's xtabs() is quite 
similar. The basic idea is that you can describe a contingency table 
with n dimensions using

count ~  f1 + f2 + .... + fn

and leave the count off if it would be the constant 1. The thing that I, 
at least, find a little warped is that "+" in model formulas usually 
prescribes an additive structure, which isn't really the case here. The 
":" operator would arguably have been a better choice, but, well, another 
choice was made, and it has stuck. The final step is to say that correlation between 
two quantitative variables is similar enough to RxC tables for 
categorical variables to let you use the same syntax.  ~x+y does have 
the advantage of not implying a direction as y~x or x~y would do.
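
To make the analogy concrete, here is a minimal sketch in R. It uses the built-in Titanic and mtcars datasets rather than the "water" data from the original question (an assumption made only so the code runs without extra packages), and the object names are purely illustrative.

```r
# Contingency table with an explicit count on the left-hand side.
# as.data.frame() turns the built-in Titanic table into one row per cell,
# with the cell counts in a column called Freq.
tit <- as.data.frame(Titanic)
tab_counts <- xtabs(Freq ~ Class + Survived, data = tit)

# With one row per observation the count would be the constant 1,
# so the left-hand side is simply left off.
tab_rows <- xtabs(~ cyl + gear, data = mtcars)

# cor.test() borrows the same one-sided syntax for two numeric variables.
ct_formula <- cor.test(~ mpg + wt, data = mtcars, method = "pearson")

# The default (non-formula) interface gives the same estimate.
ct_default <- cor.test(mtcars$mpg, mtcars$wt, method = "pearson")

# Swapping the two variables changes nothing: no direction is implied.
ct_swapped <- cor.test(~ wt + mpg, data = mtcars, method = "pearson")
```

Here "+" just lists the variables that define the table (or the pair to be correlated); it does not describe an additive model.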
> On 1/12/08, Joe Trubisz <jtrubisz at mac.com> wrote:
>> Hi...
>> I'm trying to understand the following syntax:
>> cor.test(~mortality + hardness,data=water,method="pearson")
>> which is the same as:
>> cor.test(water$mortality,water$hardness,data=water,method="pearson")
>> Can anyone point me to the correct doc or explain to me how to
>> interpret "~mortality + hardness"?
>> Thanks,
>> Joe

   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
