Strange Results of summary()
Martin Maechler
Martin Maechler <maechler@stat.math.ethz.ch>
Fri, 20 Mar 1998 09:34:17 +0100
>>>>> "Hubert" == Hubert Palme <palme@uni-wuppertal.de> writes:
Hubert> Martin Maechler:
>> >>                  berufl
>> >>  Bureaukraft         :15
>> >>  Guetererzeugung     : 9
>> >>  sonstige            : 4
>> >>  Handel              : 3
>> >>  wissensch.-technisch: 3
>> >>  (Other)             : 3
>> >>  NA's                :43
>>
>> .....
>>
>> >> > table(berufl)
>> >>          wissensch.-technisch  Leiter Oeff. Dienst/Wirtschaft
>> >>                             3                               0
>> >>                   Bureaukraft                          Handel
>> >>                            15                               3
>> >> Dienstleistungsgewerbe/Soldat                 Gaertner/Jaeger
>> >>                             2                               1
>> >>               Guetererzeugung                        sonstige
>> >>                             9                               4
>>
>> What's the problem? '(Other)' gives all the levels having (in your
>> case) 0,1,2 observations, which sum to 3 observations.
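
The arithmetic can be checked with a small sketch. The raw data are not in this thread, so the vector below is rebuilt from the counts quoted above (an assumption), and the behaviour shown is that of summary() on a factor in current R:

```r
# Counts copied from the quoted table(berufl) output; the 43 NAs come from
# the quoted summary() output. The vector itself is a reconstruction.
counts <- c("wissensch.-technisch"           = 3,
            "Leiter Oeff. Dienst/Wirtschaft" = 0,
            "Bureaukraft"                    = 15,
            "Handel"                         = 3,
            "Dienstleistungsgewerbe/Soldat"  = 2,
            "Gaertner/Jaeger"                = 1,
            "Guetererzeugung"                = 9,
            "sonstige"                       = 4)

berufl <- factor(c(rep(names(counts), times = counts), rep(NA, 43)),
                 levels = names(counts))

# maxsum = 7 is what summary.data.frame passes on by default.
s <- summary(berufl, maxsum = 7)
s

# The three least frequent levels (2 + 1 + 0 observations) end up in "(Other)".
s[["(Other)"]]
```

With maxsum = 7 the result has seven entries: the five most frequent levels, "(Other)", and "NA's".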
Hubert> Do I understand you right, that the variables with low
Hubert> frequency are put together in (other)? This should be explained
Hubert> to a newbie!!
Hubert> - What criterion decides which variables are put into (other)?
Hubert> - What kind of order do the values have? Frequency?
Hubert> This is very irritating! Where can I get information about all
Hubert> this?
Read the online help ? Read the R-notes, read books about S / S-plus...
More seriously:
1) In situations like these, I just look at the R code;
   in this case, you'll find summary -> summary.data.frame -> summary.factor
   and you'll see that summary.data.frame is
        summary.data.frame(object, maxsum = 7, ...)
   where ``maxsum'' is the argument you may want to set differently.
2) The online help for summary has been lacking.
   R 0.62 will have an improved help page, whose ASCII version I append at
   the end.
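
A minimal sketch of point 1), assuming a current R installation; the toy data frame `d` and its factor `f` are made up for illustration:

```r
# Typing a function's name prints its source; formals() shows just the defaults.
formals(summary.data.frame)$maxsum   # default number of entries per factor column

# Hypothetical toy data: nine levels a..i with 9, 8, ..., 1 observations.
f <- factor(rep(letters[1:9], times = 9:1))
d <- data.frame(f = f)

summary(d)                # default maxsum = 7: six levels plus "(Other)"
summary(d, maxsum = 10)   # room for all nine levels, so no "(Other)" row
```

Raising maxsum past the number of levels is the simplest way to see every level in the data-frame summary.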
>> table() is more detailed (but doesn't report the NA's), which is the
>> only thing to criticize here:
Hubert> I agree.
Should be in the 0.62 version...
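
For readers on a current R: table() there accepts a `useNA' argument that makes it count missing values. This is an assumption about present-day R, not a statement about 0.62. A small sketch with a made-up factor `x`:

```r
# Hypothetical factor with two missing values.
x <- factor(c("a", "b", "a", NA, "b", "a", NA))

table(x)                    # a: 3, b: 2; the two NAs are dropped silently
table(x, useNA = "ifany")   # adds an <NA> entry with count 2
summary(x)                  # a: 3, b: 2, NA's: 2
```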
Hubert> (Hmm... R is a very interesting and powerful tool, but its
Hubert> philosophy and terminology take a lot of getting used to for
Hubert> someone familiar with SPSS & Co.)
I agree; implicitly we have often assumed that R users
- either know S / Splus
- or are good programmers
The lack of documentation ``proves'' the above.
And yes, we welcome all collaboration in improving documentation!
Here is the new help page [ ?summary or ?summary.factor or ....] :
Object Summaries

     summary(object, ...)
     summary.default   (object, ..., digits = max(3, .Options$digits - 3))
     summary.data.frame(object, maxsum = 7, ...)
     summary.factor    (object, maxsum = 100, ...)
     summary.matrix    (object, ...)

Arguments:

  object: an object for which a summary is desired.

  maxsum: integer, indicating how many levels should be
          shown for `factor's.

     ...: additional arguments affecting the summary produced.

Description:

     `summary' is a generic function used to produce summaries of
     the results of various model fitting functions. The function
     invokes particular `method's which depend on the `class' of
     the first argument.

     For `factor's, the frequencies of the `maxsum - 1' most
     frequent levels are shown, and the less frequent levels are
     summarized in `"(Other)"' (resulting in `maxsum' frequencies).

     The functions `summary.lm' and `summary.glm' are examples of
     particular methods which summarise the results produced by
     `lm' and `glm'.

Value:

     The form of the value returned by `summary' depends on the
     class of its argument. See the documentation of the
     particular methods for details of what is produced by that
     method.

See Also:

     `anova', `summary.glm', `summary.lm'.

Examples:

     options(digits=5)
     data(attenu)
     summary(attenu)                         #-> summary.data.frame(..)
     summary(attenu $ station, maxsum = 20)  #-> summary.factor(..)
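
To see the dispatch from the Examples at work, here is a short sketch. It assumes a current R, where the `attenu' data set ships with the datasets package:

```r
data(attenu)

class(attenu)           # "data.frame" -> summary() uses summary.data.frame
class(attenu$station)   # "factor"     -> summary() uses summary.factor

# station has far more than 20 levels, so the rest are lumped together
# and the result has exactly maxsum entries.
s <- summary(attenu$station, maxsum = 20)
length(s)
"(Other)" %in% names(s)
```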
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._