[R] reason for Factors -- was -- Vector Assignments

Thomas W Blackwell tblackw at umich.edu
Wed Dec 3 14:56:38 CET 2003


On Wed, 3 Dec 2003, Arend P. van der Veen wrote:

> Your recommendations have worked great.  I have found both cut and
> ifelse to be useful.
>
> I have one more question. When should I use factors over a character
> vector?  I know that they have different uses.  However, I am still
> trying to figure out how I can best take advantage of factors.
>
> The following is what I am really trying to do:
>
> colors <- c("red","blue","green","black")
> y.col <- colors[cut(y,c(-Inf,250,500,700,Inf),right=FALSE,labels=FALSE)]
> plot(x,y,col=y.col)
>
> Would using factors make this any cleaner?  I think a character vector
> is all I need but I thought I would ask.
>
> Thanks for your help,
> Arend van der Veen

Arend  -

When setting the colors of plotted points, you definitely want
a vector of character strings as the color names.  Factors were
invented so that regression and analysis of variance functions
would properly recognize a grouping variable and not simply fit
a single linear coefficient to the integer codes.  In the context
of a linear (or similar) model, each factor or interaction has to
be expanded from a single column of integer codes into a matrix of
0/1 indicator variables, with a separate column for each possible
level of the factor.  (I oversimplify a bit here: some columns
are omitted, to keep the design matrix from being over-specified.)
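
A small sketch may make the difference concrete.  The y values below
are made up purely for illustration, and the names grp, codes, X_num
and X_fac are mine, not anything from your data.  It assumes the
default treatment contrasts, under which the column for the first
level is the one omitted.

```r
# Hypothetical data, invented for illustration only.
y <- c(100, 300, 600, 800, 450, 200)

# cut() returns a factor with one level per interval (four here).
grp <- cut(y, c(-Inf, 250, 500, 700, Inf), right = FALSE)

# Treated as a plain number, the integer codes get one slope coefficient.
codes <- as.integer(grp)
X_num <- model.matrix(~ codes)
colnames(X_num)   # "(Intercept)" "codes"

# Treated as a factor, the same codes expand into 0/1 indicator columns.
# With default treatment contrasts the first level's column is dropped,
# leaving an intercept plus three indicators.
X_fac <- model.matrix(~ grp)
dim(X_fac)        # 6 rows, 4 columns

# For plotting, the integer codes simply index a character vector of colors.
colors <- c("red", "blue", "green", "black")
y.col <- colors[codes]
```

So for the plot, the factor buys you nothing: the integer codes are
only used as an index into the character vector of color names.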

-  tom blackwell  -  u michigan medical school  -  ann arbor  -



