[R] / Operator not meaningful for factors

Vivek Satsangi vivek.satsangi at gmail.com
Sun Jan 15 13:37:32 CET 2006


Sir,
I made the (incorrect, probably unjustified) deduction of using mode()
based on section 3.1 of "An Introduction to R". Since the write-up
talks about the "mode" of an object, and using attr() did not work (it
gives some error saying that "mode of name must be character"), I
tried mode() and reached this incorrect conclusion.

I have been confused for a while by the idea that something can be
numeric AND a factor, since if it were just a vector and not a factor,
it would still be numeric, as in:
> a <- c (1, 2, 3);
> class(a);
[1] "numeric"
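
For comparison, here is a minimal sketch of how the same kind of check
behaves on a factor. The name f and its values are made up for
illustration and are not taken from the data in the session:

```r
# Toy factor; the name and the values are illustrative only.
f <- factor(c("22", "26.5", "27.5"))

mode(f)        # "numeric": a factor is stored as integer codes
typeof(f)      # "integer"
class(f)       # "factor"
is.numeric(f)  # FALSE
is.factor(f)   # TRUE
```

So mode() alone cannot tell a plain numeric vector from a factor, while
class(), is.numeric() and is.factor() can.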

I'll try to think of a way to improve the explanation in "An
Introduction to R" so that the next person coming along does not fall
into the same pit.
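
Such an explanation might also show how to find out what turned a column
into a factor and how to get the numbers back. A rough sketch, assuming
the column became a factor because a few entries are not valid numbers;
the vector raw and the "n/a" marker are hypothetical stand-ins, not the
actual contents of the Price column:

```r
# Hypothetical stand-in for a column that read.table() turned into a factor.
raw <- factor(c("22", "26.5", "n/a", "30"))

# as.numeric(raw) would return the internal integer codes, not the values,
# so convert through the character labels instead.
values <- suppressWarnings(as.numeric(as.character(raw)))

# Levels that do not parse as numbers; these are what forced the factor.
bad <- levels(raw)[is.na(suppressWarnings(as.numeric(levels(raw))))]

values  # 22, 26.5, NA, 30
bad     # "n/a"
```

If the stray marker is known, passing it to read.table() through the
na.strings argument should keep the column numeric in the first place.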

Thank you for getting me unstuck,

Vivek

On 1/15/06, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
> The mode of a factor is numeric, so your test does not do what you think
> it does.
>
> is.numeric() is the recommended test of a vector being numeric.  I have no
> idea where you got the idea that mode() was a useful test (perhaps you
> could give us the reference you used), but it rather rarely is (typeof is
> usually more informative).
>
> From the summary quoted, Price is clearly a factor.  Test it with
> is.factor.
>
> On Sun, 15 Jan 2006, Vivek Satsangi wrote:
>
> > Folks,
> > I have a very basic question. The solution eludes me perhaps because
> > of my own lack of creativity. I am not attaching a fully reproducible
> > session because the issue may well be because of the way the data file
> > is, and the data file is large (and I don't know whether I can legally
> > distribute it). If people can suggest things that might be wrong in my
> > data or the way that I am reading it, I would be most grateful.
> >
> > I get the following error message in the session quoted at the end of
> > this email:
> > / not meaningful for factors in: Ops.factor(BookValuePS, Price)
> >
> > As you can see in that same session, I check that the two vectors
> > being divided are numeric.
>
> (see the request above for your reference here)
>
> > I also check that the divisor is not 0 at any index. I also believe that
> > this is not because of the NA's in the data. My question is, what are
> > other "problems" that can cause the / operator to not be meaningful?
>
> Why not test for factor, since that is what the very helpful error message
> told you the problem was?
>
> > I did try some simple examples to try to get the same error. However,
> > I am not sure how to put the same NA's that one gets from
> > read.table() into a vector:
> >> a <- c(1, 2, 3, NA);
> >> a
> > [1]  1  2  3 NA
> >> b <- c( 1, 2, 3, 4);
> >> c <- b / a;
> >> b
> > [1] 1 2 3 4
> >> a <- c(1, 2, 3, );
> >> c <- b/a;
> > Warning message:
> > longer object length
> >        is not a multiple of shorter object length in: b/a
> >
> >
> > ******** Quoted Session below ********
> > > explainPriceSimplified <- read.table("combinedClean.csv",
> > +                            sep = ",", header=TRUE);
> >> attach(explainPriceSimplified);
> >> summary(explainPriceSimplified);
> >     Symbol           Date              Price            EPS                BookValuePS
> > XL     :   98   Min.   :19870630   22     :   61   Min.   :-1.401e+05   Min.   :-6.901e+05
> > ZION   :   97   1st Qu.:19910930   26.5   :   61   1st Qu.: 4.650e-01   1st Qu.: 3.892e+00
> > YRCW   :   72   Median :19960331   27.5   :   58   Median : 1.060e+00   Median : 7.882e+00
> > AA     :   71   Mean   :19957688   30     :   58   Mean   :-1.534e+01   Mean   : 1.515e+02
> > ABS    :   71   3rd Qu.:20001231   25     :   56   3rd Qu.: 1.890e+00   3rd Qu.: 1.444e+01
> > ABT    :   71   Max.   :20041231   (Other):29561   Max.   : 5.309e+03   Max.   : 3.366e+06
> > (Other):29624                      NA's   :  249   NA's   : 2.460e+02   NA's   : 4.760e+02
> > FiscalQuarterRep    F12MRet
> > 2004/2F:  482    Min.   :-100.00
> > 2003/4F:  471    1st Qu.:  -8.82
> > 2004/1F:  470    Median :  10.57
> > 2004/3F:  470    Mean   :  13.36
> > 2003/3F:  464    3rd Qu.:  31.12
> > 2003/2F:  463    Max.   :4700.00
> > (Other):27284    NA's   : 463.00
> >> mode(Price)
> > [1] "numeric"
> >> mode(EPS)
> > [1] "numeric"
> >> mode(BookValuePS)
> > [1] "numeric"
> >> BP <- BookValuePS / Price ;
> > Warning message:
> > / not meaningful for factors in: Ops.factor(BookValuePS, Price)
> >> which(Price==0)
> > numeric(0)
>
> --
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>


--
-- Vivek Satsangi
Student, Rochester, NY USA



