[Rd] [Fwd: Re: [R-downunder] Beware unclass(factor)] (PR#9641)
r.darnell at uq.edu.au
Mon Apr 30 12:16:07 CEST 2007
The following "issue" was found using
> version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          4.1
year           2006
month          12
day            18
svn rev        40228
language       R
version.string R version 2.4.1 (2006-12-18)
>
and discussed on the R-downunder mailing list.
I hope I have provided enough information. I tried to look at the bug
tracking page but got:
The system encountered a fatal error
*
cannot open config file /home/sfe/r-bugs/jitterbug/R : No such file or directory
*
The last error code was: No such file or directory
uid/gid=30/8
Regards
Ross Darnell
Date: Mon, 30 Apr 2007 19:25:09 +1000
From: John Maindonald <john.maindonald at anu.edu.au>
Subject: Re: [R-downunder] Beware unclass(factor)
In-reply-to: <46359373.50504 at uq.edu.au>
To: Ross Darnell <r.darnell at uq.edu.au>
Cc: r-downunder at stat.auckland.ac.nz
Observe the following:
> z <- model.frame(cbind(moths,(20-moths)) ~sex+ doselin,data=worms)
> class(z$doselin)
[1] "other"
> levels(z$doselin)
[1] "1" "2" "4" "8" "16" "32"
> attributes(z$doselin)
$levels
[1] "1" "2" "4" "8" "16" "32"
$class
[1] "other"
The problem surfaces in the call to model.frame() from predict.lm()
when it is called by predict.glm(). That call jumps to conclusions:
it takes the presence of a levels attribute as an indication that
doselin is a factor. This is ironic, since it was the call initiated
by glm() that seems to have given the column doselin of the object
returned by model.frame() the class "other".
This seems to me to be a bug. The call to unclass() does not
strip the levels attribute from doselin. (This is not, I think, the
bug; rather, the problem is in the model matrix that is created.)
The column worms$doselin does, though, have class "integer",
at least as far as the function class() is concerned.
You can fix the problem by setting:
worms$doselin <- as.vector(unclass(worms$Dose))
This strips off the levels attribute.
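
The difference between the two forms can be checked directly. The
following is a minimal sketch, not part of the original report; the
names Dose, codes_unclass and codes_plain are illustrative stand-ins,
with Dose built the same way as worms$Dose in the quoted message below.

```r
# Illustrative stand-in for worms$Dose (assumed, built as in the quoted message).
Dose <- factor(rep(2^(0:5), 2))

codes_unclass <- unclass(Dose)             # integer codes, levels attribute kept
codes_plain   <- as.vector(unclass(Dose))  # same codes, all attributes dropped

names(attributes(codes_unclass))           # "levels"
attributes(codes_plain)                    # NULL

# is.vector() is FALSE when attributes other than names are present.
is.vector(codes_unclass)                   # FALSE
is.vector(codes_plain)                     # TRUE

# as.integer() on a factor gives the same bare codes.
identical(codes_plain, as.integer(Dose))   # TRUE
```

Whether the retained levels attribute is what produced the "other"
class in R 2.4.1 is an inference from the output above, not something
this sketch demonstrates.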
In my view model.frame ought to have stripped the levels
attribute from the column doselin in the object that it
returned.
I consider that this should be reported as a bug, or at least
as an undesirable feature.
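
For completeness, here is a sketch of the whole sequence with the
workaround applied, assuming the data from the quoted message below and
a reasonably current version of R. The object name p is illustrative,
and the predicted value is deliberately not shown.

```r
# Data as given in the quoted message.
worms <- data.frame(sex   = gl(2, 6),
                    Dose  = factor(rep(2^(0:5), 2)),
                    moths = c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16))

# Workaround: bare integer codes with no levels attribute.
worms$doselin <- as.vector(unclass(worms$Dose))

worms.glm <- glm(cbind(moths, 20 - moths) ~ sex + doselin,
                 data = worms, family = binomial)

# New data with sex as a factor sharing the fitted levels and doselin numeric.
# The full argument name newdata is used rather than the abbreviation new.
p <- predict(worms.glm,
             newdata = data.frame(sex = factor("1", levels = levels(worms$sex)),
                                  doselin = 6))
```

With doselin stored as a plain integer vector, the class recorded at
fitting time and the class supplied to predict() should both be
numeric, so the class check is expected to pass.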
John Maindonald email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473 fax : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
On 30 Apr 2007, at 4:57 PM, Ross Darnell wrote:
> Just an observation about the use of unclass() to generate codes
> for factors.
>
> As an example take the dataset from the MASS4 book
>
> > worms <- data.frame(sex=gl(2,6),Dose=factor(rep(2^(0:5),2)),
> +                     moths=c(1,4,9,13,18,20,0,2,6,10,12,16))
>
> > worms$doselin <- unclass(worms$Dose)
>
> > worms.glm <- glm(cbind(moths,(20-moths)) ~sex+doselin,
> +                  data=worms,family=binomial)
>
> > predict(worms.glm,new=data.frame(sex="1",doselin=6))
> Error: variable 'doselin' was fitted with class "other" but class
> "numeric" was supplied
> In addition: Warning message:
> variable 'doselin' is not a factor in: model.frame.default(Terms,
> newdata, na.action = na.action, xlev = object$xlevels)
> >
>
>
> The /doselin/ vector is "atomic": good enough for the glm()
> function but not acceptable to predict()
>
> > str(worms$doselin)
> atomic [1:12] 1 2 3 4 5 6 1 2 3 4 ...
> - attr(*, "levels")= chr [1:6] "1" "2" "4" "8" ...
> >
>
> Cheers
>
> Ross Darnell