[R] Assign factor and levels inside function
Tim Howard
tghoward at gw.dec.state.ny.us
Fri Apr 22 14:39:28 CEST 2005
Aha!
You've just opened the door to another level for this blundering R
user. I even went back to my well-used copy of "An Introduction to R"
to see where I missed this standard approach for processing new data.
Nothing explicit, but it is certainly alluded to in many of the function examples.
I don't know why I was stuck in that rut.
I'm sure 99.9% of you on this list know this, but... To be clear for
anyone searching these archives later: Don't bother to ask your
function to make assignments to pos=1 (the global environment), just do
the assignment yourself when calling the function. For example, instead
of coding a function call like this:
processData(dat)
to assign the processed data to pos=1, simply make the assignment when
calling the function:
dat <- processData(dat)
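For anyone finding this later, here is a minimal sketch of why the assignment at the call site matters: a function works on its own copy of the data frame, so the caller only sees the change if the returned value is assigned. The column name `f1` and the levels below are made up for illustration.

```r
# Hypothetical example data; the column name and levels are assumptions.
dat <- data.frame(f1 = c(1, 1, 3, 3, 5, 7))

processData <- function(dat) {
    # This changes only the function's local copy of dat.
    dat$f1 <- factor(dat$f1, levels = c(1, 3, 5, 7, 9))
    dat  # return the modified copy
}

processData(dat)           # result is printed, then discarded
is.factor(dat$f1)          # FALSE: the caller's dat is untouched

dat <- processData(dat)    # keep the result this time
is.factor(dat$f1)          # TRUE
```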
Thanks for being gentle on me, Andy.
Tim
>>> "Liaw, Andy" <andy_liaw at merck.com> 4/21/2005 9:57:22 PM >>>
Tim,
> From: Tim Howard
>
> Andy,
> Thank you for the help. Yes, my question really did seem like I was
> going through a lot of unnecessary steps just to define levels of a
> variable. But that was just for the example. In my application, I bring
> new datasets into R on a daily basis. While the data differs, the
> variables are the same, and the categorical variables have the same
> levels. So I find myself daily applying the same factor and level
> definitions (by cutting and pasting the large chunk of commands from a
> text file). It really would be simpler to have it wrapped up in a
> function. That's why I asked the question about putting this into a
> function.
> Upon reading your answer, I thought maybe I could use your example
> and use the super-assignment '<<-' in the function. However, your method
> assigns levels without defining the variable as a factor
> (interesting!).
>
> > levels(y$one) <- seq(1, 9, by=2)
> > y$one
> [1] 1 1 3 3 5 7
> attr(,"levels")
> [1] 1 3 5 7 9
> > is.factor(y$one)
> [1] FALSE
Ouch! "levels<-" is generic, and the default method simply attaches the
levels attribute to the object. You need to coerce the object into a
factor explicitly.
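As a small sketch of the difference, using the same values as the example in this thread (the variable names `x`, `x_attr` and `x_fac` are only for illustration):

```r
x <- c(1, 1, 3, 3, 5, 7)

# The default "levels<-" method only sets an attribute on the vector.
x_attr <- x
levels(x_attr) <- seq(1, 9, by = 2)
is.factor(x_attr)   # FALSE

# factor() builds an actual factor with the requested levels.
x_fac <- factor(x, levels = seq(1, 9, by = 2))
is.factor(x_fac)    # TRUE
levels(x_fac)       # "1" "3" "5" "7" "9"
table(x_fac)        # the unused level 9 is kept, with a count of 0
```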
> Unfortunately, whenever I try to use <<- with the dataframe as the
> variable, I get an error message:
>
> > fncFact <- function(datfra){
> + datfra$one <<- factor(datfra$one, levels=c(1,3,5,7,9))
> + }
> > fncFact(y)
> Error in fncFact(y) : Object "datfra" not found
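One reading of this error, offered as a sketch rather than a full account: `<<-` skips the function's own frame and looks for an existing `datfra` in the enclosing environments, but `datfra` is only the local argument name, so nothing is found there. The snippet below reproduces the failure with the thread's `y`, wrapped in `tryCatch` (an addition for illustration) so that it runs to completion.

```r
y <- data.frame(one = c(1, 1, 3, 3, 5, 7),
                two = c(2, 2, 6, 6, 8, 8))

fncFact <- function(datfra) {
    # "<<-" searches for an existing 'datfra' outside this function.
    datfra$one <<- factor(datfra$one, levels = c(1, 3, 5, 7, 9))
}

# Capture the error message instead of stopping.
res <- tryCatch(fncFact(y), error = function(e) conditionMessage(e))
res                  # the "not found" message
is.factor(y$one)     # FALSE: y was never modified
```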
I believe the canonical way of doing something like this in R is
something along the lines of:
processData <- function(dat) {
    dat$f1 <- factor(dat$f1, levels=...)
    ... ## any other manipulations you want to do
    dat
}
Then when you get new data, you just do:
newData <- processData(newData)
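A filled-in version of the outline above might look like the following sketch. The column name `one` and the levels come from the example earlier in this thread; the two data frames standing in for daily datasets are invented for illustration.

```r
processData <- function(dat) {
    # Same factor definition applied to every incoming dataset.
    dat$one <- factor(dat$one, levels = c(1, 3, 5, 7, 9))
    dat  # the last value is what the function returns
}

# Two hypothetical "daily" datasets with the same variables.
monday  <- data.frame(one = c(1, 1, 3, 3, 5, 7), two = c(2, 2, 6, 6, 8, 8))
tuesday <- data.frame(one = c(9, 7, 7, 1),       two = c(4, 4, 2, 2))

monday  <- processData(monday)
tuesday <- processData(tuesday)

# Both datasets now share one set of levels, whatever values occur.
identical(levels(monday$one), levels(tuesday$one))   # TRUE
```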
HTH,
Andy
>
> Tim
>
> >>> "Liaw, Andy" <andy_liaw at merck.com> 4/20/2005 4:03:24 PM >>>
> Wouldn't it be easier to do this?
>
> > levels(y$one) <- seq(1, 9, by=2)
> > y$one
> [1] 1 1 3 3 5 7
> attr(,"levels")
> [1] 1 3 5 7 9
>
> Andy
>
> > From: Tim Howard
> >
> > R-help,
> > After cogitating for a while, I finally figured out how to define a
> > data.frame column as factor and assign the levels within a function...
> > BUT I still need to pass the data.frame and its name separately. I can't
> > seem to find any other way to pass the name of the data.frame, rather
> > than the data.frame itself. Any suggestions on how to go about it? Is
> > there something like value(object) or name(object) that I can't find?
> >
> > #sample dataframe for this example
> > y <- data.frame(
> > one=c(1,1,3,3,5,7),
> > two=c(2,2,6,6,8,8))
> >
> > > levels(y$one) # check out levels
> > NULL
> >
> > # the function I've come up with
> > fncFact <- function(datfra, datfraNm){
> > datfra$one <- factor(datfra$one, levels=c(1,3,5,7,9))
> > assign(datfraNm, datfra, pos=1)
> > }
> >
> > >fncFact(y, "y")
> > > levels(y$one)
> > [1] "1" "3" "5" "7" "9"
> >
> > I suppose only for aesthetics and simplicity, I'd like to pass only
> > the data.frame and get the same result.
> > Thanks in advance,
> > Tim Howard
> >
> >
> > > version
> >          _
> > platform i386-pc-mingw32
> > arch     i386
> > os       mingw32
> > system   i386, mingw32
> > status
> > major    2
> > minor    0.1
> > year     2004
> > month    11
> > day      15
> > language R
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html