[R] Factors? I think?
petr.pikal at precheza.cz
Fri Sep 9 09:13:15 CEST 2011
Isn't this what merge() is designed for?
> merge(Doctors, DeptCodes, by.x="DocDepts", by.y="Depts")
  DocDepts                    Docs  DeptNames
1     1111 Christian\nChristianson      Heart
2     5555               Bob Smith      Brain
3     9999              Greg Jones Anesthesia
4     9999             Al Franklin Anesthesia
It is easy to get rid of the first column.
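For reference, here is a minimal self-contained sketch of that approach. The three department names are the ones visible in the output above; the full DeptNames vector is not shown in the thread, so the lookup table here is a cut-down assumption, and the names `merged` and `result` are only illustrative.

```r
# Toy data. Only the three departments visible in the output above are used;
# this is an assumed, reduced version of the lookup table from the thread.
Doctors <- data.frame(
  Docs     = c("Greg Jones", "Bob Smith", "Al Franklin"),
  DocDepts = c("9999", "5555", "9999"),
  stringsAsFactors = TRUE
)
DeptCodes <- data.frame(
  Depts     = c("1111", "5555", "9999"),
  DeptNames = c("Heart", "Brain", "Anesthesia"),
  stringsAsFactors = TRUE
)

# merge() joins on the values of the key columns, so factor columns with
# different level sets still line up correctly.
merged <- merge(Doctors, DeptCodes, by.x = "DocDepts", by.y = "Depts")

# Drop the key column by selecting the remaining columns by name.
result <- merged[, c("Docs", "DeptNames")]
print(result)
```

By default merge() keeps only rows with a match in both data frames, so department 1111 (no doctor in this toy data) does not appear; all.x = TRUE would keep doctors whose code is missing from the lookup table.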
> Re: [R] Factors? I think?
> It's probably easiest to think of this as a compound map (doctor -> dept
> code -> factor -> character -> integer -> dept code -> dept name as
> character) and to treat the code as such: if you already have R objects with
> the codes in them, it shouldn't be hard to do the transformation.
> Consider the following toy set up
> Docs = factor(c("Greg Jones","Bob Smith","Al Franklin","Christian\nChristianson"))
> DocDepts = factor(c("9999","5555","9999","1111"))
> Doctors = data.frame(Docs, DocDepts)
> Depts = factor(1:9 * 1111)
> DeptNames =
> DeptCodes = data.frame(Depts,DeptNames)
> # Everything in our data frames is now some sort of factor so we can't look
> # things up in the "normal" ways
> # Now, you have to do some unpleasantly long but pretty straightforward steps
> # to convert the factors in a way that makes the match work properly:
> ## Extracts "9999" as a real 9999, rather than the internal factor code
> Doctors$numbers <- as.numeric(as.character(Doctors[,2]))
> DeptCodes$values <- as.numeric(as.character(DeptCodes[,1]))
> ## Maps the department codes onto the correct rows of the DeptCodes df
> match(Doctors$numbers, DeptCodes$values)
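To illustrate why the detour through as.character() is needed, here is a small sketch using the same four codes as the toy set up. The variable name `f` is only for illustration.

```r
f <- factor(c("9999", "5555", "9999", "1111"))

# Calling as.numeric() directly returns the internal level codes,
# i.e. positions in levels(f), which are "1111", "5555", "9999".
as.numeric(f)                 # 3 2 3 1

# Going through character first recovers the actual department codes.
as.numeric(as.character(f))   # 9999 5555 9999 1111

# Equivalent alternative that converts each level only once.
as.numeric(levels(f))[f]      # 9999 5555 9999 1111
```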
> # Now we get the correct names using those row numbers
> DeptAssignments = as.character(DeptCodes[match(Doctors$numbers,
>                                                DeptCodes$values), 2])
> # Combine with doctor names to finish
> NamesandTitles = cbind(as.character(Doctors[,1]),DeptAssignments)
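If the columns are kept as character vectors rather than factors (for example by passing stringsAsFactors = FALSE to data.frame() or read.csv()), the lookup reduces to a single match() call. A sketch under that assumption, using only the three department names visible in the merge() output at the top of this message; the name `idx` is illustrative.

```r
# Assumed toy data, stored as character so no factor conversion is needed.
Doctors <- data.frame(
  Docs     = c("Greg Jones", "Bob Smith", "Al Franklin"),
  DocDepts = c("9999", "5555", "9999"),
  stringsAsFactors = FALSE
)
DeptCodes <- data.frame(
  Depts     = c("1111", "5555", "9999"),
  DeptNames = c("Heart", "Brain", "Anesthesia"),
  stringsAsFactors = FALSE
)

# For each doctor, match() gives the row of DeptCodes holding that code.
idx <- match(Doctors$DocDepts, DeptCodes$Depts)

# Index the name column by those rows; a code missing from DeptCodes gives NA.
Doctors$DeptNames <- DeptCodes$DeptNames[idx]
print(Doctors[, c("Docs", "DeptNames")])
```

Unlike merge(), this keeps the doctors in their original row order.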
> It's not the most elegant way of doing it, but hopefully it gives some
> insight into how to work with factors. If you can send a little more
> information about how your data is currently stored we can optimize this
> into something easily repeatable but without specifics, I have to work
> Hope this helps,
> Michael Weylandt
> On Thu, Sep 8, 2011 at 6:36 PM, Totally Inept <kramer877 at gmail.com>
> > First of all, let me apologize, as this is probably an absurdly basic
> > question. I did search before asking, but perhaps my ineptitude didn't
> > allow me to apply what I read to what I'm doing. Totally new to R, and
> > haven't done any code in any language in a long time.
> > Basically I've got categories. They're department codes for doctors (e.g.
> > 9999 for radiology or 5555 for endocrinology), which of course means
> > there are a good number of them, i.e. it's not practical for me to type
> > them all out as I usually see in examples of categorical variables
> > (factors).
> > And then I've got a list of doctors that I'm actually interested in. I have
> > the department codes associated with each, but I need to map the department
> > name to the doctor name. So I might have Greg Jones, Bob Smith, Tom
> > etc... to go with 1234, 9999, 2222, etc.
> > I need to turn Greg Jones, Bob Smith, ... and 1234, 9999, ... into Greg
> > Jones, Bob Smith, ... Cardiology, Radiology, ....
> > Obviously I could just search and replace within the csv files but I want
> > something durable that I can run things through repeatedly.
> > Anyhow, thanks to anyone willing to humor me with an answer.
> > --
> > View this message in context:
> > http://r.789695.n4.nabble.com/Factors-I-think-tp3800413p3800413.html
> > Sent from the R help mailing list archive at Nabble.com.
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.