[R] Factors? I think?

Petr PIKAL petr.pikal at precheza.cz
Fri Sep 9 09:13:15 CEST 2011


Hi

Isn't this what merge is designed for?

> merge(Doctors, DeptCodes, by.x="DocDepts", by.y="Depts")
  DocDepts                    Docs  DeptNames
1     1111 Christian\nChristianson      Heart
2     5555               Bob Smith      Brain
3     9999              Greg Jones Anesthesia
4     9999             Al Franklin Anesthesia

It is easy to get rid of the first column.
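
As a minimal sketch of dropping that column: the block below rebuilds the toy data from the quoted message with plain character columns (and the last doctor's name on one line) so that it runs on its own. The names `merged` and `result` and the `all.x = TRUE` argument are illustrative additions, not part of the call above.

```r
# Rebuild the toy data from the quoted message, using character columns.
Doctors <- data.frame(
  Docs     = c("Greg Jones", "Bob Smith", "Al Franklin", "Christian Christianson"),
  DocDepts = c("9999", "5555", "9999", "1111"),
  stringsAsFactors = FALSE
)
DeptCodes <- data.frame(
  Depts     = as.character(1:9 * 1111),
  DeptNames = c("Heart", "Kidney", "Feet", "Teeth", "Brain",
                "Digestive", "Diagnostic", "Surgery", "Anesthesia"),
  stringsAsFactors = FALSE
)

# Join on the department code; all.x = TRUE keeps doctors whose code is
# missing from DeptCodes (their DeptNames would be NA).
merged <- merge(Doctors, DeptCodes, by.x = "DocDepts", by.y = "Depts",
                all.x = TRUE)

# Drop the key column by selecting the remaining columns by name.
result <- merged[, c("Docs", "DeptNames")]
print(result)
```

Selecting columns by name rather than by position should keep working if the column order of the merged data frame changes.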

Regards
Petr


> Re: [R] Factors? I think?
> 
> It's probably easiest to think of this as a compound map (doctor -> dept
> code -> factor -> character -> integer -> dept code -> dept name as
> character) and to treat the code as such: if you already have R objects with
> the codes in them, it shouldn't be hard to do the transformation.
> 
> Consider the following toy set up
> 
> Docs = factor(c("Greg Jones","Bob Smith","Al Franklin","Christian
> Christianson"))
> DocDepts = factor(c("9999","5555","9999","1111"))
> Doctors = data.frame(Docs, DocDepts)
> 
> Depts = factor(1:9 * 1111)
> DeptNames = factor(c("Heart","Kidney","Feet","Teeth","Brain",
>     "Digestive","Diagnostic","Surgery","Anesthesia"))
> DeptCodes = data.frame(Depts,DeptNames)
> # Everything in our data frames is now some sort of factor so we can't
> # match things up in the "normal" ways
> 
> # Now, you have to do some unpleasantly long but pretty straightforward
> # code to convert the factors in a way that makes the match work properly:
> 
> # Will extract the "9999" as a real 9999, rather than the internal factor code
> Doctors$numbers <- as.numeric(as.character(Doctors[,2]))
> DeptCodes$values <- as.numeric(as.character(DeptCodes[,1]))
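
To illustrate why the as.character() step matters, here is a small sketch (the variable names `f`, `codes` and `values` are only for illustration): calling as.numeric() directly on a factor returns the internal level codes, not the numbers the labels spell out.

```r
f <- factor(c("9999", "5555", "9999", "1111"))

# Direct conversion gives the internal integer codes of the sorted levels
# ("1111" is level 1, "5555" is level 2, "9999" is level 3).
codes <- as.numeric(f)

# Going through character first recovers the department numbers themselves.
values <- as.numeric(as.character(f))

print(codes)
print(values)
```
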
> 
> # Will map the department numbers onto the correct rows of the DeptCodes df
> match(Doctors$numbers, DeptCodes$values)
> 
> # Now we get the correct names using those row numbers
> DeptAssignments = as.character(DeptCodes[match(Doctors$numbers,
> DeptCodes$values),2])
> 
> # Combine with doctor names to finish
> NamesandTitles = cbind(as.character(Doctors[,1]),DeptAssignments)
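
The steps above can be collected into one small function so they can be rerun on new data. This is only a sketch: the name `lookup_dept` and its arguments are hypothetical, and it compares the codes as character strings instead of converting them to numbers, which is an assumption made here to cope with codes that are not purely numeric.

```r
# Hypothetical helper: return the department name for each doctor's code.
lookup_dept <- function(doc_codes, dept_codes, dept_names) {
  # as.character() works for factors and plain vectors alike
  idx <- match(as.character(doc_codes), as.character(dept_codes))
  as.character(dept_names)[idx]   # NA where a code has no match
}

Docs      <- factor(c("Greg Jones", "Bob Smith", "Al Franklin"))
DocDepts  <- factor(c("9999", "5555", "9999"))
Depts     <- factor(1:9 * 1111)
DeptNames <- factor(c("Heart", "Kidney", "Feet", "Teeth", "Brain",
                      "Digestive", "Diagnostic", "Surgery", "Anesthesia"))

# Combine doctor names with the looked-up department names.
NamesandTitles <- data.frame(
  Doctor = as.character(Docs),
  Dept   = lookup_dept(DocDepts, Depts, DeptNames),
  stringsAsFactors = FALSE
)
print(NamesandTitles)
```

A code that does not appear in the lookup table comes back as NA rather than raising an error, so unmatched doctors are easy to spot afterwards.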
> 
> It's not the most elegant way of doing it, but hopefully it gives some
> insight into how to work with factors. If you can send a little more
> information about how your data is currently stored we can optimize this
> into something easily repeatable but without specifics, I have to work in
> generalities.
> 
> Hope this helps,
> 
> Michael Weylandt
> 
> On Thu, Sep 8, 2011 at 6:36 PM, Totally Inept <kramer877 at gmail.com> wrote:
> 
> > First of all, let me apologize, as this is probably an absurdly basic
> > question. I did search before asking, but perhaps my ineptitude didn't
> > allow me to apply what I read to what I'm doing. Totally new to R, and
> > haven't done any code in any language in a long time.
> >
> > Basically I've got categories. They're department codes for doctors (say,
> > 9999 for radiology or 5555 for endocrinology), which of course means that
> > there are a good number of them, i.e. it's not practical for me to write
> > them all out as I usually see in examples of categorical variables
> > (factors).
> >
> > And then I've got a list of doctors that I'm actually interested in. I have
> > the department codes associated with each, but I need to map the department
> > name to the doctor name. So I might have Greg Jones, Bob Smith, Tom Wilson,
> > etc... to go with 1234, 9999, 2222, etc.
> >
> > I need to turn Greg Jones, Bob Smith, ... and 1234, 9999, ... into Greg
> > Jones, Bob Smith, ... Cardiology, Radiology, ....
> >
> > Obviously I could just search and replace within the csv files but I need
> > something durable that I can run things through repeatedly.
> >
> > Anyhow, thanks to anyone willing to humor me with an answer.
> >
> > --
> > View this message in context:
> > http://r.789695.n4.nabble.com/Factors-I-think-tp3800413p3800413.html
> > Sent from the R help mailing list archive at Nabble.com.
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 


