[R] Factors? I think?

Petr PIKAL petr.pikal at precheza.cz
Fri Sep 9 09:13:15 CEST 2011

```Hi

Isn't this what merge is designed for?

> merge(Doctors, DeptCodes, by.x="DocDepts", by.y="Depts")
  DocDepts                    Docs  DeptNames
1     1111 Christian\nChristianson      Heart
2     5555               Bob Smith      Brain
3     9999              Greg Jones Anesthesia
4     9999             Al Franklin Anesthesia

It is easy to get rid of the first column.
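
# A minimal sketch of that step, assuming toy data like that in the quoted
# message (rebuilt here so the snippet runs on its own; `merged` and `result`
# are illustrative names, not from the thread).
Doctors <- data.frame(
  Docs     = c("Greg Jones", "Bob Smith", "Al Franklin", "Christian Christianson"),
  DocDepts = c("9999", "5555", "9999", "1111"),
  stringsAsFactors = TRUE
)
DeptCodes <- data.frame(
  Depts     = factor(1:9 * 1111),
  DeptNames = c("Heart", "Kidney", "Feet", "Teeth", "Brain",
                "Digestive", "Diagnostic", "Surgery", "Anesthesia"),
  stringsAsFactors = TRUE
)
# Join on the department code, then keep only the two name columns.
merged <- merge(Doctors, DeptCodes, by.x = "DocDepts", by.y = "Depts")
result <- merged[, c("Docs", "DeptNames")]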

Regards
Petr

> Re: [R] Factors? I think?
>
> It's probably easiest to think of this as a compound map (doctor -> dept
> code -> factor -> character -> integer -> dept code -> dept name as
> character) and to treat the code as such: if you already have R objects with
> the codes in them, it shouldn't be hard to do the transformation.
>
> Consider the following toy set up
>
> Docs = factor(c("Greg Jones","Bob Smith","Al Franklin","Christian
> Christianson"))
> DocDepts = factor(c("9999","5555","9999","1111"))
> Doctors = data.frame(Docs, DocDepts)
>
> Depts = factor(1:9 * 1111)
> DeptNames =
> factor(c("Heart","Kidney","Feet","Teeth","Brain","Digestive","Diagnostic","Surgery","Anesthesia"))
> DeptCodes = data.frame(Depts,DeptNames)
> # Everything in our data frames is now some sort of factor so we can't match
> # things up in the "normal" ways
>
> # Now, you have to write some unpleasantly long but pretty straightforward code
> # to convert the factors so that the match works properly:
>
> Doctors$numbers <- as.numeric(as.character(Doctors[,2])) ## Will extract the
> ## "9999" as a real 9999, rather than the internal factor code
> DeptCodes$values <- as.numeric(as.character(DeptCodes[,1]))
>
> match(Doctors$numbers, DeptCodes$values) ## Will map the department numbers
> ## onto the correct rows of the DeptCodes df
>
> # Now we get the correct names using those row numbers
> DeptAssignments = as.character(DeptCodes[match(Doctors$numbers,
> DeptCodes$values),2])
>
> # Combine with doctor names to finish
> NamesandTitles = cbind(as.character(Doctors[,1]),DeptAssignments)
>
> It's not the most elegant way of doing it, but hopefully it gives some
> insight into how to work with factors. If you can send a little more
> information about how your data is currently stored, we can optimize this
> into something easily repeatable, but without specifics I have to work in
> generalities.
>
> Hope this helps,
>
> Michael Weylandt
>
> On Thu, Sep 8, 2011 at 6:36 PM, Totally Inept <kramer877 at gmail.com> wrote:
>
> > First of all, let me apologize, as this is probably an absurdly basic
> > question. I did search before asking, but perhaps my ineptitude didn't allow
> > me to apply what I read to what I'm doing. Totally new to R, and haven't
> > done any code in any language in a long time.
> >
> > Basically I've got categories. They're department codes for doctors (say,
> > 9999 for radiology or 5555 for endocrinology), which of course means that
> > there are a good number of them, i.e. it's not practical for me to write
> > them all out as I usually see in examples of categorical variables
> > (factors).
> >
> > And then I've got a list of doctors that I'm actually interested in. I have
> > the department codes associated with each, but I need to map the department
> > name to the doctor name. So I might have Greg Jones, Bob Smith, Tom Wilson,
> > etc... to go with 1234, 9999, 2222, etc.
> >
> > I need to turn Greg Jones, Bob Smith, ... and 1234, 9999, ... into Greg
> > Jones, Bob Smith, ... Cardiology, Radiology, ....
> >
> > Obviously I could just search and replace within the csv files but I need
> > something durable that I can run things through repeatedly.
> >
> > Anyhow, thanks to anyone willing to humor me with an answer.
> >
> > --
> > View this message in context:
> > http://r.789695.n4.nabble.com/Factors-I-think-tp3800413p3800413.html
> > Sent from the R help mailing list archive at Nabble.com.
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>