[R] multiple values in one column
Sarah Goslee
sarah.goslee at gmail.com
Fri Apr 6 21:59:39 CEST 2012
To the best of my knowledge, you can't skip step #2, at least not without
using much more complicated work-arounds like including a gsub() step
within the call to table(), and in everything else you do with those
data.
Computers are generally better at dealing with normalized data, which
is what you're constructing in step #2.
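
That said, step #2 doesn't have to happen in Excel. Here is a minimal
sketch in base R, assuming the data frame is called "students" as in your
message and that multiple majors are separated by commas with no
surrounding spaces. The name students.long is only for illustration.

```r
# Example data as described in the question
students <- data.frame(
  first = c("John", "Jane"),
  last  = c("Smith", "Doe"),
  sex   = c("M", "F"),
  major = c("ANTH", "HIST,BIOL"),
  stringsAsFactors = FALSE
)

# Split each major field on commas: a list with one character vector per row
majors <- strsplit(students$major, ",", fixed = TRUE)

# Repeat each row once per major it lists, then fill in the single values
n.majors <- sapply(majors, length)
students.long <- students[rep(seq_len(nrow(students)), n.majors), ]
students.long$major <- unlist(majors)
rownames(students.long) <- NULL

# Step #3 then works as before
table(students.long$sex, students.long$major)
```

Note that a row with an empty major field would be dropped by this
approach, since it contributes zero repeats.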
Sarah
On Fri, Apr 6, 2012 at 3:53 PM, John D. Muccigrosso
<internetj at muccigrosso.org> wrote:
> On Apr 6, 2012, at 9:09 AM, John D. Muccigrosso wrote:
>
>> I have some data files in which some fields have multiple values. For example
>>
>> first last sex major
>> John Smith M ANTH
>> Jane Doe F HIST,BIOL
>>
>> What's the best R-like way to handle these data (Jane's major in my example), so that I can do things like summarize the other fields by them (e.g., sex by major)?
>>
>> Right now I'm processing the files (in Excel, since they're spreadsheets) by duplicating lines with two values in the major field, eliminating one value per row. I suspect there's a nifty R way to do this.
>
>
> I've gotten a few responses, for which I'm grateful, but either I don't quite see how they answer my question, or I didn't phrase my question well, both of which are equally possible. :-)
>
> So, given the data as above, let's call it "students", I have no problem turning it into:
>
> first last sex major
> John Smith M ANTH
> Jane Doe F HIST
> Jane Doe F BIOL
>
> What I then do with this is things like
>
> table(students$sex, students$major)
>
> So, three steps:
>
> 1. Get data with multiple values per field.
> 2. Turn it into a data frame with only one value per field (by duplicating lines).
> 3. Do things like table().
>
> I'd like to be able to skip #2.
>
> Thanks.
>
> John Muccigrosso
>
--
Sarah Goslee
http://www.functionaldiversity.org