[R] multiple values in one column

Sarah Goslee sarah.goslee at gmail.com
Fri Apr 6 21:59:39 CEST 2012


To the best of my knowledge, you can't skip step #2, at least not without
using much more complicated work-arounds like including a gsub() step
within the call to table(), and in everything else you do with those
data.

Computers are generally better at dealing with normalized data, which
is what you're constructing in step #2.
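
For what it's worth, step #2 doesn't have to happen in Excel. Here is a
minimal sketch of doing it in R, assuming the data frame is called
students as in your message and that the majors are separated by plain
commas with no surrounding spaces. The names majors, n_majors and
students_long are just ones I made up for the illustration.

```r
# Example data from the original question (assumed to be read in as
# character columns, not factors).
students <- data.frame(
  first = c("John", "Jane"),
  last  = c("Smith", "Doe"),
  sex   = c("M", "F"),
  major = c("ANTH", "HIST,BIOL"),
  stringsAsFactors = FALSE
)

# Split each major field on commas; this gives a list with one
# character vector per student.
majors <- strsplit(students$major, ",", fixed = TRUE)

# Count how many majors each student has.
n_majors <- vapply(majors, length, integer(1))

# Repeat each row once per major, then overwrite the major column
# with the individual values.
students_long <- students[rep(seq_len(nrow(students)), n_majors), ]
students_long$major <- unlist(majors)
rownames(students_long) <- NULL

# Step #3 then works as before.
table(students_long$sex, students_long$major)
```

If the real files have spaces after the commas, the split pattern would
need adjusting (or the values trimmed afterwards).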

Sarah

On Fri, Apr 6, 2012 at 3:53 PM, John D. Muccigrosso
<internetj at muccigrosso.org> wrote:
> On Apr 6, 2012, at 9:09 AM, John D. Muccigrosso wrote:
>
>> I have some data files in which some fields have multiple values. For example
>>
>> first  last   sex   major
>> John   Smith  M     ANTH
>> Jane   Doe    F     HIST,BIOL
>>
>> What's the best R-like way to handle these data (Jane's major in my example), so that I can do things like summarize the other fields by them (e.g., sex by major)?
>>
>> Right now I'm processing the files (in excel since they're spreadsheets) by duplicating lines with two values in the major field, eliminating one value per row. I suspect there's a nifty R way to do this.
>
>
> I've gotten a few responses, for which I'm grateful, but either I don't quite see how they answer my question, or I didn't phrase my question well, both of which are equally possible. :-)
>
> So, given the data as above, let's call it "students", I have no problem turning it into:
>
> first  last   sex   major
> John   Smith  M     ANTH
> Jane   Doe    F     HIST
> Jane   Doe    F     BIOL
>
> What I then do with this is things like
>
> table(students$sex, students$major)
>
> So, three steps:
>
> 1. Get data with multiple values per field.
> 2. Turn it into a data frame with only one value per field (by duplicating lines).
> 3. Do things like table().
>
> I'd like to be able to skip #2.
>
> Thanks.
>
> John Muccigrosso
>

-- 
Sarah Goslee
http://www.functionaldiversity.org


