To the best of my knowledge, you can't skip step #2, at least not
without much more complicated workarounds, such as including a gsub()
step within the call to table() and in everything else you do with
those data.

Computers are generally better at dealing with normalized data, which
is what you're constructing in step #2.
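
That said, step #2 doesn't have to happen in Excel. Here is a minimal
sketch of doing it in R, assuming the data frame is called "students"
as in your message, that the values in major are separated by commas,
and that your R is recent enough to have lengths() and trimws(). The
name students_long is just something I made up for illustration.

```r
# Example data matching the thread; in practice you would read your own file.
students <- data.frame(
  first = c("John", "Jane"),
  last  = c("Smith", "Doe"),
  sex   = c("M", "F"),
  major = c("ANTH", "HIST,BIOL"),
  stringsAsFactors = FALSE
)

# Split each major field on commas; this gives a list with one
# character vector per student.
majors <- strsplit(students$major, ",", fixed = TRUE)

# Repeat each row once per major it lists, then fill in the single majors.
students_long <- students[rep(seq_len(nrow(students)), lengths(majors)), ]
students_long$major <- trimws(unlist(majors))
rownames(students_long) <- NULL

# Step #3 then works as before.
table(students_long$sex, students_long$major)
```

Note that a row with an empty major field would be dropped by this
approach, since strsplit() returns a zero-length vector for an empty
string, so check for those first if they can occur in your files.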

Sarah

On Fri, Apr 6, 2012 at 3:53 PM, John D. Muccigrosso
<intern...@muccigrosso.org> wrote:
> On Apr 6, 2012, at 9:09 AM, John D. Muccigrosso wrote:
>
>> I have some data files in which some fields have multiple values. For example
>>
>> first  last   sex   major
>> John   Smith  M     ANTH
>> Jane   Doe    F     HIST,BIOL
>>
>> What's the best R-like way to handle these data (Jane's major in my 
>> example), so that I can do things like summarize the other fields by them 
>> (e.g., sex by major)?
>>
>> Right now I'm processing the files (in excel since they're spreadsheets) by 
>> duplicating lines with two values in the major field, eliminating one value 
>> per row. I suspect there's a nifty R way to do this.
>
>
> I've gotten a few responses, for which I'm grateful, but either I don't quite 
> see how they answer my question, or I didn't phrase my question well, both of 
> which are equally possible. :-)
>
> So, given the data as above, let's call it "students", I have no problem 
> turning it into:
>
> first  last   sex   major
> John   Smith  M     ANTH
> Jane   Doe    F     HIST
> Jane   Doe    F     BIOL
>
> What I then do with this is things like
>
> table(students$sex, students$major)
>
> So, three steps:
>
> 1. Get data with multiple values per field.
> 2. Turn it into a data frame with only one value per field (by duplicating 
> lines).
> 3. Do things like table().
>
> I'd like to be able to skip #2.
>
> Thanks.
>
> John Muccigrosso
>

-- 
Sarah Goslee
http://www.functionaldiversity.org

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
