Thanks for the reply, Jim. Here is a representation of the data I want to analyze - 10 records, as requested. Each line can easily include an ID number, as below.
So I want to determine the frequency or percentage of respondents who bring each of the 5 foods (Hamburgers, Hot Dogs, Potato Salad, Cole Slaw and BBQ Chicken) and how many "Other" write-ins there are. The same goes for what else is brought besides food (Softball, Volleyball, Lawn Chairs and Blanket), again with a count of "Other" write-ins. I'll also need to be able to discern how many brought Hamburgers AND a Blanket, or how many brought a Softball AND a Volleyball, etc.

ID|Your Name|Do you picnic?|What is your favorite picnic food?|What do you bring besides food?
1|Yogi Bear|Yes|Hamburgers;#Hot Dogs;#I rely on others to bring the good stuff|"Softball;#Blanket;#I bring boo-boo, but he hides"
2|Boo-Boo|Yes|Potato Salad;#Cole Slaw;#whatever Yogi doesn't eat|Lawn Chairs;#Blanket;#my running shoes
3|Ranger Rick|No|I told you I don't picnic|a big net and handcuffs
4|Magilla Gorilla|Yes|Hamburgers;#Hot Dogs;#Potato Salad;#Cole Slaw;#BBQ Chicken|Softball;#Volleyball;#Lawn Chairs;#Blanket
5|Foghorn Leghorn|Yes|"Hot Dogs;#Cole Slaw;#I say, I say, BBQ Chicken?"|Softball;#Blanket
6|Peter Potamus|Yes|"Hamburgers;#Hot Dogs;#anything, just a lot of it"|Softball;#Lawn Chairs;#hot air balloon
7|Jonny Quest|No|too busy getting into and out of trouble|Hadji and Bandit
8|"Fleegle, Bingo, Drooper and Snorky"|Yes|Hamburgers;#Hot Dogs;#Potato Salad;#Cole Slaw;#A banana split|a laugh track
9|George Jetson|No|Mr. Spacely is making me work|Lawn Chairs;#Blanket;#my flying car
10|Snagglepuss|Yes|Hamburgers;#Hot Dogs;#Potato Salad;#Cole Slaw;#BBQ Chicken|Softball;#Heavens to Murgatroyd! Exit stage left!

Thanks in advance,
Dale

-----Original Message-----
From: jim holtman [mailto:[EMAIL PROTECTED]
Sent: Saturday, July 12, 2008 11:32 AM
To: Hohm, Dale
Cc: r-help@r-project.org
Subject: Re: [R] Reading Multi-value data fields for descriptive analysis

Can you provide a more complete example (say 10 lines) of what the input is like? Does each line have a unique index that can be related to it?
Do you want to summarize all the multi1-n values of Col2? Do you want to know the percentage of input lines that have a Col3/multi-value4 on them? You could read in the data as you have indicated below and add a column that is the record number, and then you would not have to worry about trying to say whether a value existed or not. For example, you might have:

Rec#|col#|value
1|1|single
1|2|multi1
1|2|multi2
1|3|multi1
2|1|single
3|1|single
3|2|multi1
....

There are a number of potential ways of representing the data, but a lot depends on what you want to do with it, so a more extensive example of the input, along with the type of output you would like, will help in providing an answer.

On Sat, Jul 12, 2008 at 12:37 PM, Hohm, Dale <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I'm looking for help on the best approach to get "multi-value" data fields
> into R for simple descriptive analysis.
>
> -------------------------------------
>
> I am new to this list and new to R, but I really want to get over the hump
> and get productive with it. Some help with how to best get the following
> data into R would be greatly appreciated. I have programming experience and
> stale experience with SPSS.
>
> I am trying to do some simple descriptive analysis (frequencies, cross-tabs)
> of data stored in a Microsoft SharePoint list. The data can be accessed with
> ODBC or it can readily be extracted into an Excel or CSV format. One of the
> challenges with the data is that it uses several "multi-value" fields
> (Microsoft Access provides the same data-type).
>
> By "multi-value" I mean that multiple responses are packed into a single data
> column; the data input form presents a question with several checkboxes and a
> free-format write-in response. The individual values within the data field
> are separated with the two characters ";#". So, the data would be of the
> following format (in CSV form with column headers and a tilde as the field
> separator):
>
> Column1single~Column2multi~Column3multi
> a sample value~C2 a multi one;#C2 a multi two~C3 a multi one;#C3 a multi two;#C3 a free-form answer
>
>
> The first approach that comes to mind is to explode the multi-value fields
> into unique binary data columns and then assign a 0 or 1 to these new
> columns in each record based on whether that specific value was present.
> This approach is complicated by the free-form answer, as the unique columns
> could grow very large in number - it might be better to figure out how to
> indicate the presence of the free-form value in a data column called "Other"
> (or "C2 Other") and then hold the free-form value in a separate column.
>
> The data would then look like this...
>
> Column1single: a sample value
> C2 a multi one: 1
> C2 a multi two: 1
> C2 a multi three: 0
> C3 a multi one: 1
> C3 a multi two: 1
> C3 a free-form answer: 1
> C3 another free-form answer: 0
>
>
> Or in the second scenario...
>
> Column1single: a sample value
> C2 a multi one: 1
> C2 a multi two: 1
> C2 a multi three: 0
> C3 a multi one: 1
> C3 a multi two: 1
> C3 Other: 1
> C3 Other Text: a free-form answer
>
>
> I am uncertain how to read this data into R in this format, so suggestions
> and examples would help me greatly.
>
> This is a pretty common data packing scenario, so perhaps there are better
> approaches to reading this data and better ways in R to analyze it than what
> I have presented. Suggestions greatly appreciated.
>
>
> Thanks,
>
> Dale Hohm
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
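
One way the ten sample records above might be read and tabulated in R is sketched below, following the "0/1 columns plus an Other column" layout proposed in the original message. The inline copy of the data, the short column names, and the helper split_multi() are assumptions made for illustration; a real run would read the SharePoint export instead. Two read.table() arguments matter here: comment.char = "" because the default comment character "#" would cut every field at the first ";#", and quote = "\"" so that apostrophes in the write-ins are not treated as quotes.

```r
# Sample records from the thread, pasted inline for the sketch.
records <- c(
  "ID|Your Name|Do you picnic?|What is your favorite picnic food?|What do you bring besides food?",
  "1|Yogi Bear|Yes|Hamburgers;#Hot Dogs;#I rely on others to bring the good stuff|\"Softball;#Blanket;#I bring boo-boo, but he hides\"",
  "2|Boo-Boo|Yes|Potato Salad;#Cole Slaw;#whatever Yogi doesn't eat|Lawn Chairs;#Blanket;#my running shoes",
  "3|Ranger Rick|No|I told you I don't picnic|a big net and handcuffs",
  "4|Magilla Gorilla|Yes|Hamburgers;#Hot Dogs;#Potato Salad;#Cole Slaw;#BBQ Chicken|Softball;#Volleyball;#Lawn Chairs;#Blanket",
  "5|Foghorn Leghorn|Yes|\"Hot Dogs;#Cole Slaw;#I say, I say, BBQ Chicken?\"|Softball;#Blanket",
  "6|Peter Potamus|Yes|\"Hamburgers;#Hot Dogs;#anything, just a lot of it\"|Softball;#Lawn Chairs;#hot air balloon",
  "7|Jonny Quest|No|too busy getting into and out of trouble|Hadji and Bandit",
  "8|\"Fleegle, Bingo, Drooper and Snorky\"|Yes|Hamburgers;#Hot Dogs;#Potato Salad;#Cole Slaw;#A banana split|a laugh track",
  "9|George Jetson|No|Mr. Spacely is making me work|Lawn Chairs;#Blanket;#my flying car",
  "10|Snagglepuss|Yes|Hamburgers;#Hot Dogs;#Potato Salad;#Cole Slaw;#BBQ Chicken|Softball;#Heavens to Murgatroyd! Exit stage left!"
)

d <- read.table(text = records, sep = "|", header = TRUE, quote = "\"",
                comment.char = "", stringsAsFactors = FALSE, check.names = FALSE)
names(d) <- c("id", "name", "picnics", "food", "items")  # assumed short names

# Hypothetical helper: split a ";#"-packed column into one 0/1 column per
# checkbox choice, plus an "Other" flag and the write-in text itself.
split_multi <- function(field, choices) {
  parts <- strsplit(field, ";#", fixed = TRUE)
  known <- t(vapply(parts, function(p) choices %in% p, logical(length(choices))))
  colnames(known) <- choices
  write_in <- vapply(parts,
                     function(p) paste(setdiff(p, choices), collapse = "; "),
                     character(1))
  list(flags = cbind(known, Other = nzchar(write_in)) * 1L, write_in = write_in)
}

food  <- split_multi(d$food,  c("Hamburgers", "Hot Dogs", "Potato Salad",
                                "Cole Slaw", "BBQ Chicken"))
items <- split_multi(d$items, c("Softball", "Volleyball", "Lawn Chairs", "Blanket"))

# Frequencies and percentages of respondents per choice
print(colSums(food$flags))
print(100 * colMeans(items$flags))

# Cross-tab of one food against one item, and the count of respondents with both
print(table(Hamburgers = food$flags[, "Hamburgers"], Blanket = items$flags[, "Blanket"]))
both <- sum(food$flags[, "Hamburgers"] & items$flags[, "Blanket"])

# All food-by-item co-occurrence counts at once
pairs <- crossprod(food$flags, items$flags)
print(pairs)
```

Keeping the flags as a 0/1 matrix means colSums() gives the frequencies directly and crossprod() gives every "A AND B" count in one step; the write-in text stays available in food$write_in and items$write_in for separate review.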