Here is a solution using strapply in the gsubfn package. First we define a proto object p containing a single method (i.e. function) called fun. fun takes one [...] construct, splits it into the numeric vector v using strsplit, and assigns names to the elements of v. strapply automatically maintains a built-in variable, count, in the proto object (the number of the current match within the string being processed); fun uses it to determine which letter to use.
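To make the role of fun concrete, here is a minimal base-R sketch of what it does for a single match. name_group is a hypothetical stand-in and not part of gsubfn, and count is passed in by hand here as an assumption, whereas in the actual solution strapply maintains it in the proto object:

```r
# Hypothetical stand-in for p$fun: `count` is an explicit argument here,
# whereas strapply sets it automatically inside the proto object.
name_group <- function(x, count) {
    # split the captured text on commas and convert the pieces to numbers
    v <- as.numeric(strsplit(x, ",")[[1]])
    # name the elements with the count-th letter plus a running index
    names(v) <- paste(LETTERS[count], seq_along(v), sep = "")
    v
}

first  <- name_group("1, 0, 0", 1)  # text from the first [...] group
second <- name_group("0, 1", 2)     # text from the second [...] group
print(c(first, second))
```

Concatenating the two named vectors gives one row of the desired result, with names A1, A2, A3, B1, B2.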
Next, use strapply to apply fun in p to each substring matching the regexp "\\[([01, ]*)\\]". This regexp matches [ followed by a string of characters made up of 0, 1, comma and space, followed by ], and p$fun is applied to each such occurrence. (Modify the regexp appropriately if the true problem has different characteristics.) Finally, simplify = rbind will cause the resulting vectors to be rbind'ed together. (If the different rows of myDF do not have the same structure then omit the simplify = rbind argument of strapply to get a list instead.)

    library(gsubfn)  # gsubfn depends on proto, so proto() is available

    p <- proto(fun = function(this, x) {
        # x is the text matched inside one [...] group
        v <- as.numeric(strsplit(x, ",")[[1]])
        # count is maintained by strapply and picks the letter for this group
        names(v) <- paste(LETTERS[count], seq_along(v), sep = "")
        v
    })
    strapply(as.character(myDF[[1]]), "\\[([01, ]*)\\]", p, simplify = rbind)

Here is what the output looks like:

    > strapply(as.character(myDF[[1]]), "\\[([01, ]*)\\]", p, simplify = rbind)
         A1 A2 A3 B1 B2
    [1,]  1  0  0  0  1
    [2,]  1  1  0  0  1
    [3,]  1  0  0  1  1
    [4,]  0  0  1  0  1

See http://gsubfn.googlecode.com and the gsubfn vignette for more info.

On Thu, Feb 18, 2010 at 3:29 AM, milton ruser <milton.ru...@gmail.com> wrote:
> Dear all,
>
> I have a data.frame with a column like the x shown below
> myDF<-data.frame(cbind(x=c("[[1, 0, 0], [0, 1]]",
> "[[1, 1, 0], [0, 1]]","[[1, 0, 0], [1, 1]]",
> "[[0, 0, 1], [0, 1]]")))
>> myDF
>                     x
> 1 [[1, 0, 0], [0, 1]]
> 2 [[1, 1, 0], [0, 1]]
> 3 [[1, 0, 0], [1, 1]]
> 4 [[0, 0, 1], [0, 1]]
>
> As you can see my x column is composed of some
> strings between [[]], using commas to separate
> some "fields".
>
> I need to identify the number of
> groups inside the main [ ] and label each
> group with a different sequential string.
> In the example above I would like to have:
>
>           A      B
> 1 [1, 0, 0] [0, 1]
> 2 [1, 1, 0] [0, 1]
> 3 [1, 0, 0] [1, 1]
> 4 [0, 0, 1] [0, 1]
> Although here I have only two groups, my
> real dataset will have many more (~30).
> After identifying the groups I would like
> to identify the subgroups:
>   A1 A2 A3 B1 B2
> 1  1  0  0  0  1
> 2  1  1  0  0  1
> 3  1  0  0  1  1
> 4  0  0  1  0  1
>
> Any hints are welcome.
>
> milton ribeiro
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.