I am sure there is an obvious answer to this that I'm missing, but I can't find it. I'm parsing the headers of emails, and most have a date like this: "Wed, 16 Nov 2005 05:28:00 -0800", which I can parse using:

    tmp.dat.data <- matrix(unlist(strsplit(headers$Date.line, ",")), ncol = 2, byrow = TRUE)

before going on to look at the day and date/time data. However, a very few of the headers I want to parse are missing the initial day of the week and look like this: "15 Nov 2005 09:10:00 +0100". For those entries strsplit() returns the whole date/time string as a single item, so the effect of matrix(unlist()) is to pull the next list entry "up" in the matrix. Because I happened to have only two errant entries, I didn't see what was happening for a moment. (An odd number of them gives a warning message about the dimensions not fitting, but an even number silently moves things up/left and so doesn't: no quarrel with that from me, my stupidity that I was slow to see what was happening!)

I'm sure I should be able to find a simple way to get around this, but at the moment I can't. Here's a simple, reproducible example:

    dat <- c("Tue, 15 Nov 2005 09:44:50 EST",
             "15 Nov 2005 09:10:00 +0100",
             "Tue, 15 Nov 2005 09:44:50 EST",
             "Tue, 15 Nov 2005 16:29:57 +0000",
             "Wed, 16 Nov 2005 07:00:45 EST",
             "Wed, 16 Nov 2005 05:28:00 -0800",
             "Wed, 16 Nov 2005 14:06:21 +0000",
             "15 Nov 2005 09:10:00 +0100")
    tmp.dat.data <- matrix(unlist(strsplit(dat, ",")), ncol = 2, byrow = TRUE)
    tmp.dat.data

This comes out as a 7x2 matrix with contents:

         [,1]                          [,2]
    [1,] "Tue"                         " 15 Nov 2005 09:44:50 EST"
    [2,] "15 Nov 2005 09:10:00 +0100"  "Tue"
    [3,] " 15 Nov 2005 09:44:50 EST"   "Tue"
    [4,] " 15 Nov 2005 16:29:57 +0000" "Wed"
    [5,] " 16 Nov 2005 07:00:45 EST"   "Wed"
    [6,] " 16 Nov 2005 05:28:00 -0800" "Wed"
    [7,] " 16 Nov 2005 14:06:21 +0000" "15 Nov 2005 09:10:00 +0100"

I'd like an 8x2 matrix with tmp.dat.data[2,1] == "" and tmp.dat.data[8,1] == "".

I'm sure there must be a simple way to achieve this by rolling a slightly different variant of strsplit() that pads things and then applying that to the input vector, but I'm failing to see how to do this at the moment.

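For what it's worth, the padding idea described above might be sketched roughly as follows. This is only an illustration under stated assumptions: pad_split is a hypothetical helper name (not an existing R function), no date string is assumed to contain more than one comma, and trimming the leading space left after the comma is an extra step not asked for above.

```r
# Example input from the post: entries 2 and 8 lack the leading weekday.
dat <- c("Tue, 15 Nov 2005 09:44:50 EST",
         "15 Nov 2005 09:10:00 +0100",
         "Tue, 15 Nov 2005 09:44:50 EST",
         "Tue, 15 Nov 2005 16:29:57 +0000",
         "Wed, 16 Nov 2005 07:00:45 EST",
         "Wed, 16 Nov 2005 05:28:00 -0800",
         "Wed, 16 Nov 2005 14:06:21 +0000",
         "15 Nov 2005 09:10:00 +0100")

# Hypothetical helper: split each string, then left-pad short results
# with "" so that every entry contributes exactly n pieces.
pad_split <- function(x, split = ",", n = 2) {
  pieces <- strsplit(x, split, fixed = TRUE)
  # Assumption: no entry splits into more than n pieces.
  stopifnot(all(lengths(pieces) <= n))
  padded <- lapply(pieces, function(p) {
    p <- trimws(p)                      # drop the space left after the comma
    c(rep("", n - length(p)), p)        # pad on the left with empty strings
  })
  matrix(unlist(padded), ncol = n, byrow = TRUE)
}

tmp.dat.data <- pad_split(dat)
tmp.dat.data
```

Padding on the left keeps the date/time string in the second column for every row, so entries without a weekday simply get "" in the first column.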
TIA,

Chris

--
Applied researcher, neither statistician nor programmer!

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.