Inline ...

> On 2019-05-19, at 13:56, Michael Boulineau <michael.p.boulin...@gmail.com> wrote:
>
>> b <- "^([0-9-]{10} [0-9:]{8} )[*]{3} (\\w+ \\w+)"
>
> so the ^ signals that the regex BEGINS with a number (that could be
> any number, 0-9) that is only 10 characters long (then there's the
> dash in there, too, with the 0-9-, which I assume enabled the regex to
> grab the - that's between the numbers in the date)
That's right. Note that within a "character class" the hyphen can have two
meanings: normally it defines a range of characters, but if it appears as the
last character before "]" it is a literal hyphen.

> , followed by a single space, followed by a unit that could be any number,
> again, but that is only 8 characters long this time. For that one, it will
> include the colon, hence the 9:, although for that one ([0-9:]{8} ),

Right.

> I don't get why the space is on the inside in that one, after the {8},

The space needs to be preserved between the time and the name. I wrote

b <- "^([0-9-]{10} [0-9:]{8} )[*]{3} (\\w+ \\w+)"  # space in the first captured expression
c <- gsub(b, "\\1<\\2> ", a)

... but I could have written

b <- "^([0-9-]{10} [0-9:]{8}) [*]{3} (\\w+ \\w+)"
c <- gsub(b, "\\1 <\\2> ", a)  # space in the substituted string

... same result.

> whereas the space is on the outside with the other one ^([0-9-]{10} ,
> directly after the {10}. Why is that?

In the second case, I capture without a space, because I don't want the space
in the results, after the time.

> Then three *** [*]{3}, then the (\\w+ \\w+)", which Boris explained so well
> above. I guess I still don't get why this one seemed to have deleted the ***
> out of the mix, plus I still don't see why it didn't remove the *** from the
> first one.

Because the entire first line was not matched, since it had a malformed
character preceding the date.

> 2016-03-20 19:29:37 *** Jane Doe started a video chat
> 2016-03-20 19:30:35 *** John Doe ended a video chat
> 2016-04-02 12:59:36 *** Jane Doe started a video chat
> 2016-04-02 13:00:43 *** John Doe ended a video chat
> 2016-04-02 13:01:08 *** Jane Doe started a video chat
> 2016-04-02 13:01:41 *** John Doe ended a video chat
> 2016-04-02 13:03:51 *** John Doe started a video chat
> 2016-04-02 13:06:35 *** John Doe ended a video chat
>
> This is a random sample from the beginning of the txt file with no edits.
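For the archive, here is a minimal, self-contained check that the two variants (space captured inside group 1, versus space restored in the replacement) really give the same result. The sample line is invented, patterned on the transcript; note the space before [*]{3} in the second pattern, which is needed for it to match at all.

```r
# Sample line patterned on the transcript (invented data)
a <- "2016-03-20 19:29:37 *** Jane Doe started a video chat"

# Variant 1: the space after the time is captured inside group 1
b1 <- "^([0-9-]{10} [0-9:]{8} )[*]{3} (\\w+ \\w+)"
c1 <- gsub(b1, "\\1<\\2>", a)

# Variant 2: group 1 ends at the time; the space is restored in the replacement
b2 <- "^([0-9-]{10} [0-9:]{8}) [*]{3} (\\w+ \\w+)"
c2 <- gsub(b2, "\\1 <\\2>", a)

identical(c1, c2)  # TRUE: both yield "2016-03-20 19:29:37 <Jane Doe> started a video chat"
```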
> The ***s were deleted, all but the first one, the one that had the 
> but that was taken out by the encoding = "UTF-8". I know that the function
> was c <- gsub(b, "\\1<\\2> ", a), so it had a gsub () on there, the point
> of which is to do substitution work.
>
> Oh, I get it, I think. The \\1<\\2> in the gsub () puts the <> around the
> names, so that it's consistent with the rest of the data, so that the names
> in the text above that aren't enclosed in the <> are enclosed like the rest
> of them. But I still don't get why or how the gsub () replaced the *** with
> the <>...

In gsub(b, "\\1<\\2> ", a) the work is done by the backreferences \\1 and \\2.
The expression says: substitute ALL of the match with the first captured
expression, then "<", then the second captured expression, then "> ". The
rest of the line is not substituted and appears as-is.

> This one is more straightforward.
>
>> d <- "^([0-9-]{10}) ([0-9:]{8}) <(\\w+ \\w+)>\\s*(.+)$"
>
> any number with - for 10 characters, followed by a space. Oh, there's no
> space in this one ([0-9:]{8}), after the {8}. Huh. So, then, any number
> with : for 8 characters, followed by any two words separated by a space and
> enclosed in <>. And then the \\s* is followed by a single space? Or maybe
> it puts space on both sides (on the side of the #s to the left, and then
> the comment to the right). The (.+)$ is anything whatsoever until the end.

\s is the metacharacter for "whitespace"; \s* means zero or more whitespace
characters. I'm matching that OUTSIDE of the captured expression, to remove
any leading spaces from the data that goes into the data frame.

Cheers,
Boris

> Michael
>
> On Sun, May 19, 2019 at 4:37 AM Boris Steipe <boris.ste...@utoronto.ca> wrote:
>>
>> Inline
>>
>>> On 2019-05-18, at 20:34, Michael Boulineau <michael.p.boulin...@gmail.com> wrote:
>>>
>>> It appears to have worked, although there were three little quirks.
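To make the backreference mechanics concrete, a short sketch on an invented sample line, using the pattern with the trailing space from earlier in the thread: only the matched part of the string is rewritten, and the unmatched tail survives unchanged.

```r
# Invented sample line, patterned on the transcript
a <- "2016-04-02 12:59:36 *** Jane Doe started a video chat"
b <- "^([0-9-]{10} [0-9:]{8} )[*]{3} (\\w+ \\w+) "

# The match is "2016-04-02 12:59:36 *** Jane Doe " (incl. trailing space).
# gsub() replaces it with: group 1 ("2016-04-02 12:59:36 "), "<",
# group 2 ("Jane Doe"), "> ". The unmatched tail ("started a video chat")
# is appended as-is.
out <- gsub(b, "\\1<\\2> ", a)
out  # "2016-04-02 12:59:36 <Jane Doe> started a video chat"
```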
>>> The ; close(con); rm(con) didn't work for me; the first row of the
>>> data.frame was all NAs, when all was said and done;
>>
>> You will get NAs for lines that can't be matched to the regular expression.
>> That's a good thing: it allows you to test whether your assumptions were
>> valid for the entire file:
>>
>> # number of failed strcapture()
>> sum(is.na(d$date))
>>
>>> and then there were still three *** on the same line where the  was
>>> apparently deleted.
>>
>> This is a sign that something else happened with the line that prevented
>> the regex from matching. In that case you need to investigate more. I see
>> an invalid multibyte character at the beginning of the line you posted
>> below.
>>
>>>> a <- readLines ("hangouts-conversation-6.txt", encoding = "UTF-8")
>>>> b <- "^([0-9-]{10} [0-9:]{8} )[*]{3} (\\w+ \\w+)"
>>>> c <- gsub(b, "\\1<\\2> ", a)
>>>> head (c)
>>> [1] "2016-01-27 09:14:40 *** Jane Doe started a video chat"
>>> [2] "2016-01-27 09:15:20 <Jane Doe>
>>> https://lh3.googleusercontent.com/-_WQF5kRcnpk/Vqj7J4aK1jI/AAAAAAAAAVA/GVqutPqbSuo/s0/be8ded30-87a6-4e80-bdfa-83ed51591dbf"
>>
>> [...]
>>
>>> But, before I do anything else, I'm going to study the regex in this
>>> particular code. For example, I'm still not sure why there has to be the
>>> second \\w+ in the (\\w+ \\w+). Little things like that.
>>
>> \w is the metacharacter for alphanumeric characters; \w+ designates
>> something we could call a word. Thus \w+ \w+ are two words separated by a
>> single blank. This corresponds to your example, but, as I wrote previously,
>> you need to think very carefully about whether this covers all possible
>> cases (Could there be only one word? More than one blank? Could letters be
>> separated by hyphens or periods?) In most cases we could have more robustly
>> matched everything between "<" and ">" (taking care to test what happens if
>> the message contains those characters).
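The NA bookkeeping described above can be sketched end to end. The sample vector is invented (one line is deliberately malformed), and the pattern and proto follow the ones used elsewhere in the thread.

```r
# Invented sample: one well-formed line, one malformed line
c <- c("2016-01-27 09:14:40 <Jane Doe> started a video chat",
       "this line does not match the expected format")

patt  <- "^([0-9-]{10}) ([0-9:]{8}) <(\\w+ \\w+)>\\s*(.+)$"
proto <- data.frame(date = character(), time = character(),
                    name = character(), text = character(),
                    stringsAsFactors = FALSE)
d <- strcapture(patt, c, proto)

sum(is.na(d$date))   # 1: one line failed to match
c[is.na(d$date)]     # pull out the offending lines for inspection
```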
>> But for the video chat lines we need to make an assumption about what is
>> name and what is not. If "started a video chat" is the only possibility in
>> such lines, you can use this information instead. If there are other
>> possibilities, you need a different strategy. In NLP there is no
>> one-approach-fits-all.
>>
>> To validate the structure of the names in your transcripts, you can look at
>>
>> patt <- " <.+?> "   # " <any string, not greedy> "
>> m <- regexpr(patt, c)
>> unique(regmatches(c, m))
>>
>> B.
>>
>>> Michael
>>>
>>> On Sat, May 18, 2019 at 4:30 PM Boris Steipe <boris.ste...@utoronto.ca> wrote:
>>>>
>>>> This works for me:
>>>>
>>>> # sample data
>>>> c <- character()
>>>> c[1] <- "2016-01-27 09:14:40 <Jane Doe> started a video chat"
>>>> c[2] <- "2016-01-27 09:15:20 <Jane Doe> https://lh3.googleusercontent.com/"
>>>> c[3] <- "2016-01-27 09:15:20 <Jane Doe> Hey "
>>>> c[4] <- "2016-01-27 09:15:22 <John Doe> ended a video chat"
>>>> c[5] <- "2016-01-27 21:07:11 <Jane Doe> started a video chat"
>>>> c[6] <- "2016-01-27 21:26:57 <John Doe> ended a video chat"
>>>>
>>>> # regex ^(year) (time) <(word word)>\\s*(string)$
>>>> patt <- "^([0-9-]{10}) ([0-9:]{8}) <(\\w+ \\w+)>\\s*(.+)$"
>>>> proto <- data.frame(date = character(),
>>>>                     time = character(),
>>>>                     name = character(),
>>>>                     text = character(),
>>>>                     stringsAsFactors = FALSE)
>>>> d <- strcapture(patt, c, proto)
>>>>
>>>>         date     time     name                               text
>>>> 1 2016-01-27 09:14:40 Jane Doe               started a video chat
>>>> 2 2016-01-27 09:15:20 Jane Doe https://lh3.googleusercontent.com/
>>>> 3 2016-01-27 09:15:20 Jane Doe                                Hey
>>>> 4 2016-01-27 09:15:22 John Doe                 ended a video chat
>>>> 5 2016-01-27 21:07:11 Jane Doe               started a video chat
>>>> 6 2016-01-27 21:26:57 John Doe                 ended a video chat
>>>>
>>>> B.
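As a runnable version of that name-validation check, with a small invented vector standing in for the transcript:

```r
# Invented sample lines in the normalized "<name>" format
c <- c("2016-01-27 09:14:40 <Jane Doe> started a video chat",
       "2016-01-27 09:15:22 <John Doe> ended a video chat",
       "2016-01-27 21:07:11 <Jane Doe> started a video chat")

patt <- " <.+?> "            # " <any string, not greedy> "
m <- regexpr(patt, c)        # position/length of the first match per line
unique(regmatches(c, m))     # the distinct name tags actually present
```

Scanning unique(regmatches(...)) is a quick way to spot names that are not "two words": anything with hyphens, periods, or extra blanks shows up immediately.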
>>>> >>>> >>>>> On 2019-05-18, at 18:32, Michael Boulineau >>>>> <michael.p.boulin...@gmail.com> wrote: >>>>> >>>>> Going back and thinking through what Boris and William were saying >>>>> (also Ivan), I tried this: >>>>> >>>>> a <- readLines ("hangouts-conversation-6.csv.txt") >>>>> b <- "^([0-9-]{10} [0-9:]{8} )[*]{3} (\\w+ \\w+)" >>>>> c <- gsub(b, "\\1<\\2> ", a) >>>>>> head (c) >>>>> [1] "2016-01-27 09:14:40 *** Jane Doe started a video chat" >>>>> [2] "2016-01-27 09:15:20 <Jane Doe> >>>>> https://lh3.googleusercontent.com/-_WQF5kRcnpk/Vqj7J4aK1jI/AAAAAAAAAVA/GVqutPqbSuo/s0/be8ded30-87a6-4e80-bdfa-83ed51591dbf" >>>>> [3] "2016-01-27 09:15:20 <Jane Doe> Hey " >>>>> [4] "2016-01-27 09:15:22 <John Doe> ended a video chat" >>>>> [5] "2016-01-27 21:07:11 <Jane Doe> started a video chat" >>>>> [6] "2016-01-27 21:26:57 <John Doe> ended a video chat" >>>>> >>>>> The  is still there, since I forgot to do what Ivan had suggested, >>>>> namely, >>>>> >>>>> a <- readLines(con <- file("hangouts-conversation-6.csv.txt", encoding >>>>> = "UTF-8")); close(con); rm(con) >>>>> >>>>> But then the new code is still turning out only NAs when I apply >>>>> strcapture (). This was what happened next: >>>>> >>>>>> d <- strcapture("^([[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2} >>>>> + [[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2}) +(<[^>]*>) *(.*$)", >>>>> + c, proto=data.frame(stringsAsFactors=FALSE, When="", >>>>> Who="", >>>>> + What="")) >>>>>> head (d) >>>>> When Who What >>>>> 1 <NA> <NA> <NA> >>>>> 2 <NA> <NA> <NA> >>>>> 3 <NA> <NA> <NA> >>>>> 4 <NA> <NA> <NA> >>>>> 5 <NA> <NA> <NA> >>>>> 6 <NA> <NA> <NA> >>>>> >>>>> I've been reading up on regular expressions, too, so this code seems >>>>> spot on. What's going wrong? >>>>> >>>>> Michael >>>>> >>>>> On Fri, May 17, 2019 at 4:28 PM Boris Steipe <boris.ste...@utoronto.ca> >>>>> wrote: >>>>>> >>>>>> Don't start putting in extra commas and then reading this as csv. That >>>>>> approach is broken. 
>>>>>> The correct approach is what Bill outlined: read everything with
>>>>>> readLines(), and then use a proper regular expression with
>>>>>> strcapture().
>>>>>>
>>>>>> You need to pre-process the object that readLines() gives you: replace
>>>>>> the contents of the videochat lines, and make them conform to the
>>>>>> format of the other lines before you process them into your data frame.
>>>>>>
>>>>>> Approximately something like
>>>>>>
>>>>>> # read the raw data
>>>>>> tmp <- readLines("hangouts-conversation-6.csv.txt")
>>>>>>
>>>>>> # process all video chat lines
>>>>>> patt <- "^([0-9-]{10} [0-9:]{8} )[*]{3} (\\w+ \\w+) " # (year time )*** (word word)
>>>>>> tmp <- gsub(patt, "\\1<\\2> ", tmp)
>>>>>>
>>>>>> # next, use strcapture()
>>>>>>
>>>>>> Note that this makes the assumption that your names are always exactly
>>>>>> two words containing only letters. If that assumption is not true, more
>>>>>> thought needs to go into the regex. But you can test that:
>>>>>>
>>>>>> patt <- " <\\w+ \\w+> " # " <word word> "
>>>>>> sum( ! grepl(patt, tmp))
>>>>>>
>>>>>> ... will give the number of lines that remain in your file that do not
>>>>>> have a tag that can be interpreted as "Who".
>>>>>>
>>>>>> Once that is fine, use Bill's approach - or a regular expression of
>>>>>> your own design - to create your data frame.
>>>>>>
>>>>>> Hope this helps,
>>>>>> Boris
>>>>>>
>>>>>>> On 2019-05-17, at 16:18, Michael Boulineau <michael.p.boulin...@gmail.com> wrote:
>>>>>>>
>>>>>>> Very interesting. I'm sure I'll be trying to get rid of the byte order
>>>>>>> mark eventually. But right now, I'm more worried about getting the
>>>>>>> character vector into either a csv file or data.frame; that way, I can
>>>>>>> work with the data neatly tabulated into four columns: date, time,
>>>>>>> person, comment. I assume it's a write.csv function, but I don't know
>>>>>>> what arguments to put in it. header=FALSE? fill=T?
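Putting the normalize-then-check steps together on invented sample lines (not the real file), assuming names really are exactly two words:

```r
# Invented sample: one "***" video chat line, two regular lines
tmp <- c("2016-01-27 09:14:40 *** Jane Doe started a video chat",
         "2016-01-27 09:15:20 <Jane Doe> Hey ",
         "2016-01-27 09:15:22 <John Doe> ended a video chat")

# normalize the video chat lines to the "<name>" format
patt <- "^([0-9-]{10} [0-9:]{8} )[*]{3} (\\w+ \\w+) "   # (year time )*** (word word)
tmp <- gsub(patt, "\\1<\\2> ", tmp)

# every line should now carry a " <word word> " tag
sum(!grepl(" <\\w+ \\w+> ", tmp))   # 0: no unmatched lines remain
```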
>>>>>>> Michael
>>>>>>>
>>>>>>> On Fri, May 17, 2019 at 1:03 PM Jeff Newmiller <jdnew...@dcn.davis.ca.us> wrote:
>>>>>>>>
>>>>>>>> If byte order mark is the issue then you can specify the file
>>>>>>>> encoding as "UTF-8-BOM" and it won't show up in your data any more.
>>>>>>>>
>>>>>>>> On May 17, 2019 12:12:17 PM PDT, William Dunlap via R-help <r-help@r-project.org> wrote:
>>>>>>>>> The pattern I gave worked for the lines that you originally showed
>>>>>>>>> from the data file ('a'), before you put commas into them. If the
>>>>>>>>> name is either of the form "<name>" or "***" then the "(<[^>]*>)"
>>>>>>>>> needs to be changed to something like "(<[^>]*>|[*]{3})".
>>>>>>>>>
>>>>>>>>> The " " at the start of the imported data may come from the byte
>>>>>>>>> order mark that Windows apps like to put at the front of a text
>>>>>>>>> file in UTF-8 or UTF-16 format.
>>>>>>>>>
>>>>>>>>> Bill Dunlap
>>>>>>>>> TIBCO Software
>>>>>>>>> wdunlap tibco.com
>>>>>>>>>
>>>>>>>>> On Fri, May 17, 2019 at 11:53 AM Michael Boulineau <michael.p.boulin...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> This seemed to work:
>>>>>>>>>>
>>>>>>>>>>> a <- readLines ("hangouts-conversation-6.csv.txt")
>>>>>>>>>>> b <- sub("^(.{10}) (.{8}) (<.+>) (.+$)", "\\1,\\2,\\3,\\4", a)
>>>>>>>>>>> b [1:84]
>>>>>>>>>>
>>>>>>>>>> And the first 85 lines look like this:
>>>>>>>>>>
>>>>>>>>>> [83] "2016-06-28 21:02:28 *** Jane Doe started a video chat"
>>>>>>>>>> [84] "2016-06-28 21:12:43 *** John Doe ended a video chat"
>>>>>>>>>>
>>>>>>>>>> Then they transition to the commas:
>>>>>>>>>>
>>>>>>>>>>> b [84:100]
>>>>>>>>>> [1] "2016-06-28 21:12:43 *** John Doe ended a video chat"
>>>>>>>>>> [2] "2016-07-01,02:50:35,<John Doe>,hey"
>>>>>>>>>> [3] "2016-07-01,02:51:26,<John Doe>,waiting for plane to Edinburgh"
>>>>>>>>>> [4] "2016-07-01,02:51:45,<John Doe>,thinking about my boo"
>>>>>>>>>>
>>>>>>>>>> Even the
strange bit on line 6347 was caught by this: >>>>>>>>>> >>>>>>>>>>> b [6346:6348] >>>>>>>>>> [1] "2016-10-21,10:56:29,<John Doe>,John_Doe" >>>>>>>>>> [2] "2016-10-21,10:56:37,<John Doe>,Admit#8242" >>>>>>>>>> [3] "2016-10-21,11:00:13,<Jane Doe>,Okay so you have a discussion" >>>>>>>>>> >>>>>>>>>> Perhaps most awesomely, the code catches spaces that are interposed >>>>>>>>>> into the comment itself: >>>>>>>>>> >>>>>>>>>>> b [4] >>>>>>>>>> [1] "2016-01-27,09:15:20,<Jane Doe>,Hey " >>>>>>>>>>> b [85] >>>>>>>>>> [1] "2016-07-01,02:50:35,<John Doe>,hey" >>>>>>>>>> >>>>>>>>>> Notice whether there is a space after the "hey" or not. >>>>>>>>>> >>>>>>>>>> These are the first two lines: >>>>>>>>>> >>>>>>>>>> [1] "2016-01-27 09:14:40 *** Jane Doe started a video chat" >>>>>>>>>> [2] "2016-01-27,09:15:20,<Jane >>>>>>>>>> Doe>, >>>>>>>>>> >>>>>>>>> https://lh3.googleusercontent.com/-_WQF5kRcnpk/Vqj7J4aK1jI/AAAAAAAAAVA/GVqutPqbSuo/s0/be8ded30-87a6-4e80-bdfa-83ed51591dbf >>>>>>>>>> " >>>>>>>>>> >>>>>>>>>> So, who knows what happened with the  at the beginning of [1] >>>>>>>>>> directly above. But notice how there are no commas in [1] but there >>>>>>>>>> appear in [2]. I don't see why really long ones like [2] directly >>>>>>>>>> above would be a problem, were they to be translated into a csv or >>>>>>>>>> data frame column. >>>>>>>>>> >>>>>>>>>> Now, with the commas in there, couldn't we write this into a csv or a >>>>>>>>>> data.frame? Some of this data will end up being garbage, I imagine. >>>>>>>>>> Like in [2] directly above. Or with [83] and [84] at the top of this >>>>>>>>>> discussion post/email. Embarrassingly, I've been trying to convert >>>>>>>>>> this into a data.frame or csv but I can't manage to. I've been using >>>>>>>>>> the write.csv function, but I don't think I've been getting the >>>>>>>>>> arguments correct. 
>>>>>>>>>> >>>>>>>>>> At the end of the day, I would like a data.frame and/or csv with the >>>>>>>>>> following four columns: date, time, person, comment. >>>>>>>>>> >>>>>>>>>> I tried this, too: >>>>>>>>>> >>>>>>>>>>> c <- strcapture("^([[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2} >>>>>>>>>> + [[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2}) +(<[^>]*>) *(.*$)", >>>>>>>>>> + a, proto=data.frame(stringsAsFactors=FALSE, >>>>>>>>> When="", >>>>>>>>>> Who="", >>>>>>>>>> + What="")) >>>>>>>>>> >>>>>>>>>> But all I got was this: >>>>>>>>>> >>>>>>>>>>> c [1:100, ] >>>>>>>>>> When Who What >>>>>>>>>> 1 <NA> <NA> <NA> >>>>>>>>>> 2 <NA> <NA> <NA> >>>>>>>>>> 3 <NA> <NA> <NA> >>>>>>>>>> 4 <NA> <NA> <NA> >>>>>>>>>> 5 <NA> <NA> <NA> >>>>>>>>>> 6 <NA> <NA> <NA> >>>>>>>>>> >>>>>>>>>> It seems to have caught nothing. >>>>>>>>>> >>>>>>>>>>> unique (c) >>>>>>>>>> When Who What >>>>>>>>>> 1 <NA> <NA> <NA> >>>>>>>>>> >>>>>>>>>> But I like that it converted into columns. That's a really great >>>>>>>>>> format. With a little tweaking, it'd be a great code for this data >>>>>>>>>> set. >>>>>>>>>> >>>>>>>>>> Michael >>>>>>>>>> >>>>>>>>>> On Fri, May 17, 2019 at 8:20 AM William Dunlap via R-help >>>>>>>>>> <r-help@r-project.org> wrote: >>>>>>>>>>> >>>>>>>>>>> Consider using readLines() and strcapture() for reading such a >>>>>>>>> file. 
>>>>>>>>>> E.g., >>>>>>>>>>> suppose readLines(files) produced a character vector like >>>>>>>>>>> >>>>>>>>>>> x <- c("2016-10-21 10:35:36 <Jane Doe> What's your login", >>>>>>>>>>> "2016-10-21 10:56:29 <John Doe> John_Doe", >>>>>>>>>>> "2016-10-21 10:56:37 <John Doe> Admit#8242", >>>>>>>>>>> "October 23, 1819 12:34 <Jane Eyre> I am not an angel") >>>>>>>>>>> >>>>>>>>>>> Then you can make a data.frame with columns When, Who, and What by >>>>>>>>>>> supplying a pattern containing three parenthesized capture >>>>>>>>> expressions: >>>>>>>>>>>> z <- strcapture("^([[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2} >>>>>>>>>>> [[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2}) +(<[^>]*>) *(.*$)", >>>>>>>>>>> x, proto=data.frame(stringsAsFactors=FALSE, When="", >>>>>>>>> Who="", >>>>>>>>>>> What="")) >>>>>>>>>>>> str(z) >>>>>>>>>>> 'data.frame': 4 obs. of 3 variables: >>>>>>>>>>> $ When: chr "2016-10-21 10:35:36" "2016-10-21 10:56:29" >>>>>>>>> "2016-10-21 >>>>>>>>>>> 10:56:37" NA >>>>>>>>>>> $ Who : chr "<Jane Doe>" "<John Doe>" "<John Doe>" NA >>>>>>>>>>> $ What: chr "What's your login" "John_Doe" "Admit#8242" NA >>>>>>>>>>> >>>>>>>>>>> Lines that don't match the pattern result in NA's - you might make >>>>>>>>> a >>>>>>>>>> second >>>>>>>>>>> pass over the corresponding elements of x with a new pattern. >>>>>>>>>>> >>>>>>>>>>> You can convert the When column from character to time with >>>>>>>>> as.POSIXct(). >>>>>>>>>>> >>>>>>>>>>> Bill Dunlap >>>>>>>>>>> TIBCO Software >>>>>>>>>>> wdunlap tibco.com >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Thu, May 16, 2019 at 8:30 PM David Winsemius >>>>>>>>> <dwinsem...@comcast.net> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 5/16/19 3:53 PM, Michael Boulineau wrote: >>>>>>>>>>>>> OK. 
So, I named the object test and then checked the 6347th item:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> test <- readLines ("hangouts-conversation.txt")
>>>>>>>>>>>>>> test [6347]
>>>>>>>>>>>>> [1] "2016-10-21 10:56:37 <John Doe> Admit#8242"
>>>>>>>>>>>>>
>>>>>>>>>>>>> Perhaps where it was getting screwed up is: since the end of
>>>>>>>>>>>>> this is a number (8242), then, given that there's no space
>>>>>>>>>>>>> between the number and what ought to be the next row, R didn't
>>>>>>>>>>>>> know where to draw the line. Sure enough, it looks like this
>>>>>>>>>>>>> when I go to the original file and control-F "#8242":
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2016-10-21 10:35:36 <Jane Doe> What's your login
>>>>>>>>>>>>> 2016-10-21 10:56:29 <John Doe> John_Doe
>>>>>>>>>>>>> 2016-10-21 10:56:37 <John Doe> Admit#8242
>>>>>>>>>>>>
>>>>>>>>>>>> An octothorpe is an end-of-line signifier and is interpreted as
>>>>>>>>>>>> allowing comments. You can prevent that interpretation with a
>>>>>>>>>>>> suitable choice of parameters to `read.table` or `read.csv`. I
>>>>>>>>>>>> don't understand why that should cause any error or a failure to
>>>>>>>>>>>> match that pattern.
>>>>>>>>>>>>
>>>>>>>>>>>>> 2016-10-21 11:00:13 <Jane Doe> Okay so you have a discussion
>>>>>>>>>>>>>
>>>>>>>>>>>>> Again, it doesn't look like that in the file. Gmail
>>>>>>>>>>>>> automatically formats it like that when I paste it in. More to
>>>>>>>>>>>>> the point, it looks like
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2016-10-21 10:35:36 <Jane Doe> What's your login2016-10-21 10:56:29
>>>>>>>>>>>>> <John Doe> John_Doe2016-10-21 10:56:37 <John Doe> Admit#82422016-10-21
>>>>>>>>>>>>> 11:00:13 <Jane Doe> Okay so you have a discussion
>>>>>>>>>>>>>
>>>>>>>>>>>>> Notice Admit#82422016. So there's that.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Then I built object test2.
>>>>>>>>>>>>> >>>>>>>>>>>>> test2 <- sub("^(.{10}) (.{8}) (<.+>) (.+$)", "//1,//2,//3,//4", >>>>>>>>> test) >>>>>>>>>>>>> >>>>>>>>>>>>> This worked for 84 lines, then this happened. >>>>>>>>>>>> >>>>>>>>>>>> It may have done something but as you later discovered my first >>>>>>>>> code >>>>>>>>>> for >>>>>>>>>>>> the pattern was incorrect. I had tested it (and pasted in the >>>>>>>>> results >>>>>>>>>> of >>>>>>>>>>>> the test) . The way to refer to a capture class is with >>>>>>>>> back-slashes >>>>>>>>>>>> before the numbers, not forward-slashes. Try this: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> newvec <- sub("^(.{10}) (.{8}) (<.+>) (.+$)", >>>>>>>>> "\\1,\\2,\\3,\\4", >>>>>>>>>> chrvec) >>>>>>>>>>>>> newvec >>>>>>>>>>>> [1] "2016-07-01,02:50:35,<john>,hey" >>>>>>>>>>>> [2] "2016-07-01,02:51:26,<jane>,waiting for plane to Edinburgh" >>>>>>>>>>>> [3] "2016-07-01,02:51:45,<john>,thinking about my boo" >>>>>>>>>>>> [4] "2016-07-01,02:52:07,<jane>,nothing crappy has happened, >>>>>>>>> not >>>>>>>>>> really" >>>>>>>>>>>> [5] "2016-07-01,02:52:20,<john>,plane went by pretty fast, >>>>>>>>> didn't >>>>>>>>>> sleep" >>>>>>>>>>>> [6] "2016-07-01,02:54:08,<jane>,no idea what time it is or >>>>>>>>> where I am >>>>>>>>>>>> really" >>>>>>>>>>>> [7] "2016-07-01,02:54:17,<john>,just know it's london" >>>>>>>>>>>> [8] "2016-07-01,02:56:44,<jane>,you are probably asleep" >>>>>>>>>>>> [9] "2016-07-01,02:58:45,<jane>,I hope fish was fishy in a good >>>>>>>>> eay" >>>>>>>>>>>> [10] "2016-07-01 02:58:56 <jone>" >>>>>>>>>>>> [11] "2016-07-01 02:59:34 <jane>" >>>>>>>>>>>> [12] "2016-07-01,03:02:48,<john>,British security is a little >>>>>>>>> more >>>>>>>>>>>> rigorous..." >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I made note of the fact that the 10th and 11th lines had no >>>>>>>>> commas. >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> test2 [84] >>>>>>>>>>>>> [1] "2016-06-28 21:12:43 *** John Doe ended a video chat" >>>>>>>>>>>> >>>>>>>>>>>> That line didn't have any "<" so wasn't matched. 
>>>>>>>>>>>>
>>>>>>>>>>>> You could remove all non-matching lines for the pattern
>>>>>>>>>>>>
>>>>>>>>>>>> dates<space>times<space>"<"<name>">"<space><anything>
>>>>>>>>>>>>
>>>>>>>>>>>> with:
>>>>>>>>>>>>
>>>>>>>>>>>> chrvec <- chrvec[ grepl("^.{10} .{8} <.+> .+$", chrvec)]
>>>>>>>>>>>>
>>>>>>>>>>>> Do read:
>>>>>>>>>>>>
>>>>>>>>>>>> ?read.csv
>>>>>>>>>>>>
>>>>>>>>>>>> ?regex
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>>>> test2 [85]
>>>>>>>>>>>>> [1] "//1,//2,//3,//4"
>>>>>>>>>>>>>> test [85]
>>>>>>>>>>>>> [1] "2016-07-01 02:50:35 <John Doe> hey"
>>>>>>>>>>>>>
>>>>>>>>>>>>> Notice how I toggled back and forth between test and test2
>>>>>>>>>>>>> there. So, whatever happened with the regex, it happened in the
>>>>>>>>>>>>> switch from 84 to 85, I guess. It went on like
>>>>>>>>>>>>>
>>>>>>>>>>>>> [990] "//1,//2,//3,//4"
>>>>>>>>>>>>> [991] "//1,//2,//3,//4"
>>>>>>>>>>>>> [992] "//1,//2,//3,//4"
>>>>>>>>>>>>> [993] "//1,//2,//3,//4"
>>>>>>>>>>>>> [994] "//1,//2,//3,//4"
>>>>>>>>>>>>> [995] "//1,//2,//3,//4"
>>>>>>>>>>>>> [996] "//1,//2,//3,//4"
>>>>>>>>>>>>> [997] "//1,//2,//3,//4"
>>>>>>>>>>>>> [998] "//1,//2,//3,//4"
>>>>>>>>>>>>> [999] "//1,//2,//3,//4"
>>>>>>>>>>>>> [1000] "//1,//2,//3,//4"
>>>>>>>>>>>>>
>>>>>>>>>>>>> up until line 1000, then I reached max.print.
>>>>>>>>>>>>
>>>>>>>>>>>>> Michael
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, May 16, 2019 at 1:05 PM David Winsemius <dwinsem...@comcast.net> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 5/16/19 12:30 PM, Michael Boulineau wrote:
>>>>>>>>>>>>>>> Thanks for this tip on etiquette, David. I will be sure and
>>>>>>>>>>>>>>> not do that again.
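For reference, that filter run on invented sample lines (one well-formed line, one "***" line that lacks the <name> tag):

```r
# Invented sample lines
chrvec <- c("2016-07-01 02:50:35 <john> hey",
            "2016-06-28 21:12:43 *** John Doe ended a video chat")

# keep only lines shaped as: date<sp>time<sp><name><sp>text
chrvec <- chrvec[grepl("^.{10} .{8} <.+> .+$", chrvec)]
chrvec   # only the "<john>" line survives; the "***" line has no "<...>"
```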
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I tried the read.fwf from the foreign package, with a code >>>>>>>>> like >>>>>>>>>> this: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> d <- read.fwf("hangouts-conversation.txt", >>>>>>>>>>>>>>> widths= c(10,10,20,40), >>>>>>>>>>>>>>> >>>>>>>>> col.names=c("date","time","person","comment"), >>>>>>>>>>>>>>> strip.white=TRUE) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> But it threw this error: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Error in scan(file = file, what = what, sep = sep, quote = >>>>>>>>> quote, >>>>>>>>>> dec >>>>>>>>>>>> = dec, : >>>>>>>>>>>>>>> line 6347 did not have 4 elements >>>>>>>>>>>>>> >>>>>>>>>>>>>> So what does line 6347 look like? (Use `readLines` and print >>>>>>>>> it >>>>>>>>>> out.) >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Interestingly, though, the error only happened when I >>>>>>>>> increased the >>>>>>>>>>>>>>> width size. But I had to increase the size, or else I >>>>>>>>> couldn't >>>>>>>>>> "see" >>>>>>>>>>>>>>> anything. The comment was so small that nothing was being >>>>>>>>>> captured by >>>>>>>>>>>>>>> the size of the column. so to speak. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> It seems like what's throwing me is that there's no comma >>>>>>>>> that >>>>>>>>>>>>>>> demarcates the end of the text proper. For example: >>>>>>>>>>>>>> Not sure why you thought there should be a comma. Lines >>>>>>>>> usually end >>>>>>>>>>>>>> with <cr> and or a <lf>. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Once you have the raw text in a character vector from >>>>>>>>> `readLines` >>>>>>>>>> named, >>>>>>>>>>>>>> say, 'chrvec', then you could selectively substitute commas >>>>>>>>> for >>>>>>>>>> spaces >>>>>>>>>>>>>> with regex. (Now that you no longer desire to remove the dates >>>>>>>>> and >>>>>>>>>>>> times.) >>>>>>>>>>>>>> >>>>>>>>>>>>>> sub("^(.{10}) (.{8}) (<.+>) (.+$)", "//1,//2,//3,//4", chrvec) >>>>>>>>>>>>>> >>>>>>>>>>>>>> This will not do any replacements when the pattern is not >>>>>>>>> matched. 
>>>>>>>>>> See >>>>>>>>>>>>>> this test: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> newvec <- sub("^(.{10}) (.{8}) (<.+>) (.+$)", >>>>>>>>> "\\1,\\2,\\3,\\4", >>>>>>>>>>>> chrvec) >>>>>>>>>>>>>>> newvec >>>>>>>>>>>>>> [1] "2016-07-01,02:50:35,<john>,hey" >>>>>>>>>>>>>> [2] "2016-07-01,02:51:26,<jane>,waiting for plane to >>>>>>>>> Edinburgh" >>>>>>>>>>>>>> [3] "2016-07-01,02:51:45,<john>,thinking about my boo" >>>>>>>>>>>>>> [4] "2016-07-01,02:52:07,<jane>,nothing crappy has >>>>>>>>> happened, not >>>>>>>>>>>> really" >>>>>>>>>>>>>> [5] "2016-07-01,02:52:20,<john>,plane went by pretty fast, >>>>>>>>> didn't >>>>>>>>>>>> sleep" >>>>>>>>>>>>>> [6] "2016-07-01,02:54:08,<jane>,no idea what time it is or >>>>>>>>> where >>>>>>>>>> I am >>>>>>>>>>>>>> really" >>>>>>>>>>>>>> [7] "2016-07-01,02:54:17,<john>,just know it's london" >>>>>>>>>>>>>> [8] "2016-07-01,02:56:44,<jane>,you are probably asleep" >>>>>>>>>>>>>> [9] "2016-07-01,02:58:45,<jane>,I hope fish was fishy in a >>>>>>>>> good >>>>>>>>>> eay" >>>>>>>>>>>>>> [10] "2016-07-01 02:58:56 <jone>" >>>>>>>>>>>>>> [11] "2016-07-01 02:59:34 <jane>" >>>>>>>>>>>>>> [12] "2016-07-01,03:02:48,<john>,British security is a little >>>>>>>>> more >>>>>>>>>>>>>> rigorous..." >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> You should probably remove the "empty comment" lines. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> >>>>>>>>>>>>>> David. >>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2016-07-01 15:34:30 <John Doe> Lame. We were in a >>>>>>>>>> starbucks2016-07-01 >>>>>>>>>>>>>>> 15:35:02 <Jane Doe> Hmm that's interesting2016-07-01 15:35:09 >>>>>>>>> <Jane >>>>>>>>>>>>>>> Doe> You must want coffees2016-07-01 15:35:25 <John Doe> >>>>>>>>> There was >>>>>>>>>>>>>>> lots of Starbucks in my day2016-07-01 15:35:47 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> It was interesting, too, when I pasted the text into the >>>>>>>>> email, it >>>>>>>>>>>>>>> self-formatted into the way I wanted it to look. 
I had to >>>>>>>>> manually >>>>>>>>>>>>>>> make it look like it does above, since that's the way that it >>>>>>>>>> looks in >>>>>>>>>>>>>>> the txt file. I wonder if it's being organized by XML or >>>>>>>>> something. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Anyways, There's always a space between the two sideways >>>>>>>>> carrots, >>>>>>>>>> just >>>>>>>>>>>>>>> like there is right now: <John Doe> See. Space. And there's >>>>>>>>> always >>>>>>>>>> a >>>>>>>>>>>>>>> space between the data and time. Like this. 2016-07-01 >>>>>>>>> 15:34:30 >>>>>>>>>> See. >>>>>>>>>>>>>>> Space. But there's never a space between the end of the >>>>>>>>> comment and >>>>>>>>>>>>>>> the next date. Like this: We were in a starbucks2016-07-01 >>>>>>>>> 15:35:02 >>>>>>>>>>>>>>> See. starbucks and 2016 are smooshed together. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> This code is also on the table right now too. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> a <- read.table("E:/working >>>>>>>>>>>>>>> directory/-189/hangouts-conversation2.txt", quote="\"", >>>>>>>>>>>>>>> comment.char="", fill=TRUE) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>> h<-cbind(hangouts.conversation2[,1:2],hangouts.conversation2[,3:5],hangouts.conversation2[,6:9]) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> aa<-gsub("[^[:digit:]]","",h) >>>>>>>>>>>>>>> my.data.num <- as.numeric(str_extract(h, "[0-9]+")) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Those last lines are a work in progress. I wish I could >>>>>>>>> import a >>>>>>>>>>>>>>> picture of what it looks like when it's translated into a >>>>>>>>> data >>>>>>>>>> frame. >>>>>>>>>>>>>>> The fill=TRUE helped to get the data in table that kind of >>>>>>>>> sort of >>>>>>>>>>>>>>> works, but the comments keep bleeding into the data and time >>>>>>>>>> column. 
>>>>>>>>>>>>>>> It's like >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2016-07-01 15:59:17 <Jane Doe> Seriously I've never been >>>>>>>>>>>>>>> over there >>>>>>>>>>>>>>> 2016-07-01 15:59:27 <Jane Doe> It confuses me :( >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> And then, maybe, the "seriously" will be in a column all to >>>>>>>>>> itself, as >>>>>>>>>>>>>>> will be the "I've'"and the "never" etc. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I will use a regular expression if I have to, but it would be >>>>>>>>> nice >>>>>>>>>> to >>>>>>>>>>>>>>> keep the dates and times on there. Originally, I thought they >>>>>>>>> were >>>>>>>>>>>>>>> meaningless, but I've since changed my mind on that count. >>>>>>>>> The >>>>>>>>>> time of >>>>>>>>>>>>>>> day isn't so important. But, especially since, say, Gmail >>>>>>>>> itself >>>>>>>>>> knows >>>>>>>>>>>>>>> how to quickly recognize what it is, I know it can be done. I >>>>>>>>> know >>>>>>>>>>>>>>> this data has structure to it. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Michael >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Wed, May 15, 2019 at 8:47 PM David Winsemius < >>>>>>>>>>>> dwinsem...@comcast.net> wrote: >>>>>>>>>>>>>>>> On 5/15/19 4:07 PM, Michael Boulineau wrote: >>>>>>>>>>>>>>>>> I have a wild and crazy text file, the head of which looks >>>>>>>>> like >>>>>>>>>> this: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 2016-07-01 02:50:35 <john> hey >>>>>>>>>>>>>>>>> 2016-07-01 02:51:26 <jane> waiting for plane to Edinburgh >>>>>>>>>>>>>>>>> 2016-07-01 02:51:45 <john> thinking about my boo >>>>>>>>>>>>>>>>> 2016-07-01 02:52:07 <jane> nothing crappy has happened, not >>>>>>>>>> really >>>>>>>>>>>>>>>>> 2016-07-01 02:52:20 <john> plane went by pretty fast, >>>>>>>>> didn't >>>>>>>>>> sleep >>>>>>>>>>>>>>>>> 2016-07-01 02:54:08 <jane> no idea what time it is or where >>>>>>>>> I am >>>>>>>>>>>> really >>>>>>>>>>>>>>>>> 2016-07-01 02:54:17 <john> just know it's london >>>>>>>>>>>>>>>>> 2016-07-01 02:56:44 <jane> you are probably asleep >>>>>>>>>>>>>>>>> 
>>>> 2016-07-01 02:58:45 <jane> I hope fish was fishy in a good eay
>>>> 2016-07-01 02:58:56 <jone>
>>>> 2016-07-01 02:59:34 <jane>
>>>> 2016-07-01 03:02:48 <john> British security is a little more rigorous...
>>>
>>> Looks entirely not-"crazy". Typical log file format.
>>>
>>> Two possibilities: 1) use `read.fwf` (in pkg utils); 2) use regex (i.e.
>>> the sub function) to strip everything up to the "<". Read `?regex`.
>>> Since "<" is not a metacharacter, you could use the pattern ".+<" and
>>> replace with "".
>>>
>>> And do read the Posting Guide. Cross-posting to StackOverflow and
>>> R-help, at least within hours of each other, is considered poor manners.
>>>
>>> --
>>>
>>> David.
>>>
>>>> It goes on for a while. It's a big file. But I feel like it's going to
>>>> be difficult to annotate with the coreNLP library or package. I'm
>>>> doing natural language processing.
>>>> In other words, I'm curious as to how I would shave off the dates,
>>>> that is, to make it look like:
>>>>
>>>> <john> hey
>>>> <jane> waiting for plane to Edinburgh
>>>> <john> thinking about my boo
>>>> <jane> nothing crappy has happened, not really
>>>> <john> plane went by pretty fast, didn't sleep
>>>> <jane> no idea what time it is or where I am really
>>>> <john> just know it's london
>>>> <jane> you are probably asleep
>>>> <jane> I hope fish was fishy in a good eay
>>>> <jone>
>>>> <jane>
>>>> <john> British security is a little more rigorous...
>>>>
>>>> To be clear, then, I'm trying to clean a large text file by writing a
>>>> regular expression, such that I create a new object with no numbers or
>>>> dates.
>>>>
>>>> Michael
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
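[Editorial note: David's sub() suggestion produces exactly this shape. A minimal sketch on made-up sample lines: replacing with "<" rather than "" keeps the opening bracket, and the anchored pattern "^[^<]*<" (a variant of David's ".+<") stops at the first "<" on the line, so a "<" occurring inside a message is left alone.]

```r
# Sketch of the sub() approach suggested above: delete everything up to
# the "<" that opens the speaker's name.  Replacing with "<" (not "")
# puts the bracket back; "^[^<]*<" matches only up to the first "<".
# 'x' is made-up sample data, not the real file.
x <- c("2016-07-01 02:50:35 <john> hey",
       "2016-07-01 02:51:26 <jane> waiting for plane to Edinburgh")
sub("^[^<]*<", "<", x)
# -> "<john> hey"  "<jane> waiting for plane to Edinburgh"
```

The same call works on the whole file at once, since sub() is vectorized over its third argument.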
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.