Re: [R] how to separate string from numbers in a large txt file

2019-05-19 Thread Boris Steipe

Inline
> On 2019-05-19, at 18:11, Michael Boulineau wrote:
> For context:
>> In gsub(b, "\\1<\\2> ", a) the work is done by the backreferences \\1 and \\2. The expression says: Substitute ALL of the match with the first captured expression, then " <", then the second capture

Re: [R] how to separate string from numbers in a large txt file

2019-05-19 Thread Michael Boulineau

For context:
> In gsub(b, "\\1<\\2> ", a) the work is done by the backreferences \\1 and \\2. The expression says: Substitute ALL of the match with the first captured expression, then " <", then the second captured expression, then "> ". The rest of the line is not substituted and appe
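A minimal sketch of the substitution described above. The two sample lines are made up for illustration, modelled on lines quoted elsewhere in this thread; they are not the poster's file.

```r
# Hypothetical sample input: one "***" status line, one ordinary message line.
a <- c("2016-01-27 09:14:40 *** Jane Doe started a video chat",
       "2016-01-27 09:15:20 <Jane Doe> Hey")

# Group 1: date, time and the trailing space. Group 2: two words (the name).
b <- "^([0-9-]{10} [0-9:]{8} )[*]{3} (\\w+ \\w+)"

# Only the matched prefix is rewritten; the rest of each line is kept as is.
out <- gsub(b, "\\1<\\2> ", a)
print(out)
```

The second line has no "***", so the pattern does not match and gsub() returns it unchanged.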

Re: [R] how to separate string from numbers in a large txt file

2019-05-19 Thread Boris Steipe

Inline ...
> On 2019-05-19, at 13:56, Michael Boulineau wrote:
>> b <- "^([0-9-]{10} [0-9:]{8} )[*]{3} (\\w+ \\w+)"
> so the ^ signals that the regex BEGINS with a number (that could be any number, 0-9) that is only 10 characters long (then there's the dash in there, too, with the 0-

Re: [R] how to separate string from numbers in a large txt file

2019-05-19 Thread Michael Boulineau

> b <- "^([0-9-]{10} [0-9:]{8} )[*]{3} (\\w+ \\w+)"
so the ^ signals that the regex BEGINS with a number (that could be any number, 0-9) that is only 10 characters long (then there's the dash in there, too, with the 0-9-, which I assume enabled the regex to grab the - that's between the numbers in
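One way to check a reading like this is to test each piece of the pattern on its own. A small sketch with made-up test strings: ^ only anchors the match at the start of the string, [0-9-] is a bracket expression that accepts a digit or a literal hyphen, and {10} repeats that class exactly ten times.

```r
# [0-9-]{10}: ten characters, each a digit or a hyphen (so not only valid dates).
date_ok <- grepl("^[0-9-]{10}$", c("2016-01-27", "----------", "2016-1-27"))

# [0-9:]{8}: eight characters, each a digit or a colon.
time_ok <- grepl("^[0-9:]{8}$", "09:14:40")

# [*]{3}: three literal asterisks; inside brackets * is not a quantifier.
stars_ok <- grepl("^[*]{3}$", "***")

# \\w+ \\w+: two runs of word characters separated by one space.
s <- "Jane Doe started"
name <- regmatches(s, regexpr("\\w+ \\w+", s))

print(date_ok)
print(name)
```

The bracket expression is permissive: a string of ten hyphens passes the first test, while "2016-1-27" fails only because it is nine characters long.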

Re: [R] how to separate string from numbers in a large txt file

2019-05-19 Thread Boris Steipe

Inline
> On 2019-05-18, at 20:34, Michael Boulineau wrote:
> It appears to have worked, although there were three little quirks. The ; close(con); rm(con) didn't work for me; the first row of the data.frame was all NAs, when all was said and done;
You will get NAs for lines that can'

Re: [R] how to separate string from numbers in a large txt file

2019-05-18 Thread Michael Boulineau

It appears to have worked, although there were three little quirks. The ; close(con); rm(con) didn't work for me; the first row of the data.frame was all NAs, when all was said and done; and then there were still three *** on the same line where the ï»¿ was apparently deleted.
> a <- readLines ("h

Re: [R] how to separate string from numbers in a large txt file

2019-05-18 Thread Boris Steipe

This works for me:
# sample data
c <- character()
c[1] <- "2016-01-27 09:14:40 started a video chat"
c[2] <- "2016-01-27 09:15:20 https://lh3.googleusercontent.com/"
c[3] <- "2016-01-27 09:15:20 Hey "
c[4] <- "2016-01-27 09:15:22 ended a video chat"
c[5] <- "2016-01-27 21:07:11 started a v

Re: [R] how to separate string from numbers in a large txt file

2019-05-18 Thread Michael Boulineau

Going back and thinking through what Boris and William were saying (also Ivan), I tried this:
a <- readLines ("hangouts-conversation-6.csv.txt")
b <- "^([0-9-]{10} [0-9:]{8} )[*]{3} (\\w+ \\w+)"
c <- gsub(b, "\\1<\\2> ", a)
> head (c)
[1] "ï»¿2016-01-27 09:14:40 *** Jane Doe started a video chat"

Re: [R] how to separate string from numbers in a large txt file

2019-05-17 Thread Boris Steipe

Don't start putting in extra commas and then reading this as csv. That approach is broken. The correct approach is what Bill outlined: read everything with readLines(), and then use a proper regular expression with strcapture(). You need to pre-process the object that readLines() gives you: rep
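A rough sketch of the readLines() plus strcapture() route, under stated assumptions: the vector x stands in for what readLines() would return, the names in angle brackets and the column names are hypothetical, and the pattern is only one plausible choice for lines of the form date, time, <name>, text.

```r
# Stand-in for the result of readLines(); the third line deliberately does not fit.
x <- c("2016-10-21 10:35:36 <Jane Doe> What's your login",
       "2016-10-21 10:56:29 <John Doe> John_Doe",
       "a line with no timestamp")

# Four capture groups: date, time, name inside <...>, and the remaining text.
pat <- "^([0-9-]{10}) ([0-9:]{8}) <([^>]*)> (.*)$"

# The prototype fixes the column names and types of the result.
proto <- data.frame(date = character(), time = character(),
                    person = character(), comment = character(),
                    stringsAsFactors = FALSE)

d <- strcapture(pat, x, proto)
print(d)
```

Lines that do not match the pattern come back as a row of NAs, so they are easy to find afterwards with something like which(is.na(d$date)).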

Re: [R] how to separate string from numbers in a large txt file

2019-05-17 Thread Michael Boulineau

Very interesting. I'm sure I'll be trying to get rid of the byte order mark eventually. But right now, I'm more worried about getting the character vector into either a csv file or data.frame; that way, I can be able to work with the data neatly tabulated into four columns: date, time, person, comm

Re: [R] how to separate string from numbers in a large txt file

2019-05-17 Thread Jeff Newmiller

If byte order mark is the issue then you can specify the file encoding as "UTF-8-BOM" and it won't show up in your data any more.
On May 17, 2019 12:12:17 PM PDT, William Dunlap via R-help wrote:
>The pattern I gave worked for the lines that you originally showed from the data file ('a'), bef
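A small sketch of the "UTF-8-BOM" suggestion. It writes a temporary file that starts with the three BOM bytes and reads it back through a connection with that encoding; the file and its single line are made up for the demonstration.

```r
# Build a throwaway file: UTF-8 byte order mark (EF BB BF) followed by one line.
tmp <- tempfile(fileext = ".txt")
line <- "2016-01-27 09:14:40 *** Jane Doe started a video chat"
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)), charToRaw(paste0(line, "\n"))), tmp)

# Declaring the encoding on the connection lets R drop the BOM while reading.
con <- file(tmp, encoding = "UTF-8-BOM")
a <- readLines(con)
close(con)

print(a)
unlink(tmp)
```

With the BOM removed on input, a pattern anchored with ^ can match the first line like any other.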

Re: [R] how to separate string from numbers in a large txt file

2019-05-17 Thread Ivan Krylov

On Fri, 17 May 2019 11:36:22 -0700 Michael Boulineau wrote:
> So, who knows what happened with the ï»¿ at the beginning of [1] directly above.
perl -Mutf8 -MEncode=encode,decode -Mcharnames=:full \
  -E'say charnames::viacode ord decode utf8 => encode latin1 => "ï»¿"'
# ZERO WIDTH NO-BREAK SPA

Re: [R] how to separate string from numbers in a large txt file

2019-05-17 Thread William Dunlap via R-help

The pattern I gave worked for the lines that you originally showed from the data file ('a'), before you put commas into them. If the name is either of the form "" or "***" then the "(<[^>]*>)" needs to be changed so something like "(<[^>]*>|[*]{3})". The " ï»¿" at the start of the imported data m
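To illustrate the alternation, a sketch comparing a pattern that only accepts a name in angle brackets with one that also accepts "***". The full pattern is not visible in this excerpt, so everything around the third group is an assumption, as are the two sample lines.

```r
# Hypothetical sample lines: one with a <name>, one with "***".
x <- c("2016-01-27 09:15:20 <Jane Doe> Hey",
       "2016-06-28 21:02:28 *** Jane Doe started a video chat")

# Assumed surrounding pattern; only the third group differs between the two.
pat_old <- "^([0-9-]{10}) ([0-9:]{8}) (<[^>]*>) (.*)$"
pat_new <- "^([0-9-]{10}) ([0-9:]{8}) (<[^>]*>|[*]{3}) (.*)$"

old_hits <- grepl(pat_old, x)   # the "***" line is missed
new_hits <- grepl(pat_new, x)   # both lines match

# Pull out whatever the third group captured.
who <- sub(pat_new, "\\3", x)
print(who)
```

With the alternation the "***" lines are captured rather than dropped, at the cost of leaving the person's name inside the fourth group for those lines.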

Re: [R] how to separate string from numbers in a large txt file

2019-05-17 Thread Michael Boulineau

This seemed to work:
> a <- readLines ("hangouts-conversation-6.csv.txt")
> b <- sub("^(.{10}) (.{8}) (<.+>) (.+$)", "\\1,\\2,\\3,\\4", a)
> b [1:84]
And the first 85 lines looks like this:
[83] "2016-06-28 21:02:28 *** Jane Doe started a video chat"
[84] "2016-06-28 21:12:43 *** John Doe ended

Re: [R] how to separate string from numbers in a large txt file

2019-05-17 Thread William Dunlap via R-help

Consider using readLines() and strcapture() for reading such a file. E.g., suppose readLines(files) produced a character vector like
x <- c("2016-10-21 10:35:36 What's your login",
       "2016-10-21 10:56:29 John_Doe",
       "2016-10-21 10:56:37 Admit#8242",
       "October 23, 1819

Re: [R] how to separate string from numbers in a large txt file

2019-05-16 Thread David Winsemius

On 5/16/19 3:53 PM, Michael Boulineau wrote:
OK. So, I named the object test and then checked the 6347th item
test <- readLines ("hangouts-conversation.txt")
test [6347]
[1] "2016-10-21 10:56:37 Admit#8242"
Perhaps where it was getting screwed up is, since the end of this is a number (8242)

Re: [R] how to separate string from numbers in a large txt file

2019-05-16 Thread Michael Boulineau

OK. So, I named the object test and then checked the 6347th item
> test <- readLines ("hangouts-conversation.txt")
> test [6347]
[1] "2016-10-21 10:56:37 Admit#8242"
Perhaps where it was getting screwed up is, since the end of this is a number (8242), then, given that there's no space between the

Re: [R] how to separate string from numbers in a large txt file

2019-05-16 Thread David Winsemius

On 5/16/19 12:30 PM, Michael Boulineau wrote:
Thanks for this tip on etiquette, David. I will be sure and not do that again. I tried the read.fwf from the foreign package, with a code like this:
d <- read.fwf("hangouts-conversation.txt", widths= c(10,10,20,40),

Re: [R] how to separate string from numbers in a large txt file

2019-05-16 Thread Michael Boulineau

Thanks for this tip on etiquette, David. I will be sure and not do that again. I tried the read.fwf from the foreign package, with a code like this:
d <- read.fwf("hangouts-conversation.txt", widths= c(10,10,20,40),
              col.names=c("date","time","person","comment"),

Re: [R] how to separate string from numbers in a large txt file

2019-05-15 Thread David Winsemius

On 5/15/19 4:07 PM, Michael Boulineau wrote:
I have a wild and crazy text file, the head of which looks like this:
2016-07-01 02:50:35 hey
2016-07-01 02:51:26 waiting for plane to Edinburgh
2016-07-01 02:51:45 thinking about my boo
2016-07-01 02:52:07 nothing crappy has happened, not reall

[R] how to separate string from numbers in a large txt file

2019-05-15 Thread Michael Boulineau

I have a wild and crazy text file, the head of which looks like this:
2016-07-01 02:50:35 hey
2016-07-01 02:51:26 waiting for plane to Edinburgh
2016-07-01 02:51:45 thinking about my boo
2016-07-01 02:52:07 nothing crappy has happened, not really
2016-07-01 02:52:20 plane went by pretty fast,
