?regexp ## Search the text for "backreference" (or web-search it: "regular expression backreference").
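
A minimal sketch of what "\\1" does, using the headers vector posted earlier in this thread: parentheses in the pattern capture a group, and "\\1" in the replacement refers back to whatever the first group matched. The second call, with two groups, is an added illustration and not part of the original suggestion.

```r
# Headers as posted earlier in the thread
headers <- c("dist to origin on curve [mm]", "segment on section [mm]",
             "angle 1 [degree]", "angle 2 [degree]", "angle 3 [degree]")

# (.+) captures the text between the brackets as group 1.
# "\\1" in the replacement stands for that captured text, so the
# whole matched string is replaced by the unit alone.
units.var <- sub(".*\\[(.+)\\]", "\\1", headers)
print(units.var)

# With two groups, "\\1" and "\\2" refer to them in order of their
# opening parentheses, e.g. to put the unit in front of the name:
swapped <- sub("(.*) \\[(.+)\\]", "\\2: \\1", headers[1])
print(swapped)
```

Because the pattern matches the entire string, everything outside the parentheses is discarded and only the captured unit remains.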
-- Bert

On Tue, Sep 17, 2019 at 7:52 AM Ivan Calandra <calan...@rgzm.de> wrote:

> Thank you Bert.
> That's more like what I was looking for.
>
> Could you please tell me where I can find information on the "\\1"? This
> is the part I still don't get.
>
> Ivan
>
> --
> Dr. Ivan Calandra
> TraCEr, laboratory for Traceology and Controlled Experiments
> MONREPOS Archaeological Research Centre and
> Museum for Human Behavioural Evolution
> Schloss Monrepos
> 56567 Neuwied, Germany
> +49 (0) 2631 9772-243
> https://www.researchgate.net/profile/Ivan_Calandra
>
> On 17/09/2019 16:42, Bert Gunter wrote:
>
> (For the units)
>
> Why not simply:
>
> sub(".*\\[(.+)\\]","\\1", headers)
>
> Cheers,
> Bert
>
>
> On Tue, Sep 17, 2019 at 6:40 AM Ivan Calandra <calan...@rgzm.de> wrote:
>
>> Thank you Ivan for your help!
>>
>> Your solution for the first problem is so simple I didn't even think
>> about it!
>> What I find weird is that "_w_|\\.csv$" works as expected ("OR"), but is
>> there no way to combine two patterns with an "AND"?
>>
>> Your solution to the second problem is actually unfortunately even more
>> complicated to me than the gsub() solution. But I'm glad I can learn
>> about regmatches() and regexpr()!
>>
>> Best,
>> Ivan
>>
>> On 17/09/2019 09:14, Ivan Krylov wrote:
>> > On Tue, 17 Sep 2019 08:48:43 +0200
>> > Ivan Calandra <calan...@rgzm.de> wrote:
>> >
>> >> CSVs <- list.files(path=..., pattern="\\.csv$")
>> >> w.files <- CSVs[grep(pattern="_w_", CSVs)]
>> >>
>> >> Of course, what I would like to do is list only the interesting files
>> >> from the beginning, rather than subsetting the whole list of files.
>> > One way to express that would be "_w_.*\\.csv$", meaning that the
>> > filename has to have "_w_" in it, followed by anything (any character
>> > repeated any number of times, including 0), followed by ".csv" at the
>> > end of the line.
>> >
>> >> 2) The units of the variables are given in the original headers. I
>> >> would like to extract the units. This is what I did:
>> >>
>> >> headers <- c("dist to origin on curve [mm]","segment on section [mm]",
>> >>              "angle 1 [degree]", "angle 2 [degree]","angle 3 [degree]")
>> >> units.var <- gsub(pattern="^.*\\[|\\]$", "", headers)
>> >>
>> >> It seems to me to be overly complicated using gsub(). Isn't there a way
>> >> to extract what is interesting rather than deleting what is not?
>> >
>> > Pure-R way: use regmatches() + regexpr(). Both regmatches and regexpr
>> > take the character vector as an argument, so duplication is hard to
>> > avoid:
>> >
>> > units <- regmatches(headers, regexpr('\\[.*\\]', headers))
>> >
>> > The stringr package has an str_match() function with a nicer interface:
>> > str_match(headers, '\\[.*\\]') -> units.
>> >
>> > Such "greedy" patterns containing ".*" present a few pitfalls, e.g.
>> > looking for text in parentheses using the pattern "\\(.*\\)" in
>> > "...(abc)...(def)..." will match the whole "(abc)...(def)" instead of
>> > single groups "(abc)" and "(def)", but with your examples the pattern
>> > should work as presented. One other option would be to ask for "[",
>> > followed by zero or more characters that are not "]", followed by "]":
>> > '\\[[^]]*\\]'.
>> >
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
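
The regmatches() + regexpr() approach and the greedy-pattern pitfall discussed in the quoted message can be sketched as follows. The string x is a made-up example with two bracketed groups, not part of the original data, and gregexpr() is used here (as an assumption about what one would want) to collect every match and not only the first.

```r
headers <- c("dist to origin on curve [mm]", "angle 1 [degree]")

# regexpr() locates the first match in each element; regmatches() extracts it.
# Note that the brackets themselves are part of the match.
with.brackets <- regmatches(headers, regexpr("\\[.*\\]", headers))
print(with.brackets)

# Hypothetical string containing two bracketed groups
x <- "a [mm] b [degree]"

# Greedy ".*" runs from the first "[" to the last "]"
greedy <- regmatches(x, regexpr("\\[.*\\]", x))
print(greedy)

# The negated class "[^]]*" cannot cross a "]", so each group is matched
# separately; gregexpr() returns all matches, one list element per input
negated <- regmatches(x, gregexpr("\\[[^]]*\\]", x))[[1]]
print(negated)
```

With only one bracketed unit per header, as in the posted data, both patterns give the same result; the difference only appears when a string contains more than one group.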
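
On the "AND" question raised in the thread: a single pattern such as "_w_.*\\.csv$" works when the order of the two parts is known. The sketch below tries it on a few empty files in a temporary directory; the file names and the directory name are hypothetical, chosen only for illustration. It also shows a two-step alternative with grepl() for cases where the order of the conditions is not fixed.

```r
# Create a few empty files in a temporary directory (hypothetical names)
d <- file.path(tempdir(), "regex_demo")
dir.create(d, showWarnings = FALSE)
file.create(file.path(d, c("a_w_1.csv", "a_x_1.csv", "b_w_2.txt", "b_w_2.csv")))

# "_w_" somewhere, then any characters, then ".csv" at the end of the name
w.files <- list.files(path = d, pattern = "_w_.*\\.csv$")
print(w.files)

# Two-step "AND": both conditions must hold, in any order
all.files <- list.files(path = d)
both <- all.files[grepl("_w_", all.files) & grepl("\\.csv$", all.files)]
print(both)

# Clean up the temporary files
unlink(d, recursive = TRUE)
```

The single-pattern form does the filtering inside list.files() itself, which is what the original question asked for; the grepl() form is more general but subsets after listing.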