A little note on quoting in regular expressions. I find writing \\. when I want a quoted . somewhat confusing, so I would use the pattern "_w_.*[.]csv$".
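To make that concrete, here is a small sketch showing that [.] and \\. select the same files, while an unquoted . is looser. The last file name, ending in "-csv", is made up for illustration and is not one of the files from the question; the variable names are likewise only for this example.

```r
# Subset of the file names from the question, plus one hypothetical
# name (ending in "-csv") added only to show what an unquoted dot does.
files <- c("BU-072_1_E1_RE_SEC-01_local_a_0.2_0.2.csv",
           "BU-072_1_E1_RE_SEC-01_local_w_0.2_0.2.csv",
           "BU-072_1_E1_RE_SEC-01_local_w_0.2_0.2.xls",
           "BU-072_1_E1_RE_SEC-01_local_w_0.2_0.2-csv")  # hypothetical

# Two ways of quoting the dot, and one pattern with the dot left unquoted
bracket   <- grep("_w_.*[.]csv$", files, value = TRUE)
backslash <- grep("_w_.*\\.csv$", files, value = TRUE)
unquoted  <- grep("_w_.*.csv$",   files, value = TRUE)

identical(bracket, backslash)  # both quoting styles pick the same files
setdiff(unquoted, bracket)     # names accepted only because "." matches any character
```

The bracket form needs no doubling of backslashes inside an R string, which is the main reason I find it easier to read.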
Better still, if you want to match file names, there is a function glob2rx that converts shell ("glob") patterns into regular expression patterns. Thus

> grep(glob2rx("*_w_*.csv"), myfiles, value=TRUE)
[1] "BU-072_1_E1_RE_SEC-01_local_w_0.2_0.2.csv"
[2] "BU-072_1_E1_RE_SEC-01_local_w_0.2_0.6.csv"
[3] "BU-072_1_E1_RE_SEC-01_local_w_0.4_1.0.csv"
[4] "BU-072_1_E1_RE_SEC-01_local_w_1.0_0.2.csv"
[5] "BU-072_1_E1_RE_SEC-01_local_w_1.0_0.6.csv"
[6] "BU-072_1_E1_RE_SEC-01_local_w_1.0_1.0.csv"

So the simplest way to get what you want is

CSVs <- list.files(path=..., pattern=glob2rx("*_w_*.csv"))

In fact ?list.files mentions glob2rx.

On Tue, 17 Sep 2019 at 18:49, Ivan Calandra <calan...@rgzm.de> wrote:

> Dear useRs,
>
> I still have problems using regular expressions. I have two problems for
> which I have found workarounds, but I'm sure there are better ways of
> doing it.
>
> 1) list CSV files with "_w_" in the name
>
> Here is a sample of the files in the folder:
> myfiles <- c("BU-072_1_E1_RE_SEC-01_local_a_0.2_0.2.csv",
>              "BU-072_1_E1_RE_SEC-01_local_a_0.2_0.6.csv",
>              "BU-072_1_E1_RE_SEC-01_local_a_0.4_1.0.csv",
>              "BU-072_1_E1_RE_SEC-01_local_a_1.0_0.2.csv",
>              "BU-072_1_E1_RE_SEC-01_local_a_1.0_0.6.csv",
>              "BU-072_1_E1_RE_SEC-01_local_w_0.2_0.2.csv",
>              "BU-072_1_E1_RE_SEC-01_local_w_0.2_0.6.csv",
>              "BU-072_1_E1_RE_SEC-01_local_w_0.4_1.0.csv",
>              "BU-072_1_E1_RE_SEC-01_local_w_1.0_0.2.csv",
>              "BU-072_1_E1_RE_SEC-01_local_w_1.0_0.6.csv",
>              "BU-072_1_E1_RE_SEC-01_local_w_1.0_1.0.csv",
>              "BU-072_1_E1_RE_SEC-01_local_a_0.2_0.2.xls",
>              "BU-072_1_E1_RE_SEC-01_local_a_0.2_0.6.xls",
>              "BU-072_1_E1_RE_SEC-01_local_a_0.4_1.0.xls",
>              "BU-072_1_E1_RE_SEC-01_local_a_1.0_0.2.xls",
>              "BU-072_1_E1_RE_SEC-01_local_a_1.0_0.6.xls",
>              "BU-072_1_E1_RE_SEC-01_local_w_0.2_0.2.xls",
>              "BU-072_1_E1_RE_SEC-01_local_w_0.2_0.6.xls",
>              "BU-072_1_E1_RE_SEC-01_local_w_0.4_1.0.xls",
>              "BU-072_1_E1_RE_SEC-01_local_w_1.0_0.2.xls",
>              "BU-072_1_E1_RE_SEC-01_local_w_1.0_0.6.xls",
>              "BU-072_1_E1_RE_SEC-01_local_w_1.0_1.0.xls")
>
> Here is what I did:
> CSVs <- list.files(path=..., pattern="\\.csv$")
> w.files <- CSVs[grep(pattern="_w_", CSVs)]
>
> Of course, what I would like to do is list only the interesting files
> from the beginning, rather than subsetting the whole list of files. In
> other words, having a pattern that includes both "\\.csv$" and "_w_" in
> the list.files() call. I tried "_w_&\\.csv$" but it returns an empty
> vector.
>
> 2) The units of the variables are given in the original headers. I would
> like to extract the units. This is what I did:
> headers <- c("dist to origin on curve [mm]", "segment on section [mm]",
>              "angle 1 [degree]", "angle 2 [degree]", "angle 3 [degree]")
> units.var <- gsub(pattern="^.*\\[|\\]$", "", headers)
>
> It seems to me to be overly complicated using gsub(). Isn't there a way to
> extract what is interesting rather than deleting what is not?
>
> Thank you for your help!
> Best,
> Ivan
>
> --
> Dr. Ivan Calandra
> TraCEr, laboratory for Traceology and Controlled Experiments
> MONREPOS Archaeological Research Centre and
> Museum for Human Behavioural Evolution
> Schloss Monrepos
> 56567 Neuwied, Germany
> +49 (0) 2631 9772-243
> https://www.researchgate.net/profile/Ivan_Calandra
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.