On Mon, 24 Apr 2017, Tobias Boege wrote:
> On Sun, 23 Apr 2017, T Lee Davidson wrote:
> > According to http://gambaswiki.org/wiki/doc/pcre , using "*?" in a regular
> > expression should lazily match 0 or more characters. However, it appears to
> > act greedily.
> >
> > I am trying to do some very simple HTML tag stripping with
> > 'Regex.Replace(sText, "<.*?>", "")', and it takes out way more than just the
> > tags.
> >
> > Have I misunderstood the documentation?
> >
>
> I believe you are correct. I get the same greedy behaviour from "<.*?>".
> The Gambas wiki page seems to be copied from the libpcre documentation [1]
> and the point, under QUANTIFIERS:
>
>   *?  0 or more, lazy
>
> hardly gives room for misinterpretation. I just tried the following line:
>
>   RegExp.Replace("<tag abc=\"xyz\">content</tag>", "<.*>", "",
>                  RegExp.Ungreedy)
>
> which correctly delivers "content", if you are interested in a workaround.
> If no one else does it, I can (try to remember to) try to have a look at
> gb.pcre this evening.
>

It's still before noon, but I have already seen that the RegExp.Replace()
routine always adds the RegExp.Ungreedy flag to the regular expression
automatically. With that in mind, I tried

  RegExp.Replace(sText, "<.*>", "")

and it worked ungreedily. In fact, since the compilation options are always
OR'd together, my successful call above with RegExp.Ungreedy was just an
accident: setting RegExp.Ungreedy explicitly was redundant.

The PCRE documentation [1] mentions a fact that escapes the Gambas
documentation [2]:

  PCRE2_UNGREEDY      Invert greediness of quantifiers

(The Gambas documentation reads as if the flag made everything ungreedy.)

So the greediness you get is explained: with the flag set, "*" becomes lazy
and "*?" becomes greedy. I'll add some bits to the documentation later.
Basically, RegExp.Replace() always compiles its pattern with the ungreedy
option. You can still get greedy quantifiers by writing the normally
ungreedy ones, such as "*?", in your pattern...

Regards,
Tobi

[1] http://www.pcre.org/current/doc/html/pcre2_compile.html
[2] http://gambaswiki.org/wiki/comp/gb.pcre/regexp/ungreedy

-- 
"There's an old saying: Don't change anything... ever!"
    -- Mr. Monk

_______________________________________________
Gambas-user mailing list
Gambas-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/gambas-user
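
P.S. For anyone who wants to see the two quantifier behaviours side by side,
here is a minimal sketch in Python rather than Gambas. Python's re module has
no counterpart to PCRE's ungreedy option, so this only shows the default
meaning of "<.*>" and "<.*?>". Under the ungreedy option, as in
RegExp.Replace(), the two results would presumably swap. The sample string is
the one from the quoted message.

```python
import re

# Sample input taken from the quoted message above.
text = '<tag abc="xyz">content</tag>'

# Default (greedy) "*": ".*" runs on to the LAST ">" in the string,
# so the single match covers everything and nothing is left.
greedy = re.sub(r"<.*>", "", text)

# Lazy "*?": ".*?" stops at the FIRST ">" it can, so each tag is
# matched separately and only the tags are removed.
lazy = re.sub(r"<.*?>", "", text)

print(repr(greedy))
print(repr(lazy))
```

This is not gb.pcre itself, only an illustration of what "inverting the
greediness" means for the two patterns discussed in this thread.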