On Sun, 23 Apr 2017, T Lee Davidson wrote:
> According to http://gambaswiki.org/wiki/doc/pcre , using "*?" in a regular
> expression should lazily match 0 or more characters. However, it appears to
> act greedily.
>
> I am trying to do some very simple HTML tag stripping with
> 'Regex.Replace(sText, "<.*?>", "")', and it takes out way more than just the
> tags.
>
> Have I misunderstood the documentation?

I believe you are correct. I get the same greedy behaviour from "<.*?>".
The Gambas wiki page seems to be copied from the libpcre documentation [1],
and the entry under QUANTIFIERS,

  *?    0 or more, lazy

hardly leaves room for misinterpretation.

If you are interested in a workaround, I just tried the following line:

  RegExp.Replace("<tag abc=\"xyz\">content</tag>", "<.*>", "", RegExp.Ungreedy)

which correctly delivers "content".

If no one else does it, I can (try to remember to) have a look at gb.pcre
this evening.

Regards,
Tobi

[1] http://www.pcre.org/current/doc/html/pcre2syntax.html

-- 
"There's an old saying: Don't change anything... ever!" -- Mr. Monk
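
P.S. For comparison, here is a minimal sketch of the behaviour one would expect from the greedy "<.*>" and the lazy "<.*?>" on the same test string. It uses Python's re module rather than gb.pcre, on the assumption that both follow the same Perl-style quantifier rules, so it shows the documented semantics and says nothing about where gb.pcre goes wrong. The variable names and the third pattern, "<[^>]*>", are illustrative additions and not taken from the wiki page; the negated character class should strip simple tags regardless of how the greediness option is handled.

```python
import re

# Same test string as in the RegExp.Replace line above.
subject = '<tag abc="xyz">content</tag>'

# Greedy: ".*" runs to the last ">", so the whole string is one match.
greedy = re.sub(r"<.*>", "", subject)

# Lazy: ".*?" stops at the first ">", so each tag is matched on its own.
lazy = re.sub(r"<.*?>", "", subject)

# Negated class: "[^>]*" cannot cross a ">", so greediness does not matter.
negated = re.sub(r"<[^>]*>", "", subject)

print(repr(greedy))
print(repr(lazy))
print(repr(negated))
```

If "<.*?>" in gb.pcre gives the first result instead of the second, the lazy quantifier is being treated as greedy, which matches what was reported.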