Greetings Chris,

> Well, that would be something I'd want to discuss here. As I'm not
> sure if I actually ~want~ to match the API of the re module.

If this feature is considered a good addition for the standard
library, integrating it into re would be an interesting option. But
given what you say above, I'm not sure if *you* want to make it a
part of re itself.

[...]
> IMO If you don't bother to name a group then you probably aren't going
> to be interested in it anyway - so why keeping a reference to it?

That's not true. There's a lot of code out there that makes genuine
use of unnamed groups. The (?: ) syntax is what you use when the
group's content is not needed.

> If you only wanted to extract the numbers from those verses...
>
> >>> regex='^(((?P<number>\d+) ([^,]+))(, )?)*$'
> >>> pat2=re2.compile(regex)
> >>> x=pat2.extract(buf)
> >>> x
> {'number': ['12', '11', '10']}
>
> Before the compression stage the _Match object actually looked like this:
>
> {'_group0': {'_value': '12 drummers drumming, 11 pipers piping, 10
> lords [...]
> '10'}}]}}
>
> But the compression algorithm collected the named groups and brought
> them to the surface, to return the much nicer looking:
>
> {'number': ['12', '11', '10']}

I confess I didn't think about how that could be cleanly implemented,
but both outputs you present above look inadequate in my opinion.
Regular expressions already have a widely adopted meaning. If we're
going to introduce new features, we should try to do that without
breaking the well-known meanings they currently have.

> > I find the feature very interesting, but being used to live without it,
> > I have difficulty evaluating its usefulness.
>
> Yes - this is a good point too, because it ~is~ different from the re
> library. re2 aims to do all that searching, grouping, iterating and
> collecting and constructing work for you.
[...]
> Actually, I ~would~ like to limit it to just named groups.
> I reckon, if you're not going to bother naming a group, then why would
> you have any interest in it.
> I guess its up for discussion how confusing this "new" way of thinking
> could be and what drawbacks it might have.

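For reference, here is a minimal sketch of the "widely adopted meaning"
at stake, using only the standard re module and the pattern quoted
above. The sample string buf is made up for illustration (the original
input is elided in the quote), and nothing here uses re2.

```python
import re

# Hypothetical sample input; the original buf is not shown in full.
buf = "12 drummers drumming, 11 pipers piping, 10 lords leaping"

# The pattern quoted above, unchanged.
pattern = re.compile(r'^(((?P<number>\d+) ([^,]+))(, )?)*$')
m = pattern.match(buf)

# With standard re semantics, a group inside a repetition keeps only
# the text captured by its last iteration.
last_number = m.group('number')

# Unnamed groups capture too, and are reachable by position.
last_verse = m.group(4)

# (?: ) groups without capturing, so only one group is reported here.
only_captured = re.match(r'(?:\d+) (\w+)', buf).groups()

# Collecting every number today takes a separate call such as findall().
all_numbers = re.findall(r'(\d+) [^,]+', buf)

print(last_number, last_verse, only_captured, all_numbers)
```

Under these semantics an unnamed group is still a capture; limiting
results to named groups would change what existing patterns mean.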
Your target does seem to be a new kind of regular expression. In that
case, I'm not sure "re2" is the right name for it, given that you
haven't written an improved SRE, but a completely new kind of regular
expression matching which depends on SRE itself rather than extending
it in a compatible way.

While I would like to see *some* kind of successive matching
implemented in SRE (besides the Scanner, which is already available),
I'm not in favor of that specific implementation. I'm open to
discussing that further.

-- 
Gustavo Niemeyer
http://niemeyer.net