On Sat, 27 Aug 2011 09:18:14 +0200
"Martin v. Löwis" <mar...@v.loewis.de> wrote:
> On 27.08.2011 08:33, Terry Reedy wrote:
> > On 8/26/2011 9:56 PM, Antoine Pitrou wrote:
> >
> >> Another "interesting" question is whether it's easy to port to the PEP
> >> 393 string representation, if it gets accepted.
> >
> > Will the re module need porting also?
>
> That's a quality-of-implementation issue (in both cases). In principle,
> the modules should continue to work unmodified, and indeed SRE does.
> However, the module will then match on Py_UNICODE, which may be
> expensive to produce, and may not meet your expectations of surrogate
> pair handling.
>
> So realistically, the module should be ported, which has the challenge
> that matching needs to operate on three different representations. The
> modules already support two representations (unsigned char and
> Py_UNICODE), but probably switching on type, not on state.
From what I've seen, re generates two different sets of functions at
compile time (with a stringlib-like approach), while regex has a run-time
flag to choose between the two representations (where, interestingly, the
two code paths are spelled out explicitly, almost duplicates of each other).
Matthew, please correct me if I'm wrong.

Regards

Antoine.
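
For readers unfamiliar with the two dispatch strategies discussed in this
message, here is a minimal Python sketch of the difference. It is only an
analogy, not the C code of re or regex: every name in it (encode, make_find,
FIND_BY_WIDTH, find_flag) is hypothetical, and the 1/2/4-byte widths are
assumed from the three representations that PEP 393 proposes. The first
approach builds one specialized function per representation up front; the
second keeps a single function that checks a flag at run time, with each path
written out separately.

```python
import sys

# Assumed character widths in bytes (the three PEP 393 representations).
WIDTHS = (1, 2, 4)


def encode(text, width):
    """Hypothetical helper: pack each code point into `width` bytes."""
    return b"".join(ord(c).to_bytes(width, sys.byteorder) for c in text)


def make_find(width):
    """Build a finder specialized for one width.

    Loose analogue of generating one set of functions per representation
    at compile time: the width is fixed when the function is created.
    """
    def find(buf, code_point):
        for i in range(0, len(buf), width):
            if int.from_bytes(buf[i:i + width], sys.byteorder) == code_point:
                return i // width
        return -1
    return find


# One specialized function per representation, chosen once by the caller.
FIND_BY_WIDTH = {w: make_find(w) for w in WIDTHS}


def find_flag(buf, width, code_point):
    """Single function that branches on a run-time flag.

    The three paths are deliberately near-duplicates of each other.
    """
    if width == 1:
        for i in range(len(buf)):
            if buf[i] == code_point:
                return i
        return -1
    elif width == 2:
        for i in range(0, len(buf), 2):
            if int.from_bytes(buf[i:i + 2], sys.byteorder) == code_point:
                return i // 2
        return -1
    elif width == 4:
        for i in range(0, len(buf), 4):
            if int.from_bytes(buf[i:i + 4], sys.byteorder) == code_point:
                return i // 4
        return -1
    raise ValueError("unsupported width: %r" % (width,))


# Both strategies return the same character index for the same input.
data = encode("abc\u20ac", 2)
print(FIND_BY_WIDTH[2](data, 0x20AC), find_flag(data, 2, 0x20AC))
```

The trade-off the sketch tries to show: the specialized functions keep a
single copy of the matching logic, at the cost of an extra generation step,
while the flag-based version is easy to follow but has to be kept in sync
across its duplicated paths, and a third representation adds one more path.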