On Tue, Aug 23, 2011 at 08:15, Antoine Pitrou <solip...@pitrou.net> wrote:
> So why would you need three separate implementation of the unrolled
> loop? You already have a macro named WRITE_FLEXIBLE_OR_WSTR.
The WRITE_FLEXIBLE_OR_WSTR macro checks the kind and then writes, so using
it on the fast path would be inefficient. A real fast path needs an outer
if that checks the kind once; each branch then accesses the string with the
matching width (1, 2, or 4 bytes) and writes 4 or 8 times (guarded by
#ifdef, depending on the platform). Because all of these cases bloated the
C code, we went for the simple solution, with the goal of profiling the
code again afterwards to see where the new performance bottlenecks are.

> Even without taking into account the unrolled loop, I wonder how much
> slower UTF-8 decoding becomes with that approach, by the way. Instead of
> testing the "kind" variable at each loop iteration, using a
> stringlib-like approach may be a better deal IMO.

To me, this feels like it would complicate the C source code and decrease
readability. For each function you would need a wrapper that does the
kind-checking logic and then, in a separate file, the implementation of
the function, which gets included three times, once for each character
width.

Regards,
Torsten
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
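To make the trade-off discussed above concrete, here is a rough sketch in C. It is not the CPython code: the enum, write_checked(), copy_ascii_simple(), copy_ascii_fast() and the DEFINE_COPY_ASCII macro are all hypothetical names invented for illustration. write_checked() only stands in for a macro that tests the kind on every write, the unroll factor is fixed at 4 rather than chosen by #ifdef, and the three width-specific loops are stamped out by a macro in one file instead of by including a separate file three times.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the three character widths. */
enum kind { KIND_1BYTE = 1, KIND_2BYTE = 2, KIND_4BYTE = 4 };

/* Check-then-write: the kind is tested on every single write. */
static void write_checked(enum kind kind, void *data, size_t i, uint32_t ch)
{
    switch (kind) {
    case KIND_1BYTE: ((uint8_t *)data)[i] = (uint8_t)ch; break;
    case KIND_2BYTE: ((uint16_t *)data)[i] = (uint16_t)ch; break;
    case KIND_4BYTE: ((uint32_t *)data)[i] = ch; break;
    default: assert(0 && "invalid kind");
    }
}

/* Simple solution: one loop, one kind test per character. */
void copy_ascii_simple(enum kind kind, void *dst,
                       const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        write_checked(kind, dst, i, src[i]);
}

/* One specialised, 4x unrolled loop per width. The macro generates the
   same body three times, once for each destination type. */
#define DEFINE_COPY_ASCII(NAME, TYPE)                                 \
    static void NAME(TYPE *dst, const unsigned char *src, size_t n)   \
    {                                                                 \
        size_t i = 0;                                                 \
        for (; i + 4 <= n; i += 4) {                                  \
            dst[i] = src[i];                                          \
            dst[i + 1] = src[i + 1];                                  \
            dst[i + 2] = src[i + 2];                                  \
            dst[i + 3] = src[i + 3];                                  \
        }                                                             \
        for (; i < n; i++) /* leftover characters */                  \
            dst[i] = src[i];                                          \
    }

DEFINE_COPY_ASCII(copy_ascii_1, uint8_t)
DEFINE_COPY_ASCII(copy_ascii_2, uint16_t)
DEFINE_COPY_ASCII(copy_ascii_4, uint32_t)

/* Wrapper: the kind is tested once, outside the loop. */
void copy_ascii_fast(enum kind kind, void *dst,
                     const unsigned char *src, size_t n)
{
    switch (kind) {
    case KIND_1BYTE: copy_ascii_1((uint8_t *)dst, src, n); break;
    case KIND_2BYTE: copy_ascii_2((uint16_t *)dst, src, n); break;
    case KIND_4BYTE: copy_ascii_4((uint32_t *)dst, src, n); break;
    default: assert(0 && "invalid kind");
    }
}
```

Both entry points produce the same output. The second one avoids the per-character branch, at the cost of three generated loop bodies plus a wrapper, which is the extra code the reply above is concerned about.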