On Tue, Aug 23, 2011 at 18:56, Victor Stinner <victor.stin...@haypocalc.com> wrote:
>> kind=0 is used and public, it's PyUnicode_WCHAR_KIND. Is it still
>> necessary? It looks to be only used in PyUnicode_DecodeUnicodeEscape().
>
> If it can be removed, it would be nice to have kind in [0; 2] instead of kind
> in [1; 2], to be able to have a list (of 3 items) => callback or label.

It is also used in PyUnicode_DecodeUTF8Stateful(), and there might be some
cases where I missed converting checks for 0 when I introduced the macro.
The question was more whether this should be written as 0 or as a named
constant. I preferred the named constant for readability.

An alternative would be to make the kind values equal to the number of bytes
per character in the string representation, so they would be 0 (wstr),
1 (1-byte), 2 (2-byte), or 4 (4-byte).

I think the value for wstr/uninitialized/reserved should not be removed. The
wstr representation is still used in the error case in the UTF-8 decoder
because these strings can be resized. Also, having one designated value for
"uninitialized" limits comparisons in the affected functions to the kind
value; otherwise they would need to check the str field for NULL to determine
which buffer to write a character to.

> I suppose that compilers prefer a switch with all cases defined, 0 a first item
> and contiguous values. We may need an enum.

During the Summer of Code, Martin and I did an experiment with GCC, and it did
not seem to produce a jump table as an optimization for three cases but
generated comparison instructions anyway. I am not sure how much we should
optimize for potential compiler optimizations here.

Regards,
Torsten
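
To illustrate the alternative numbering discussed above (kind equal to the
number of bytes per character, with 0 reserved for wstr/uninitialized), here
is a minimal sketch in C. The names sketch_kind, sketch_write, and
sketch_buffer_size are hypothetical and are not the CPython identifiers;
returning -1 for the reserved kind is also just an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kind values (not the actual CPython constants): each kind
   equals the number of bytes per character; 0 is the reserved value for
   the wstr/uninitialized case. */
enum sketch_kind {
    SKETCH_WCHAR_KIND = 0,
    SKETCH_1BYTE_KIND = 1,
    SKETCH_2BYTE_KIND = 2,
    SKETCH_4BYTE_KIND = 4
};

/* Store code point ch at index i of the buffer selected by kind.
   Returns 0 on success, or -1 when kind is the reserved value, in which
   case a caller would fall back to its wstr handling. The decision is
   made from the kind value alone; no NULL check on a data pointer. */
static int
sketch_write(enum sketch_kind kind, void *data, size_t i, uint32_t ch)
{
    switch (kind) {
    case SKETCH_1BYTE_KIND:
        ((uint8_t *)data)[i] = (uint8_t)ch;
        return 0;
    case SKETCH_2BYTE_KIND:
        ((uint16_t *)data)[i] = (uint16_t)ch;
        return 0;
    case SKETCH_4BYTE_KIND:
        ((uint32_t *)data)[i] = ch;
        return 0;
    case SKETCH_WCHAR_KIND:
    default:
        return -1;
    }
}

/* With this numbering the buffer size in bytes is simply kind * length. */
static size_t
sketch_buffer_size(enum sketch_kind kind, size_t length)
{
    return (size_t)kind * length;
}
```

The trade-off in this sketch is that the values 0, 1, 2, 4 are not contiguous,
so they cannot index a small callback table directly, while the size
computation becomes a single multiplication.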