[ Re-trying after the previous massive quoting and line-wrap fail :-/ ]

Denis Dzyubenko wrote:
> 2011/12/9 João Abecasis <joao.abeca...@nokia.com>:
> >> inline QUuid QUuid::createFromName(const QUuid &ns, const
> >> QString &name)
> >> {
> >>     return createFromName(ns, name.toUtf8());
> >> }
> >
> > would only be updated to call the right implementations, as
> > appropriate.
>
> I like the current status of the patch very much.
>
> However I have one question - where utf8 comes from? Shouldn't it be
> defined by rfc, and if not imo we shouldn't arbitrary choose
> encodings, and maybe leave the default one in - which is utf-16 for
> QString

This is my reasoning:

1) As you mention, the RFC doesn't specify encodings. In fact, it says
   the owner of a namespace is free to decide how it should be used.
   For this reason it's important that we support QByteArray as the
   canonical form and let users make conscious decisions.

2) In Qt, strings of text are represented as QString, so it would be
   nice to support QString-based names. This is the reason for adding
   those overloads as convenience API, but it doesn't tell us how
   QString-based names should be translated to "a canonical sequence of
   octets" (quoting the standard).

3) The point of name-based UUIDs is that you can regenerate the UUIDs
   knowing only the namespace UUID and a particular name. If you use
   the QByteArray version, it's up to you to ensure this. When using
   the QString version, Qt needs to ensure it for you. This excludes
   locale- and system-dependent conversions like toLocal8Bit(). It also
   excludes straightforward utf16(), as its byte layout depends on
   endianness and thus on the platform.

4) UTF-8 is a good candidate because it is one possible "canonical
   sequence of octets". But it's mostly that, a good candidate.

So, there isn't a reason why it *has* to be UTF-8, but I haven't seen
better alternatives. Other alternatives are toAscii() or toLatin1(),
but they're lossy encodings. Network-byte-order UTF-16?...

Anyway, one use case mentioned in the standard makes this convenience
approach very nice:

    QUrl url;
    // ...

    // NameSpace_DNS from RFC4122
    // {6ba7b810-9dad-11d1-80b4-00c04fd430c8}
    QUuid nsDns(0x6ba7b810, 0x9dad, 0x11d1,
                0x80, 0xb4, 0x00, 0xc0, 0x4f, 0xd4, 0x30, 0xc8);

    QUuid uuidForUrl = QUuid::createFromName(nsDns, url.toString());

With the added benefit that in that use case it interoperates with
Python.

("And what does Python do?", you ask. Well, it avoids the decision
altogether and bails out on unicode strings. It only accepts byte
strings:

    $ python
    Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49)
    [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import uuid
    >>> uuid.NAMESPACE_DNS
    UUID('6ba7b810-9dad-11d1-80b4-00c04fd430c8')
    >>> uuid.uuid3(uuid.NAMESPACE_DNS, "www.widgets.com")
    UUID('3d813cbb-47fb-32ba-91df-831e1593ac29')
    >>> uuid.uuid3(uuid.NAMESPACE_DNS, u"www.widgets.com")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/uuid.py", line 512, in uuid3
        hash = md5(namespace.bytes + name).digest()
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xa7 in position 1: ordinal not in range(128)
)

What do others think?

Cheers,


João

_______________________________________________
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development
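
Addendum, for illustration only: a minimal Python 3 sketch of the
encoding argument in point 3). It hashes the same name under three byte
encodings and shows that only a fixed, platform-independent encoding
gives a reproducible UUID. The helper name name_based_uuid_v3 is made
up for this example; it is not part of Qt or of Python's uuid module,
and it is not the code from the patch under discussion. It follows the
MD5-based recipe visible in the traceback above (namespace bytes
followed by the name bytes).

```python
import hashlib
import uuid


def name_based_uuid_v3(namespace: uuid.UUID, name: bytes) -> uuid.UUID:
    """Hypothetical helper: MD5 over namespace bytes + name bytes,
    then stamp the version 3 and RFC 4122 variant bits."""
    digest = hashlib.md5(namespace.bytes + name).digest()
    return uuid.UUID(bytes=digest, version=3)


name = "www.widgets.com"

# The same text as three different "sequences of octets".
candidates = {
    "utf-8": name.encode("utf-8"),
    "utf-16-le": name.encode("utf-16-le"),  # layout on little-endian hosts
    "utf-16-be": name.encode("utf-16-be"),  # layout on big-endian hosts
}

results = {
    label: name_based_uuid_v3(uuid.NAMESPACE_DNS, octets)
    for label, octets in candidates.items()
}

for label, value in results.items():
    print(label, value)
```

The UTF-8 result is the one that matches the byte-string call in the
Python session above, while the two UTF-16 byte orders each give a
different UUID. That is the reason a raw utf16() dump would not be
reproducible across platforms.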