Re: [Python-Dev] Is XML serialization output guaranteed to be bytewise identical forever?
On Thu, 21 Mar 2019 02:07:01 +0100 Victor Stinner wrote:
> On Mon, 18 Mar 2019 at 23:41, Raymond Hettinger wrote:
> > The code in the current 3.8 alpha differs from 3.7 in that it removes
> > attribute sorting and instead preserves the order the user specified when
> > creating an element. As far as I can tell, there is no objection to this
> > as a feature.
>
> By the way, what's the rationale of this backward incompatible change?
>
> I found this short message:
> "FWIW, this issue arose from an end-user problem. She had a hard
> requirement to show a security clearance level as the first attribute.
> We did find a work around but it was hack."
> https://bugs.python.org/issue34160#msg338098
>
> It's the first time that I hear a user asking to preserve attribute
> insertion order (or did I miss a previous request?). Technically, it
> was possible to implement the feature earlier using OrderedDict. So
> why do it now?
>
> Is it really worth it to break Python backward compatibility (change
> the default behavior) for everyone, if it's only needed for a few users?

The argument you're making is weird here. If only "a few users" need a
deterministic ordering of XML attributes, then compatibility is broken
only for "a few users", not for "everyone".

Most users and applications should /never/ care about the order of XML
attributes.

Regards

Antoine.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Reproducible output (Re: Is XML serialization output guaranteed to be bytewise identical forever?)
On Thu, 21 Mar 2019 01:46:14 +0100 Victor Stinner wrote:
>
> Getting the same output on Python 3.7 and Python 3.8 also matters
> for https://reproducible-builds.org/

If you want reproducible output, you should settle on a well-known
version of Python. You don't expect two different versions of gcc to
produce the exact same binary. Even compression utilities (such as gzip,
xz...) can get improvements over time that change the binary output, for
example making it smaller.

Regards

Antoine.
Re: [Python-Dev] Is XML serialization output guaranteed to be bytewise identical forever?
On Thu, Mar 21, 2019 at 11:33 AM Antoine Pitrou wrote:
> [...]
>
> Most users and applications should /never/ care about the order of XML
> attributes.
>
> Regards
>
> Antoine

Especially as the standards specifically say that ordering has no semantic
impact.

Byte-by-byte comparison of XML is almost always inappropriate.
Re: [Python-Dev] Is XML serialization output guaranteed to be bytewise identical forever?
On Thu, 21 Mar 2019 at 17:05, Steve Holden wrote:
>
> On Thu, Mar 21, 2019 at 11:33 AM Antoine Pitrou wrote:
>>
>> [...]
>>
>> Most users and applications should /never/ care about the order of XML
>> attributes.
>>
>> Regards
>>
>> Antoine
>
> Especially as the standards specifically say that ordering has no semantic
> impact.
>
> Byte-by-byte comparison of XML is almost always inappropriate.

Conversely, if ordering has no semantic impact, there's no real
justification for asking for the current order to be changed.

In practice, allowing the user to control the ordering (by preserving
input order) gives users a way of handling (according to the standard)
broken consumers who ascribe semantic meaning to the attribute order.
So there's a small benefit for real-world users having to deal with
non-compliant software. But that benefit is by definition small, as
standards-compliant software won't be affected.

The cost of making the change to projects that rely on the current
output is significant, and that should be considered. But there's also
the question of setting a precedent. If we do reject this change
because of the cost to 3rd parties, are we then committing Python to
guaranteeing sorted attribute order (and worse, byte-for-byte
reproducible output) for ever - a far stronger commitment than the
standards require of us? That seems to me to be an extremely bad
precedent to set.

There's no good answer here - maybe a possible compromise would be for
us to document explicitly in 3.8 that output is only guaranteed
identical to the level the standards require (i.e., attribute order is
not guaranteed to be preserved) and then make this change in 3.9. But
in practice, that's not really any better for projects like coverage -
it just delays the point when they have to bite the bullet (and it's
not like 3.8 is imminent - there's plenty of time between now and 3.8
without adding an additional delay).

Reluctantly, I think I'd have to say that I don't think we should
reject this change simply because existing users rely on the exact
output currently being produced.

To mitigate the impact on 3rd parties, it would be very helpful if we
could add to the stdlib some form of "compare two XML documents for
semantic equality up to the level that the standards require". 3rd
party code could then use that if it's present, and fall back to
byte-equality if it's not. If we could get something like that for
3.9, but not for 3.8, then that would seem to me to be a good reason
to defer this change until 3.9 (because we don't want to have 3.8
being an exception where there's no semantic comparison function, but
the byte-equality fallback doesn't work - that's just needlessly
annoying).

Paul
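
[Editorial sketch] To make the "semantic equality" idea above concrete, here is a rough illustration of what such a comparison could look like. It is not an existing stdlib function: the names `elements_equal` and `xml_equal` are hypothetical, and stripping surrounding whitespace from text is an assumption made here, not something the XML standards mandate. It only shows that attribute order can be ignored while element order is kept significant.

```python
import xml.etree.ElementTree as ET


def _norm(text):
    # Assumption: leading/trailing whitespace is treated as insignificant.
    return (text or "").strip()


def elements_equal(a, b):
    """Compare two elements, ignoring attribute order (hypothetical helper)."""
    # attrib is a dict, and dict equality does not depend on key order.
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if _norm(a.text) != _norm(b.text) or _norm(a.tail) != _norm(b.tail):
        return False
    # Child order is significant, so children are compared pairwise.
    if len(a) != len(b):
        return False
    return all(elements_equal(x, y) for x, y in zip(a, b))


def xml_equal(doc1, doc2):
    """Parse two XML strings and compare them structurally."""
    return elements_equal(ET.fromstring(doc1), ET.fromstring(doc2))
```

A test suite could call `xml_equal(expected, actual)` instead of comparing serialized bytes. A real implementation would also have to decide how to treat comments, processing instructions and namespace prefixes, which this sketch leaves out.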
Re: [Python-Dev] Is XML serialization output guaranteed to be bytewise identical forever?
On Thu, Mar 21, 2019, 1:05 PM Steve Holden wrote:
> On Thu, Mar 21, 2019 at 11:33 AM Antoine Pitrou wrote:
>
>> [...]
>>
>> Most users and applications should /never/ care about the order of XML
>> attributes.
>>
>> Regards
>>
>> Antoine
>
> Especially as the standards specifically say that ordering has no semantic
> impact.

When you have a lot of attributes, though, sometimes having them in a
particular defined order can make it easier to reason about and make sense
of the code when manually reviewing it.
Re: [Python-Dev] Is XML serialization output guaranteed to be bytewise identical forever?
Victor Stinner wrote on 21.03.19 at 01:22:
> Alternatives have been proposed like a recipe to sort node attributes
> before serialization, but honestly, it's way too complex.

Hm, really? Five lines of simple and obvious Python code, that provide a
fast and completely Python-version agnostic solution to the problem that a
few users have, are "way too complex"? That sounds a bit extreme to me.

> I don't want
> to have to copy such recipe to every project. Add a new function,
> import it, use it where XML is written into a file, etc. Taken alone,
> maybe it's acceptable. But please remember that some companies are
> still porting their large Python 2 code base to Python 3. This new
> backward incompatible change gets on top of the pile of other backward
> incompatible changes between 2.7 and 3.8.
>
> I would prefer to be able to "just add" sort=True. Don't forget that
> tests like "if sys.version_info >= (3, 8):" will be needed which makes the
> overall fix more complicated.

Yes, exactly! Users would have to add that option *conditionally* to their
code somewhere. Personally, I really dislike having to say "if Python
version is X do this, otherwise, do that". I prefer a solution that just
works. There are at least four approaches that generally work across Python
releases: ignoring the ordering, using C14N, creating attributes in order,
sorting attributes before serialisation. I'd prefer if users picked one of
those, preferably the right one for their use case, rather than starting to
put version specific kludges into their code.

Stefan
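
[Editorial sketch] The "sorting attributes before serialisation" approach mentioned above could look roughly like the following. This is one plausible reading of the recipe under discussion, not necessarily the exact code the posters had in mind, and the name `sort_attributes` is chosen here for illustration. It relies only on `Element.iter()` and on `attrib` being a regular dict, so no version check is needed: releases that sort on output are unaffected, and releases that preserve insertion order emit the sorted order.

```python
import xml.etree.ElementTree as ET


def sort_attributes(root):
    """Re-insert every element's attributes in sorted key order, in place."""
    for el in root.iter():
        attrib = el.attrib
        if len(attrib) > 1:
            items = sorted(attrib.items())
            attrib.clear()
            attrib.update(items)


# Small demonstration with attributes deliberately created out of order.
root = ET.Element("root", {"b": "2", "a": "1"})
ET.SubElement(root, "child", {"z": "26", "m": "13"})
sort_attributes(root)
serialized = ET.tostring(root, encoding="unicode")
```

Calling `sort_attributes(root)` once, just before the tree is written out, is enough; the tree is modified in place.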
Re: [Python-Dev] Remove tempfile.mktemp()
On 3/20/19, Greg Ewing wrote:
> Antoine Pitrou wrote:
>
>> How is it more secure than using mktemp()?
>
> It's not, but it solves the problem someone suggested of another
> program not being able to access and/or delete the file.

NamedTemporaryFile(delete=False) is more secure than naive use of
mktemp(). The file is created exclusively (O_EXCL). Another standard
user can't overwrite it. Nor can another standard user delete it if
it's created in the default temp directory (e.g. POSIX "/tmp" has the
sticky bit set). mkstemp() is similar but lacks the convenience and
reliable resource management of a Python file wrapper.

There's still the problem of accidental name collisions with other
processes that can access the file, i.e. processes running as the same
user or, in POSIX, processes running as the super user. I saw a
suggestion in this thread to increase the length of the random
sequence from 8 characters up to 22 characters in order to make this
problem extremely improbable.
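
[Editorial sketch] A minimal illustration of the NamedTemporaryFile(delete=False) pattern described above, in place of mktemp(). The helper name `write_scratch_file` is hypothetical. The point is that the file already exists, created exclusively, by the time its name is known, so there is no window between picking a name and creating the file.

```python
import os
import tempfile


def write_scratch_file(data: bytes) -> str:
    """Create a temp file atomically, write data, and return its path.

    The caller is responsible for removing the file, because
    delete=False keeps it on disk after the wrapper is closed.
    """
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(data)
        return f.name


# Usage: the path can be handed to another program, then cleaned up.
path = write_scratch_file(b"hello")
try:
    with open(path, "rb") as f:
        content = f.read()
finally:
    os.remove(path)
```

The explicit `os.remove()` in a `finally` block is the price of `delete=False`: cleanup becomes the caller's job.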
Re: [Python-Dev] Is XML serialization output guaranteed to be bytewise identical forever?
On 3/21/2019 1:23 PM, Paul Moore wrote:
> On Thu, 21 Mar 2019 at 17:05, Steve Holden wrote:
>> Especially as the standards specifically say that ordering has no semantic
>> impact.
>>
>> Byte-by-byte comparison of XML is almost always inappropriate.
>
> Conversely, if ordering has no semantic impact, there's no real
> justification for asking for the current order to be changed.
>
> In practice, allowing the user to control the ordering (by preserving
> input order) gives users a way of handling (according to the standard)
> broken consumers who ascribe semantic meaning to the attribute order.

Or, as Jonathan Goble said elsewhere, use an order that makes whatever
sense to the author and other readers. The order of positional
parameter names in a function definition has no semantic meaning to
python, but it would be terrible to make them be sorted.

--
Terry Jan Reedy
[Python-Dev] Removing PendingDeprecationWarning
Hi, all.

I'm thinking about removing PendingDeprecationWarning.
(previous discussion:
https://discuss.python.org/t/pendingdeprecationwarning-is-really-useful/1038)

It was added as a "not printed by default" version of DeprecationWarning.
But DeprecationWarning is not printed by default now.

We use PendingDeprecationWarning for the N-2 release, and change it to
DeprecationWarning for the N-1 release. But this workflow no longer seems
worth the effort.

I want to stop using PendingDeprecationWarning for new deprecations.

More aggressively, I want to remove the PendingDeprecationWarning class and
keep `PendingDeprecationWarning = DeprecationWarning` as an alias for
backward compatibility.

What do you think? May I do it in Python 3.8?

--
Inada Naoki
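
[Editorial sketch] To illustrate what the proposed alias would mean for code that still uses the old name, here is a small simulation. It does not touch the real builtin: `PendingDeprecationAlias` and `old_api` are hypothetical names used only for this example. With an alias in place, a warning raised under the old name is simply a DeprecationWarning.

```python
import warnings

# Hypothetical stand-in for the proposed alias; the real builtin
# PendingDeprecationWarning is left untouched here.
PendingDeprecationAlias = DeprecationWarning


def old_api():
    """A function that still warns using the 'pending' name."""
    warnings.warn("old_api() is deprecated", PendingDeprecationAlias,
                  stacklevel=2)
    return 42


# Record warnings instead of printing them, so the category can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_api()

categories = [w.category for w in caught]
```

Existing code that references the old name would keep working, and filters written against DeprecationWarning would also match warnings issued under it.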