Hi,

On Sun, 2015-06-14 at 01:08:29 +0200, Thomas Goirand wrote:
> On 06/13/2015 10:55 AM, Paul Wise wrote:
> > On Sat, Jun 13, 2015 at 4:23 PM, Thomas Goirand wrote:
> >> I've been using xz compression for a long time, but I see a big defect
> >> which is today pushing me to turn it off for the .orig.tar file. The
> >> issue is that depending on the version of xz-utils, it produces a
> >> different output.

Well if you want reproducible output, then use the same tool version.
That's the equivalent of expecting that using a different gcc version
will give you the same output. As long as the bitstream is compatible
with previous versions, I don't see it as a problem, and I'd expect
such changes to be beneficial, because say, they might allow making
the encoder faster, or compress better, etc.

> >> We use "git archive" within the PKG OpenStack team to generate this
> >> tarball (which is more or less the same as pristine-tar, except we use
> >> upstream tags rather than a pristine-tar branch). The fact that xz
> >> produces a different result makes it not reproducible. As a
> >> consequence, it is very hard for us to use this system across
> >> distributions (ie: use that in both Debian and Ubuntu, or in Sid &
> >> Jessie). We need consistency.

If you generate it once, as part of the release process, why do you
need to generate it on different systems with different versions? And
how does that have anything to do with what gets packaged in Debian?
For Debian you only need to generate it once, so why would you want to
generate it anew every time you build a new Debian revision instead of
just reusing the same tarball that is on the archive, if you don't
keep source tarball releases around?

> >> As a friend puts it:
> >>
> >> "This is a fundamental problem/defect with xz. This (and a lot of
> >> other such defects, e.g. non-robustness of xz archives that easily
> >> lead to file corruption etc) are the reason that there is lzip (and
> >> which is why gnu.org has, on a technical basis, decided that lzip is
> >> official gzip-successor for gnu software releases when they come in
> >> tarballs).

TBH this smells like FUD. For example I've never heard of corruption
in .xz files due to non-robustness, I'd expect that corruption to come
from external forces, and that integrity checks would either help
detect it or not.
In any case .xz supports CRC32, CRC64 and SHA-256 for integrity checks,
.lz only supports CRC32. Moreover lzip was created to overcome
limitations in the .lzma format, .xz came later and fixed the
limitations of the .lzma format too. (And I could probably switch
dpkg-deb's .xz integrity check to CRC64, given that's the xz-utils
command-line tool default.)

Also many GNU projects do not release lzip tarballs, but do release
bzip or xz ones and there are very few that exclusively release lzip
tarballs. If that's the equivalent of bazaar being the official GNU
VCS that most of the GNU projects do not use, well…

Actually where is the gnu.org decision documented? I see it neither in
the GCS, the “Information for Maintainers of GNU Software”, nor on the
ftp.gnu.org site. And automake still defaults to dist-gz in latest git.

  <http://www.gnu.org/prep/standards/>
  <http://www.gnu.org/prep/maintain/>

> >> So it'd be super nice to have LZIP support in dpkg, and use that
> >> instead of xz, archive wide.
> >>
> >> Your thoughts everyone? Is there any reason why we wouldn't do that?

Yes, replacing xz with lzip on .deb or .dsc packages does not make any
sense. Adding lzip support for source packages *might* make some sense,
as I pointed out in the bug report. But doing so does have a very high
cost:

  <https://wiki.debian.org/Teams/Dpkg/FAQ#Q:_Can_we_add_support_for_new_compressors_for_.dsc_packages.3F>

Whenever considering adding a new compressor, all surrounding tools
need to be modified to support it as well:

  <https://wiki.debian.org/Teams/Dpkg/DebSupport>
  <https://wiki.debian.org/Teams/Dpkg/DscSupport>

That's a non-zero amount of work and time, and that does not take into
account external tools and users. It would also not be usable until
the next stable release.

Also notice that for example there are still tools that do not support
data.tar.xz in .deb, which has been the default for a while, which
should give you an idea of what it takes.
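
Coming back to the integrity checks mentioned above: as a rough sketch
only, assuming Python's standard lzma module (which wraps liblzma;
nothing from dpkg itself is used here), the following writes the same
payload into .xz streams with each of the three check types and reads
back which check the stream declares. The helper name xz_roundtrip and
the payload are made up for illustration.

```python
import lzma


def xz_roundtrip(data: bytes, check: int) -> tuple[bytes, int]:
    """Compress data into an .xz stream using the given integrity check,
    decompress it again, and return (restored data, check id declared
    by the stream)."""
    compressed = lzma.compress(data, format=lzma.FORMAT_XZ, check=check)
    decomp = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
    restored = decomp.decompress(compressed)
    return restored, decomp.check


# Stand-in payload; a real test would feed an actual tarball.
payload = b"orig.tar contents stand-in\n" * 64

results = {}
for name, check in [
    ("crc32", lzma.CHECK_CRC32),
    ("crc64", lzma.CHECK_CRC64),
    ("sha256", lzma.CHECK_SHA256),
]:
    # Some liblzma builds may lack a check type, so ask first.
    if lzma.is_check_supported(check):
        results[name] = xz_roundtrip(payload, check)

for name, (restored, declared) in results.items():
    print(name, "check id:", declared, "payload intact:", restored == payload)
```

This only shows that the check type is a property of the .xz container
chosen at compression time; it says nothing about what dpkg-deb
currently writes.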

Adding a new compressor that does not bring any significant benefit in
compression ratio, speed or container format, that is neither widely
used nor widely available on many systems, just for the benefit of
very few packages that might be releasing as well in other formats, or
that can be easily recompressed, still does not seem worth it, no.
I've yet to see an actual convincing argument why this would be worth
the effort and trouble.

Not to mention that I was the first to consider .lz when we evaluated
adding .xz support in dpkg back in 2009.

  <https://lists.debian.org/debian-dpkg/2009/10/msg00029.html>

> > It was already rejected by the dpkg maintainers twice.
> >
> > https://bugs.debian.org/600094
> > https://bugs.debian.org/556960
>
> Reading these bugs, am I right that the archive already supports lzip
> for the orig.tar file? Because that's my issue: I don't really mind if
> we use xz for the compression of the .deb files, but I need consistency
> when generating the orig.tar.

Nothing in the .deb/.dsc tooling supports lzip AFAIK. The archive does
not even support the .lzma format.

> Now, regarding the fact that the maintainer closed the bugs, I see 2
> issues the way he did it.

First, that was a bug report from *2009/2010*. I think I was clear in
my mail that I was open to reconsidering if things changed in the
future.

> 1/ First, he sites the fact that lzip isn't popular enough as the only
> reason (did I miss another point of argumentation?). Well, it's
> backed-up by the GNU project as the successor of gzip, and also, I
> believe Debian is influential enough so that we may not have to care
> about it. Also, a wise technical choice of this kind shouldn't be driven
> by a popularity contest.

No, that's the summary that Antonio wrote.
It's not the only reason I gave in that mail, but it's a significant
one, given its implications (see the FAQ entry above):

  * There's already .xz support (as one of the lzma variants), .lzma is
    now deprecated for .deb compression.
  * I'd rather have consistency between source and binary compressors.
  * For source packages high usage might be a more important reason to
    _accept_ lzip (given that we've got an equivalent or better lzma
    format with .xz), than low usage for a _reject_ (if we didn't have
    .xz).

Compressor formats are subject to network effects like many other file
formats. In this case I think .xz "won" both because it was the
"official" successor to .lzma, and because it is superior to .lz.

Depending on the context, availability and usage (or popularity if you
will) are quite important aspects when deciding whether to support
such formats. In other cases, you really want to support more formats,
for example on a GUI archiving program, or on something like automake.
Discounting this as a simple matter of "fashion" is not helpful.

> 2/ Guillem wrote "that's at the maintainer's discretion" (ie: to close
> the bug). Well, here, the whole of Debian is depending on this kind of
> decision, so I don't agree that this decision is only at the discretion
> of the maintainer.

That was exclusively related to whether to keep a wishlist+wontfix
report open or closed. And of course the logical next step is instead
to force the issue through the ctte… while I've only seen lzip
upstream and one other person clamoring for lzip support, and no other
discussions in debian-devel over this, since 2010.

> Therefore, I'm tempted to raise this to the technical committee (putting
> their list as Cc). Does anyone see a reason why I am mistaking here?

*Sigh* and yes…

Regards,
Guillem