Hi,

I'm considering removing from the source tree a number of third-party libraries that we have vendored over the years: zlib, libpng, libjpeg, giflib and liblerc.

All of them are widely available in most packaging environments. In that list, only zlib is required (currently either as external or internal lib).

I believe the main reason for having them vendored is now mostly historical, dating back to a time when there was no packaging system on Windows. Now that we have Conda-Forge and vcpkg, it is easy to have those dependencies installed.

For libjpeg, there was a particular history related to 12-bit JPEG support, which required using the internal copy built in a special way. But a couple of months ago, libjpeg-turbo 3.0 was released with unified support for both 8-bit and 12-bit JPEG in the same build, and the latest libtiff and GDAL releases are able to make use of it. Hence this justification no longer holds. Furthermore, GDAL's libjpeg copy is still good-ol' libjpeg 6b, without all the SIMD accelerations that are now in libjpeg-turbo, so using GDAL's internal copy of libjpeg is definitely not recommended any more.

For internal libpng, we have a small old patch to accept some invalid files ("Make screwy MSPaint "zero chunks" only a warning, not error", https://trac.osgeo.org/gdal/ticket/3416). I don't think losing that patch would be critical... At worst, we could try to get it accepted upstream.

Benefits of un-vendoring those libraries:

- currently, we must take care of updating them regularly, in particular to make sure they integrate the latest fixes for their vulnerabilities.

- they complicate the GDAL build scripts and configuration. For example, drivers can't be built as plugins if they depend on one of those libraries built as internal (because the internal copy is built into libgdal but not exported, a plugin can't use its symbols, and thus the driver must itself be built into libgdal core). We also have to use tricks to rename their symbols to avoid clashes when integrating GDAL with other software that uses the corresponding external library.

- they require exceptions to static analyzers (cppcheck, coverity scan), since they don't use the same coding standards as GDAL

Looking around a bit at different open source build recipes for GDAL (Debian, Conda-Forge, vcpkg, OSGeo4W, gisinternals, rasterio-wheels), the proposed changes should have a modest impact, as those recipes already mostly use external libraries. What I've identified (I may have missed things) as requiring changes from the maintainers of those distributions to keep the same level of functionality:

- gisinternals doesn't seem to have a liblerc build

- rasterio-wheels doesn't seem to have libpng and giflib builds
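
For packagers who want a quick look at which of those libraries are already visible on a given system, here is a minimal Python sketch. It only uses the standard `ctypes.util.find_library`; the short names passed to it ("z", "png", "jpeg", "gif", "Lerc") are my assumptions about how the shared libraries are typically named and may differ per platform. A missing entry only means that this lookup didn't find the library, not that the distribution lacks a package for it.

```python
from ctypes.util import find_library

# Assumed short library names (what one would pass to the linker as -l<name>).
# They are illustrative and may differ between platforms and distributions.
CANDIDATES = {
    "zlib": "z",
    "libpng": "png",
    "libjpeg": "jpeg",
    "giflib": "gif",
    "liblerc": "Lerc",
}


def locate_external_libs(candidates=CANDIDATES):
    """Map each library to the name/path found by find_library, or None."""
    return {lib: find_library(short) for lib, short in candidates.items()}


if __name__ == "__main__":
    for lib, found in locate_external_libs().items():
        print(f"{lib:8s} -> {found or 'not found'}")
```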

As far as our code base is concerned, apart from the obvious removal of code and simplification of the build system, there would be some changes in CI configurations (for example, the Android CI build would need at least a preliminary step to cross-compile zlib).


Potential candidates that would nevertheless remain in-tree for now:

- libtiff: compulsory dependency. GDAL has been the main driver for most libtiff development over the last 10 years, and the GDAL autotest suite tortures libtiff much more than libtiff's own test suite does, hence it is quite convenient to have the capability of vendoring it. In addition, "staging codecs" (that is, codecs not yet integrated in official libtiff), currently JPEG-XL (a few years ago this was the LERC codec), can't be built against an external libtiff.

- libgeotiff: compulsory dependency. If one uses internal libtiff, one must also use internal libgeotiff, because of the symbol renaming done when using internal libtiff.

- shapelib: compulsory dependency. The default external shapelib build uses 32-bit file offsets, whereas the internal shapelib is built with 64-bit offset support (to be able to use .DBF files > 2 GB). We don't have build support for using it as an external lib.

- libjson-c: compulsory dependency. I initially put it in the list of candidates to unvendor, as it is quite widely available, but now I recall that upstream libjson-c has an issue (especially/only on Windows) with non-C locales when parsing/outputting floating point numbers, which we have patched in our internal copy by using GDAL locale-safe functions. Ideally that should be fixed upstream, but porting our changes is not immediately trivial.

- libqhull (used for the gdal_grid linear algorithm, which requires a Delaunay triangulation of the points): that one could be a candidate for unvendoring, as it is available in a number of distributions, but there is currently an issue with scipy, which bundles it without renaming the symbols. Hence, when linking GDAL against external libqhull and using GDAL + scipy, we have a clash of symbols (https://github.com/conda-forge/qgis-feedstock/issues/284#issuecomment-1356490896). When using internal libqhull, GDAL does rename its symbols, which works around this (scipy) issue.
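
To make the shapelib point above a bit more concrete, here is a back-of-the-envelope sketch in Python of why a signed 32-bit file offset stops working for large .DBF files. The header length, record length and record indices are made-up values for illustration, and the helper names are hypothetical; the only thing assumed about the .DBF layout is that fixed-length records follow the header.

```python
INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit offset (~2 GB)


def record_offset(header_len, record_len, index):
    """Byte offset of record `index` (0-based) in a fixed-length record file."""
    return header_len + index * record_len


def fits_in_int32(offset):
    """True if the offset can be represented as a signed 32-bit integer."""
    return 0 <= offset <= INT32_MAX


# Hypothetical layout: 1025-byte header, 500-byte records.
HEADER_LEN, RECORD_LEN = 1025, 500

for index in (1_000_000, 5_000_000):
    off = record_offset(HEADER_LEN, RECORD_LEN, index)
    print(index, off, fits_in_int32(off))
```

With these numbers, the second record starts beyond the 2 GB boundary, so seeking to it needs a 64-bit offset.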

Non-candidates:

- pcidsk sdk (for PCIDSK driver): doesn't seem to be packaged. We don't have build support for using it as external lib.

- libopencad (used by CAD driver): doesn't seem to be packaged

- libcsf (used by PCRaster driver): doesn't seem to be packaged

- infback9: that code originally comes from the "contrib" part of zlib, and adds Deflate64 support (a non-backwards-compatible extension of Deflate, sometimes used by the Windows zipper, I believe). We don't have build support for using it as an external lib.

- degrib and g2clib (used by the GRIB driver): they originally came from third-party sources, but they aren't widely packaged and we have heavily patched them (there was no real possibility of collaborating with the authors of that software at the time when we needed to make those changes). We don't have build support for using them as external libraries. For better or worse, they should be considered as GDAL code now...

- hdf-eos (used by HDF4 driver): originally comes from a third-party source, but the GDAL copy was heavily patched a long time ago. We don't have build support for using it as an external lib. For better or worse, it should be considered as GDAL code now...


Thoughts? (Given the length of this email, it should probably be formalized as an RFC. I'll do that, unless there is a massive uprising against the proposal...)

Even

--
http://www.spatialys.com
My software is free, but my time generally not.
