To partially appease the crowd: the data provider has since acknowledged the issue on their end and is working on a fix. Thankfully they are not one of those providers that take a month to respond with a shrug.

Cheers,
Daniel

On Tue, 10 Sept 2024 at 16:11, thomas bonfort <thomas.bonf...@gmail.com> wrote:

> I'm not sure that providing a fix to work around this very broken behavior
> is the best course of action to make them fix their server...
>
> On Tue, Sep 10, 2024 at 5:07 PM Even Rouault via gdal-dev <
> gdal-dev@lists.osgeo.org> wrote:
>
>> On 10/09/2024 at 16:10, Rahkonen Jukka via gdal-dev wrote:
>>
>> Hi,
>>
>> Have you tried with configuration option
>> “CPL_VSIL_CURL_USE_HEAD=[YES/NO]: Defaults to YES. Controls whether to use
>> a HEAD request when opening a remote URL.”
>>
>> I was just going to suggest that too. It "works", but not really. It just
>> postpones the core issue: the server doesn't support GET Range requests, so
>> it can't be used with /vsicurl/
>>
>> As it has a COG organization with overview data first in the file, if you
>> want to read the smallest overview(s), you can use /vsicurl_streaming/
>> instead, but that won't be efficient for reading the bottom-right-most tile
>> of the full resolution layer, which will require reading the whole file...
>>
>> Nothing GDAL can do about that.
>>
>> Actually... digging further... it somehow supports Range requests, but in
>> what I believe is a non-compliant way. It does return the expected content,
>> but returns HTTP 200 and not HTTP 206 (Partial Content). And it never
>> returns the Content-Length header.
>>
>> Well, I've implemented a workaround in
>> https://github.com/OSGeo/gdal/pull/10760 that might be useful in other
>> similar cases too.
>>
>> With that, the following works:
>>
>> gdal_translate
>> "/vsicurl?file_size=unlimited&url=https://data.source.coop/earthgenome/sentinel2-temporal-mosaics/20NMH_2024-04-01_2024-08-01/B08.tif"
>> --config GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR out.tif -srcwin 5000 5000
>> 50 50
>>
>> file_size=unlimited works here since the GTiff driver doesn't really need
>> to have the right file size; it only checks at some points that we don't
>> try to read beyond it, so unlimited is OK. In other situations/drivers, the
>> exact value could be needed.
>>
>> But they should really fix their servers
>>
>> Even
>>
>> -Jukka Rahkonen-
>>
>> *From:* gdal-dev <gdal-dev-boun...@lists.osgeo.org>
>> *On behalf of* Daniel Evans via gdal-dev
>> *Sent:* Tuesday, 10 September 2024 16:57
>> *To:* 'gdal-dev@lists.osgeo.org' (gdal-dev@lists.osgeo.org)
>> <gdal-dev@lists.osgeo.org>
>> *Subject:* [gdal-dev] Ignore content-length in vsicurl?
>>
>> Hi all,
>>
>> I am attempting to read a dataset via /vsicurl/ where I believe the
>> server is incorrectly returning `content-length: 0` in response to HEAD
>> requests. This causes GDAL to believe it's a zero-length file, and it
>> therefore can't be read.
>>
>> If I download the file via HTTP GET, it's valid, and GDAL can read it
>> locally. I've also confirmed I can use /vsicurl/ on some test datasets in
>> the GDAL repo.
>>
>> Is it possible to force GDAL to work around the faulty content-length
>> header, or is it too fundamental a problem to ignore?
>>
>> I've separately got in touch with the data provider to see if they are
>> able to fix the issue at their end.
>>
>> Cheers,
>>
>> Daniel
>>
>> URL of the troublesome dataset:
>>
>> https://data.source.coop/earthgenome/sentinel2-temporal-mosaics/20NMH_2024-04-01_2024-08-01/B08.tif
>>
>> Example HTTP header responses I'm seeing:
>>
>> GET
>>
>> HTTP/2 200
>> date: Tue, 10 Sep 2024 13:47:54 GMT
>> content-type: binary/octet-stream
>> content-length: 278198294
>> vary: Origin, Access-Control-Request-Method,
>> Access-Control-Request-Headers
>> etag: "a79f3f685281d6681e4d362536c5b3eb-34"
>> last-modified: Thu, 25 Jul 2024 13:16:08 GMT
>> x-version: 0.0.16
>> access-control-allow-credentials: true
>>
>> HEAD
>>
>> HTTP/2 200
>> date: Tue, 10 Sep 2024 13:48:08 GMT
>> content-type: binary/octet-stream
>> content-length: 0
>> x-version: 0.0.16
>> access-control-allow-credentials: true
>> etag: "a79f3f685281d6681e4d362536c5b3eb-34"
>> last-modified: Thu, 25 Jul 2024 13:16:08 GMT
>> vary: Origin, Access-Control-Request-Method,
>> Access-Control-Request-Headers
>>
>> _______________________________________________
>> gdal-dev mailing list
>> gdal-dev@lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>>
>> --
>> http://www.spatialys.com
>> My software is free, but my time generally not.
>
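
For anyone wanting to check whether another server shows the same behaviour described in the thread (the requested bytes are returned, but with HTTP 200 rather than 206), here is a rough Python sketch using only the standard library. The function names, the category labels and the 1024-byte probe window are my own choices for illustration, not anything GDAL does internally, and the classification is only a heuristic.

```python
import urllib.request


def classify_range_response(status, headers, body_len, requested_len):
    """Classify how a server answered a ranged GET (labels are illustrative).

    status: HTTP status code of the response.
    headers: mapping of lower-cased header names to values.
    body_len: number of body bytes actually received.
    requested_len: number of bytes asked for in the Range header.
    """
    if status == 206 and "content-range" in headers:
        # What a compliant server sends for a satisfiable range.
        return "compliant"
    if status == 200 and body_len == requested_len:
        # Only the requested bytes came back, but labelled as a full
        # response: the behaviour reported in this thread.
        return "partial-body-with-200"
    if status == 200:
        # More (or fewer) bytes than requested: the Range header was
        # most likely ignored and the whole file is being sent.
        return "range-ignored"
    return "unsupported"


def probe(url, first=0, last=1023, timeout=30):
    """Send one ranged GET to `url` and classify the answer.

    Note: urlopen raises HTTPError for non-2xx replies (e.g. 416),
    which this sketch does not catch.
    """
    requested_len = last - first + 1
    req = urllib.request.Request(
        url, headers={"Range": f"bytes={first}-{last}"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # Read one byte more than requested so that a full-file reply
        # can be told apart from a partial one.
        body = resp.read(requested_len + 1)
        headers = {k.lower(): v for k, v in resp.getheaders()}
        return classify_range_response(
            resp.status, headers, len(body), requested_len
        )
```

Calling probe() on the dataset URL above would send a single ranged request. The result will of course change once the provider fixes their server, so I am not quoting an expected output here.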