Just wanted to follow up on this because I didn't want to leave the impression 
that the issue was resolved, or limited to jpeg compression. It seems to be an 
issue with writing the internal mask band.

When I create a mask band in a large LZW-compressed or JPEG-compressed TIFF 
using the COG driver, processing takes dramatically longer than writing RGBA 
(hours instead of minutes). So the issue is not JPEG compression; it's the 
creation of the mask band. Steps to reproduce:

  1.  Take a decent-sized RGBA LZW TIFF
  2.  Generate an LZW COG with -b 1 -b 2 -b 3 -b 4 -config
      GDAL_TIFF_INTERNAL_MASK YES and time it
  3.  Generate an LZW COG with -b 1 -b 2 -b 3 -mask 4 -config
      GDAL_TIFF_INTERNAL_MASK YES and time it
  4.  Compare times. When I do this with a source file that's a 102600x91100
      RGBA COG, step 2 takes about 2 minutes and step 3 takes about 120
      minutes
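
The steps above can be sketched as a small timing script. This is only a rough 
sketch: SRC, the output file names, and the helper names cog_cmd and time_cmd 
are placeholders of mine, the creation options are cut down to COMPRESS=LZW, 
and it uses the --config spelling documented for the GDAL utilities rather 
than the -config spelling shown in the steps.

```shell
#!/bin/sh
# Sketch only: SRC, the output names, and both helper names are placeholders.

# Print the gdal_translate command line for one band selection.
# $1 = source, $2 = output, $3 = band switches (e.g. "-b 1 -b 2 -b 3 -mask 4")
# Note: paths containing spaces are not handled by this simple string form.
cog_cmd() {
    printf 'gdal_translate %s %s %s -of COG -co COMPRESS=LZW --config GDAL_TIFF_INTERNAL_MASK YES\n' "$1" "$2" "$3"
}

# Run one command string and report elapsed wall-clock seconds.
time_cmd() {
    start=$(date +%s)
    sh -c "$1"
    end=$(date +%s)
    echo "$((end - start)) s: $1"
}

# Only run the conversions when a source file is supplied via SRC.
if [ -n "${SRC:-}" ]; then
    time_cmd "$(cog_cmd "$SRC" rgba_cog.tif "-b 1 -b 2 -b 3 -b 4")"
    time_cmd "$(cog_cmd "$SRC" rgb_mask_cog.tif "-b 1 -b 2 -b 3 -mask 4")"
fi
```

Running it as SRC=input.tif sh time_cog.sh (both names hypothetical) would 
print one elapsed-time line per conversion, so the two runs can be compared 
directly.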

I also noticed that the -co BIGTIFF=NO option appears to be ignored in the COG 
driver. I can share a file if that's helpful (I cannot provide a link on the 
listserv).

Is there a faster way to generate an external nodata mask and then add it? 
From reading the GTiff and COG format notes on internal masks it wasn't clear, 
and I didn't see a way to specify copying masks in the COG driver.

From: Ritchie, Andrew C
Sent: Wednesday, April 15, 2020 12:26 PM
To: Even Rouault <even.roua...@spatialys.com>; gdal-dev@lists.osgeo.org
Subject: RE: [EXTERNAL] Re: [gdal-dev] gdal_translate (3.1.0dev) "never" 
finishes on large jpeg cogs... REALLLLLY long time to unload.

Hi Even,

Thanks for the quick response! The source dataset is a LZW cog with RGBA, and I 
confirmed (I think) that the issue was the mask layer by playing with the 
switches I used to generate the LZW cog - I didn't even have to do a JPEG COG. 
I can cause the same, or very similar behavior, by changing from:

-b 1 -b 2 -b 3 -b 4
to:
-b 1 -b 2 -b 3 -mask 4

with GDAL_TIFF_INTERNAL_MASK YES.

With the -b 4 switch (or omitting all -b and -mask switches), I get LZW cogs in 
2 minutes. With -mask 4 I get hung up at 20% with directory thrashing messages 
in debug for at least 30 minutes, and I'm guessing I'll get the same behavior 
at the "done" message if I care to wait.

Below are the two configurations that show such a difference in performance for 
me. I didn't play around with CACHEMAX or MAX_DATASET_POOL_SIZE, was trying to 
keep it simple.

2 minute TIFFs:
gdal_translate <infile> <outfile> -b 1 -b 2 -b 3 -b 4 -of COG -co COMPRESS=LZW 
-co PREDICTOR=2 -co NUM_THREADS=ALL_CPUS -co RESAMPLING=AVERAGE -config 
GDAL_TIFF_INTERNAL_MASK YES -config GDAL_TIF_OVR_BLOCKSIZE 128

A couple orders of magnitude longer:
gdal_translate <infile> <outfile> -b 1 -b 2 -b 3 -mask 4 -of COG -co 
COMPRESS=LZW -co PREDICTOR=2 -co NUM_THREADS=ALL_CPUS -co RESAMPLING=AVERAGE 
-config GDAL_TIFF_INTERNAL_MASK YES -config GDAL_TIF_OVR_BLOCKSIZE 128

From: Even Rouault <even.roua...@spatialys.com>
Sent: Wednesday, April 15, 2020 4:38 AM
To: gdal-dev@lists.osgeo.org
Cc: Ritchie, Andrew C <aritc...@usgs.gov>
Subject: [EXTERNAL] Re: [gdal-dev] gdal_translate (3.1.0dev) "never" finishes 
on large jpeg cogs... REALLLLLY long time to unload.


Andrew,

Does your source raster have an alpha band? That could explain the difference, 
since it isn't possible to directly create a YCbCrA JPEG-compressed file, so a 
mask band must be created internally instead. However, I wouldn't anticipate 
such a huge difference in performance between compression schemes. I would 
suggest not setting GDAL_CACHEMAX at all and leaving it at its 5% default 
(increasing it is not always a good idea), in case the slowdown comes from 
de-allocating cached blocks.
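
A rough sketch of those two checks, assuming gdalinfo is on the PATH and that 
the cache size was set through the environment rather than with a --config 
switch. SRC and the helper name has_alpha are placeholders.

```shell
#!/bin/sh
# Sketch only: SRC and has_alpha are placeholder names.

# Succeeds if the gdalinfo report read from stdin lists an alpha band.
has_alpha() {
    grep -q 'ColorInterp=Alpha'
}

# Drop any GDAL_CACHEMAX override from the environment so GDAL uses its default.
unset GDAL_CACHEMAX

# Only inspect a file when one is supplied via SRC.
if [ -n "${SRC:-}" ]; then
    if gdalinfo "$SRC" | has_alpha; then
        echo "alpha band present"
    else
        echo "no alpha band"
    fi
fi
```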

Even

--

Spatialys - Geospatial professional services

http://www.spatialys.com
