Hi all, I've been seeing some odd behavior with the GDAL Python bindings: occasionally, what appear to be random blocks of a GeoTIFF I've pushed to S3 are left unwritten. A small block (or a few blocks) in one of the bands is all zeros while everything else is fine.
My setup is a thread pool crunching through gdal.Warp calls, while the main thread polls for completed jobs and uploads each finished file to S3. My theory is that Python's garbage collector hasn't actually destroyed the dataset I've set to None by the time I start uploading. Is this plausible? Calling FlushCache didn't solve the problem, and I'm not aware of another way via the Python bindings to ensure the dataset is closed. I'm on Ubuntu 19.10 (which ships GDAL 2.4.2). Any thoughts or ideas to try are greatly appreciated; as you can imagine, this is hard to reproduce.

The code looks something like this:

    from concurrent.futures import ThreadPoolExecutor, as_completed
    import boto3
    from osgeo import gdal

    def warp_tile(f_in, f_out, warp_opts):
        gdal_warp_opts = gdal.WarpOptions(**warp_opts,
                                          creationOptions=["TILED=YES", "COMPRESS=DEFLATE"])
        try:
            warp_ds = gdal.Warp(f_out, f_in, options=gdal_warp_opts)
            warp_ds.FlushCache()
        finally:
            # Drop the reference so the dataset gets closed (or so I hope).
            warp_ds = None

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        job_d = {}
        for job in jobs:
            job_d[executor.submit(warp_tile, job.in_f, job.out_f, job.warp_opts)] = job.out_f

        for future in as_completed(job_d):
            out_f = job_d[future]
            try:
                future.result()
            except Exception as e:
                ...
            else:
                boto3.resource('s3').Bucket(bucket_name).upload_file(Filename=out_f, Key=key)

Thanks,
Patrick
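In case it clarifies the question: one diagnostic I could bolt on (just an untested sketch; band_checksums and warp_tile_checked are made-up names, and Checksum() forces a full read of every block, so it isn't free) would be to record per-band checksums from the warp result inside the worker, then recompute them from the file on disk in the main thread right before the upload, to see whether the zeroed blocks are already on disk or only show up around upload time:

    from osgeo import gdal

    def band_checksums(ds_or_path):
        # Accepts either an already-open Dataset or a filename (hypothetical helper).
        ds = ds_or_path if isinstance(ds_or_path, gdal.Dataset) else gdal.Open(ds_or_path)
        sums = [ds.GetRasterBand(i).Checksum() for i in range(1, ds.RasterCount + 1)]
        ds = None
        return sums

    def warp_tile_checked(f_in, f_out, warp_opts):
        gdal_warp_opts = gdal.WarpOptions(**warp_opts,
                                          creationOptions=["TILED=YES", "COMPRESS=DEFLATE"])
        warp_ds = gdal.Warp(f_out, f_in, options=gdal_warp_opts)
        expected = band_checksums(warp_ds)   # checksum the result while it is still open
        warp_ds = None                       # drop the reference so the file gets closed
        return expected

    # ... and in the main thread, before upload_file():
    # expected = future.result()
    # if band_checksums(out_f) != expected:
    #     raise RuntimeError("file on disk does not match warp result: " + out_f)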