On Fri, Jun 15, 2018 at 10:43 PM, Even Rouault <even.roua...@spatialys.com> wrote:
>
> > Thinking about it, I do not want to support approximate statistics,
> > therefore something like STATISTICS_VALID_RATIO does not work for me, only
> > something like STATISTICS_N_VALID which requires exact statistics.
>
> STATISTICS_VALID_RATIO makes more sense to me than absolute number of pixels.

OK, considering that approximate statistics need to be supported, something
like STATISTICS_VALID_RATIO is the only option. Setting such a metadata item
would be relatively easy to implement.

> > Approximate statistics are confusing for users, unless it is made clear
> > that these statistics are approximations.
>
> It is known, since STATISTICS_APPROXIMATE=YES is now set if you compute
> approximate statistics.
>
> > Looking at random samples, the normal assumption must be
> > STATISTICS_APPROXIMATE=YES if STATISTICS_APPROXIMATE is not set. IMHO, GDAL
> > should set STATISTICS_APPROXIMATE=YES unless GDAL itself has computed exact
> > statistics.
>
> That's what GDAL 2.3.0 now does. Check the output of gdalinfo -stats vs
> gdalinfo -approx_stats.

I checked: the results of gdalinfo -stats are wrong, because existing
STATISTICS_* metadata are reported even when approximate statistics are not
allowed. The problem is that STATISTICS_APPROXIMATE is not set in that case.
Other software using GDAL to create raster datasets may call
GDALRasterBand::SetStatistics(), which does not indicate whether the stats
are approximations, i.e. the stats are approximations but there is no
STATISTICS_APPROXIMATE=YES.

GDAL assumes that STATISTICS_* metadata represent stats on all pixels; this
is IMHO wrong. You can only hope that STATISTICS_* metadata represent stats
on all pixels if a corresponding metadata item has been set to boolean true,
something like STATISTICS_ALL_PIXELS=YES. Even in this case, an option to
force recomputing raster band stats (i.e. to verify the metadata) would be
very nice to have.

STATISTICS_EXACT is not an option because there are different ways to
calculate mean and stddev from a fixed set of values. The different methods
are all correct (exact) in their own way, but their results may differ.

Markus
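
P.S. To illustrate the point about STATISTICS_EXACT, here is a minimal sketch
in plain Python. It is not GDAL code, and the function names and sample
values are made up for illustration. It computes stddev for one fixed set of
values in three common ways: the population formula (divide by N), the sample
formula (divide by N - 1), and a naive single-pass formula that can lose
precision in floating point.

```python
import math


def mean(values):
    """Arithmetic mean, using fsum to limit rounding error."""
    return math.fsum(values) / len(values)


def stddev_population(values):
    """Two-pass population standard deviation (divide by N)."""
    m = mean(values)
    return math.sqrt(math.fsum((v - m) ** 2 for v in values) / len(values))


def stddev_sample(values):
    """Two-pass sample standard deviation (divide by N - 1)."""
    m = mean(values)
    return math.sqrt(math.fsum((v - m) ** 2 for v in values) / (len(values) - 1))


def stddev_one_pass(values):
    """Naive single-pass formula sqrt(E[x^2] - E[x]^2).

    Algebraically equal to the population formula, but it subtracts two
    large numbers and may therefore differ in the last digits (or more)
    when the values are large relative to their spread.
    """
    n = len(values)
    total = 0.0
    total_sq = 0.0
    for v in values:
        total += v
        total_sq += v * v
    variance = total_sq / n - (total / n) ** 2
    # Rounding can push the variance slightly below zero; clamp it.
    return math.sqrt(max(0.0, variance))


# Hypothetical pixel values, chosen only for illustration.
pixels = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

pop = stddev_population(pixels)
smp = stddev_sample(pixels)
one = stddev_one_pass(pixels)

print("population:", pop)
print("sample:    ", smp)
print("one-pass:  ", one)
```

All three results can be defended as "exact" for the same pixels, yet the
sample stddev is always larger than the population stddev, and the one-pass
result may drift from the two-pass one for large pixel values. A flag that
only says "exact" would not tell you which of these was used.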
_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev