Re: [gdal-dev] gdal_polygonize.py TIF to JSON performance

2015-01-19 Thread Graeme B. Bell
>> Whenever you deal with national scale data for any country with coastline, you frequently end up with an absolutely gigantic and horrifically complex single polygon which depicts the coastline and all the rivers throughout the country as a single continuous edge. This mega-polygon, …

Re: [gdal-dev] gdal_polygonize.py TIF to JSON performance

2015-01-15 Thread David Strip
On 1/13/2015 2:37 AM, Graeme B. Bell wrote: > Whenever you deal with national scale data for any country with coastline, you frequently end up with an absolutely gigantic and horrifically complex single polygon which depicts the coastline and all the rivers throughout the country as a single continuous edge …

Re: [gdal-dev] gdal_polygonize.py TIF to JSON performance

2015-01-14 Thread David Strip
I ran a test case on my Windows 7 laptop (i7, quad core (not that it matters), 2.4 GHz, 8 GB RAM). The input file was a GeoTIFF, 29847x33432, paletted 8-bit, 11 landcover classes. This dataset covers the city limits of Philadelphia, PA, so the polygon representing the Delaware River runs approximately from …

Re: [gdal-dev] gdal_polygonize.py TIF to JSON performance

2015-01-13 Thread Jukka Rahkonen
Graeme B. Bell <…@skogoglandskap.no> writes: > It would be great if the people behind gdal_polygonize could put some thought into this extremely common situation for anyone working with country or continent scale rasters to make sure that it is handled well. It has certainly affected us a great deal …

Re: [gdal-dev] gdal_polygonize.py TIF to JSON performance

2015-01-13 Thread Graeme B. Bell
>> The reason for so many reads (though 2.3 seconds out of "a few hours" is negligible overhead) is that the algorithm operates on a pair of adjacent raster lines at a time. This allows processing of extremely large images with very modest memory requirements. It's been a while since I …
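A minimal sketch of the two-scanline access pattern described above, assuming the GDAL Python bindings; the filename is a placeholder, and the real GDALPolygonize additionally carries polygon fragments from row to row and merges them:

    from osgeo import gdal

    ds = gdal.Open("input.tif")        # placeholder filename
    band = ds.GetRasterBand(1)
    xsize, ysize = band.XSize, band.YSize

    # Only two scanlines are resident at any time, which is why memory
    # stays modest regardless of how large the raster is.
    prev = band.ReadAsArray(0, 0, xsize, 1)
    for row in range(1, ysize):
        cur = band.ReadAsArray(0, row, xsize, 1)
        # ... compare prev and cur here to extend or close polygon
        # boundaries; the real algorithm also merges per-row fragments ...
        prev = cur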

Re: [gdal-dev] gdal_polygonize.py TIF to JSON performance

2015-01-12 Thread chris snow
Hi David, Thanks for the response. I'll feed your question about converting the shapefile to GeoJSON back to the team. In the meantime, I have also received some more info on your previous questions: "The input file was 1.4 GB, the output GeoJSON was around 17 GB IIRC. The raster file contains a …

Re: [gdal-dev] gdal_polygonize.py TIF to JSON performance

2015-01-12 Thread David Strip
Your team writes that the image is usually exported as a vector file, e.g. a shapefile. Can they do this successfully for the 1.4 GB image? If so, have you tried just converting the shapefile to GeoJSON? That might be the simplest solution. If that doesn't work, you could try tiling, as you mention. As Even …
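For reference, the conversion suggested above is a one-liner through the GDAL Python bindings (gdal.VectorTranslate needs GDAL >= 2.1; older builds can use the equivalent ogr2ogr command line; filenames are placeholders):

    from osgeo import gdal

    # Equivalent to: ogr2ogr -f GeoJSON output.geojson input.shp
    gdal.VectorTranslate("output.geojson", "input.shp", format="GeoJSON")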

Re: [gdal-dev] gdal_polygonize.py TIF to JSON performance

2015-01-12 Thread Even Rouault
Chris, As underlined by David, the time spent in raster I/O is presumably negligible and not the issue here. How many polygons were generated in this execution? A good way of identifying a bottleneck is to run the process under gdb and regularly break with Ctrl+C and display the backtrace, and t …
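This is the classic "poor man's profiler": interrupt the process a few times and see where the C-level stack usually is. A sketch of such a session, with placeholder paths:

    gdb --args python /usr/bin/gdal_polygonize.py input.tif -f GeoJSON out.json
    (gdb) run
    ... wait a while, then press Ctrl+C ...
    (gdb) bt          # frames that recur across samples mark the bottleneck
    (gdb) continue    # repeat the Ctrl+C / bt cycle a few times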

Re: [gdal-dev] gdal_polygonize.py TIF to JSON performance

2015-01-12 Thread chris snow
Hi David, Thanks for your response. I have a little more information since feeding your response to the project team: "The TIF file is around 1.4 GB as you noted, and the data is similar to the result of an image classification where each pixel value is in a range of (say) 1-5. After …

Re: [gdal-dev] gdal_polygonize.py TIF to JSON performance

2015-01-11 Thread David Strip
I'm surprised at your colleague's experience. We've run polygonize on some large images and have never had this problem. The g2.2xlarge instance is overkill in the sense that the code is not multi-threaded, so the extra CPUs don't help. Also, as you have already determined …
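Since the polygonize code is single-threaded, the only way those extra vCPUs could help is coarse parallelism over tiles. A hedged sketch, assuming GDAL >= 2.1 Python bindings; the tile size and filenames are placeholders, and polygons crossing tile edges would still need a merge pass afterwards:

    from multiprocessing import Pool
    from osgeo import gdal, ogr

    SRC = "input.tif"   # placeholder filename
    TILE = 4096         # assumed tile size in pixels

    def polygonize_tile(args):
        xoff, yoff, xs, ys = args
        name = f"tile_{xoff}_{yoff}"
        # Cut one tile into an in-memory raster, then polygonize it.
        tile = gdal.Translate(f"/vsimem/{name}.tif", SRC,
                              srcWin=[xoff, yoff, xs, ys])
        band = tile.GetRasterBand(1)
        dst = ogr.GetDriverByName("GeoJSON").CreateDataSource(f"{name}.geojson")
        layer = dst.CreateLayer(name, geom_type=ogr.wkbPolygon)
        layer.CreateField(ogr.FieldDefn("DN", ogr.OFTInteger))
        gdal.Polygonize(band, None, layer, 0)
        dst = None      # close and flush the tile's GeoJSON
        tile = None

    if __name__ == "__main__":
        ds = gdal.Open(SRC)
        xsize, ysize = ds.RasterXSize, ds.RasterYSize
        jobs = [(x, y, min(TILE, xsize - x), min(TILE, ysize - y))
                for y in range(0, ysize, TILE)
                for x in range(0, xsize, TILE)]
        with Pool() as pool:
            pool.map(polygonize_tile, jobs)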

[gdal-dev] gdal_polygonize.py TIF to JSON performance

2015-01-11 Thread chris snow
I have been informed by a colleague attempting to convert a 1.4 GB TIF file using gdal_polygonize.py on a g2.2xlarge Amazon instance (8 vCPU, 15 GB RAM) that the processing took over 2 weeks running constantly. I have also been told that the same conversion using commercial tooling was completed in …
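For context, gdal_polygonize.py is a thin wrapper over GDALPolygonize. A minimal sketch of the equivalent call through the Python bindings, with placeholder filenames and no nodata/mask handling:

    from osgeo import gdal, ogr

    src = gdal.Open("input.tif")       # placeholder filename
    band = src.GetRasterBand(1)

    dst = ogr.GetDriverByName("GeoJSON").CreateDataSource("output.json")
    layer = dst.CreateLayer("out", geom_type=ogr.wkbPolygon)
    layer.CreateField(ogr.FieldDefn("DN", ogr.OFTInteger))

    # Trace polygons around connected regions of equal pixel value,
    # writing each region's value into field 0 ("DN"); None = no mask band.
    gdal.Polygonize(band, None, layer, 0)
    dst = None                         # close and flush the output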