You probably know this, but there is an option to let gdalwarp use more
cores: -wo NUM_THREADS=ALL_CPUS. It gives some improvement, but nothing
staggering. Splitting operations up over individual tiles would really
speed things up. Even if I use only one VM, I can define 32 cores, and
it would certainly be interesting to experiment with tools like MPI to
integrate multiple VMs into one computing cluster.
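
To make the tile idea concrete, here is a minimal Python sketch, not a
tested workflow. It splits a target extent into a grid and launches one
gdalwarp process per tile, using the standard -te and -t_srs options
plus the -wo NUM_THREADS option mentioned above. The helper names
(tile_extents, warp_command, warp_tiles), the tile_N.tif output names
and the worker count are made up for illustration, and gdalwarp is
assumed to be on the PATH.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def tile_extents(xmin, ymin, xmax, ymax, nx, ny):
    """Split an extent into nx * ny equally sized tile extents (row by row)."""
    dx = (xmax - xmin) / nx
    dy = (ymax - ymin) / ny
    return [
        (xmin + i * dx, ymin + j * dy, xmin + (i + 1) * dx, ymin + (j + 1) * dy)
        for j in range(ny)
        for i in range(nx)
    ]


def warp_command(src, dst, t_srs, extent, num_threads="ALL_CPUS"):
    """Build a gdalwarp call limited to one output tile (-te is in target SRS units)."""
    return [
        "gdalwarp",
        "-t_srs", t_srs,
        "-te", *(str(v) for v in extent),
        "-wo", f"NUM_THREADS={num_threads}",
        src, dst,
    ]


def warp_tiles(src, t_srs, extent, nx, ny, workers=4):
    """Run one gdalwarp process per tile; the Python threads only wait on them."""
    cmds = [
        # Hypothetical output naming scheme: tile_0.tif, tile_1.tif, ...
        warp_command(src, f"tile_{k}.tif", t_srs, ext)
        for k, ext in enumerate(tile_extents(*extent, nx, ny))
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda cmd: subprocess.run(cmd, check=True), cmds))
    return [cmd[-1] for cmd in cmds]
```

Tile edges should probably be snapped to the output pixel grid (for
example with a fixed -tr), otherwise seams can appear when the tiles are
mosaicked again. With several processes running at once, a small
num_threads value per process may avoid oversubscribing the cores.
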
Jan
On 01/12/2013 02:38 AM, Kennedy, Paul wrote:
Hi,
Yes, we are pretty sure we will see a significant benefit. The
processing algorithms are CPU bound, not I/O bound. Our digital terrain
model interpolations often run for many hours (we do them overnight),
but the underlying file is only a few gigabytes. If we split it into
multiple tile files and run each in a dedicated process, the whole
thing is quicker, but this is messy and results in stitching errors.
Another example is gdalwarp. It takes quite some time with a large
data set and would be a good candidate for parallelisation, as would
gdaladdo.
I believe slower cores, but more of them, are the future for PCs. My PC
has 8, but they are rarely used to their potential.
I am certain there are some challenges here; that's why it is
interesting ;)
Regards
pk
On 11/01/2013, at 6:54 PM, "Even Rouault"
<even.roua...@mines-paris.org> wrote:
Re: [gdal-dev] does gdal support multiple simultaneous writers to raster
Hi,
This is an interesting topic, with many "intersecting" issues to deal
with at different levels.
First, are you confident that, in the use cases you imagine, I/O access
won't be the limiting factor? In that case, serialization of I/O could
be acceptable and would just require an API with a dataset-level mutex.
There are several places where parallel write should be addressed:
- The GDAL core mechanisms that deal with the block cache
- Each GDAL driver where parallel write would be supported. I guess
  that GDAL drivers should advertise a specific capability
- The low-level library used by the driver. In the case of GDAL's
  GeoTIFF driver, libtiff
And finally, as Frank underlined, there are intrinsic limitations due
to the format itself. For a compressed TIFF, at some point, you have to
serialize the writing of the tile, because you cannot know in advance
the size of the compressed data, or at least have some coordination of
the writers so that a "next offset available" is properly synchronized
between them. The compression itself could be serialized.
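
The "next offset available" coordination can be sketched in a few lines
of Python. This is only an illustration of the idea, not how libtiff or
the GeoTIFF driver work: the AppendOnlyTileWriter class and its index
are hypothetical, zlib stands in for the TIFF codec, and the sketch
assumes that compression happens outside the lock, so that only the
offset reservation and the write itself are serialized.

```python
import io
import threading
import zlib
from concurrent.futures import ThreadPoolExecutor


class AppendOnlyTileWriter:
    """Hypothetical writer: tiles are compressed concurrently, appended one at a time."""

    def __init__(self, fileobj):
        self._f = fileobj
        self._lock = threading.Lock()
        self._next_offset = 0
        self.index = {}  # tile id -> (offset, byte count)

    def write_tile(self, tile_id, raw):
        # The compressed size is only known after this call returns.
        data = zlib.compress(raw)
        # Serialized section: reserve the next free offset and write there.
        with self._lock:
            offset = self._next_offset
            self._next_offset += len(data)
            self._f.seek(offset)
            self._f.write(data)
            self.index[tile_id] = (offset, len(data))


def demo():
    """Write a few tiles of different sizes from a pool of threads."""
    buf = io.BytesIO()
    writer = AppendOnlyTileWriter(buf)
    tiles = {i: bytes([i]) * (1000 * (i + 1)) for i in range(8)}
    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(lambda item: writer.write_tile(*item), tiles.items()))
    return buf.getvalue(), writer.index, tiles
```

The index plays the role that the tile offset and byte count arrays
play in a TIFF: readers need it to find each tile, since the tiles end
up in whatever order the writers finished.
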
I'm not sure, however, whether what Jan mentioned, different processes
writing the same dataset, is doable.
_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev