Another thing that might speed up access is setting the config option
GDAL_DISABLE_READDIR_ON_OPEN=TRUE, either as an environment variable or
on the command line. That should stop GDAL from reading the directory
listing each time it opens a dataset. I have an application that reads
one value from each of a large number of datasets, and setting this
option made it run about 3 times faster.

Luke
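
For illustration, here is a rough sketch of both ways of passing the option when gdallocationinfo is driven from a script. Python is only an assumption here, and build_command and env_with_option are hypothetical helper names, not part of GDAL. The --config KEY VALUE switch is the general GDAL command-line option; the dataset path and coordinates are the ones from the quoted test requests.

```python
import os

# Dataset path taken from the quoted test requests; it may no longer be online.
DEM_URL = "/vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.tif"


def build_command(dataset, x, y, disable_readdir=True):
    """Build a gdallocationinfo argument list (hypothetical helper).

    With disable_readdir=True the option is passed through GDAL's
    general "--config KEY VALUE" command-line switch.
    """
    cmd = ["gdallocationinfo"]
    if disable_readdir:
        cmd += ["--config", "GDAL_DISABLE_READDIR_ON_OPEN", "TRUE"]
    cmd += [dataset, "-geoloc", str(x), str(y)]
    return cmd


def env_with_option(base_env=None):
    """Return a copy of the environment with the option set (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    env["GDAL_DISABLE_READDIR_ON_OPEN"] = "TRUE"
    return env


if __name__ == "__main__":
    # Only prints the command; nothing is executed and no network is touched.
    print(" ".join(build_command(DEM_URL, 389559, 6677412)))
```

Either result can be handed to subprocess.run, for example subprocess.run(build_command(DEM_URL, 389559, 6677412)) or subprocess.run(cmd, env=env_with_option()). The 3x speed-up above was measured in my own application, so treat it as an indication only.
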

On Sun, Feb 2, 2014 at 2:12 PM, Jukka Rahkonen <jukka.rahko...@mmmtike.fi> wrote:

> Hi,
>
> I made a few tests and here are my conclusions. The hypothesis is that
> someone wants to make a DEM query service which uses gdallocationinfo
> for queries, with the DEM data accessed as files from a standard web
> site. I compared three alternatives:
> 1) There are thousands of DEM files on the server and they are
>    combined with a VRT file.
> 2) There is only one DEM file, a BigTIFF.
> 3) The DEM is split into tiles in an x/y/z tile directory structure,
>    as in Google Maps or OpenStreetMap tiles.
>
> My test data covers Finland with a 10 m grid size; as deflate-compressed
> TIFFs the files make about 10 GB together.
>
> Before going on, keep in mind that speed needs indexes. The better the
> index, the less unnecessary data there is to read. In case 1) the
> first-level index is the VRT file. The second-level index, if it
> exists, is in the headers of the real DEM files. It may be possible to
> jump to the correct offset from the beginning of the DEM data and read
> only a part of the file. In case 2) the index is in the internal TIFF
> directory. If the BigTIFF is tiled, access to the tiles should be
> rather effective. And finally, in case 3) the index is built into the
> directory structure and tiling schema that is used for saving the
> tiles. The schema is so well known that tile map service clients can
> directly ask for a certain file name if they know the coordinates and
> scale.
>
> Conclusions:
>
> 1)
> - The whole VRT file must be read. Caching the VRT file would make
>   subsequent requests faster.
> - For some reason gdallocationinfo wants to get the listing of the
>   directory where the VRT file is. This is slow and generates lots of
>   traffic if the thousands of DEM files are in the same directory. It
>   would probably be faster to keep them in another directory.
>
> 2)
> - The BigTIFF route is more straightforward, but gdallocationinfo
>   still needs to do many big range reads.
> - Also in this case gdallocationinfo reads the directory of the target
>   file. It would be good to keep this directory small. Don't do as I
>   did: my directory held the BigTIFF DEM file, which was the only file
>   needed, but also the VRT and the thousands of original DEMs from the
>   previous test. At least this is a known issue now and I know how to
>   avoid it. In my case reading the directory generated 2.2 MB of web
>   traffic, all or most of it in vain.
>
> 3)
> - I used the OpenStreetMap tile service as the test data for the third
>   test. In this case gdallocationinfo knows exactly which tile to
>   request and it makes only one request. It also seems to cache some
>   tiles on the client side, which means that queries for nearby
>   locations may hit the cached tile and be very fast.
>
> Summary statistics:
>
> 1) gdallocationinfo makes 6 requests and reads 6.4 MB of data
> 2) gdallocationinfo makes 8 requests and reads 4.3 MB of data
> 3) gdallocationinfo makes 1 request and reads 10 kB of data
>
> The requests I used are these:
>
> 1)
> gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt -geoloc 389559 6677412
> 2)
> gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.tif -geoloc 389559 6677412
> 3)
> gdallocationinfo frmt_wms_openstreetmap_tms.xml -geoloc 389559 6677412
>
> I know that the queried place in 3) is not the same, because the SRIDs
> of the data differ, and that OSM does not return 16-bit DEM heights
> but 3-band RGB values instead. That does not matter here; the idea is
> what is important.
>
> My conclusion is that you should cut your DEM into tiles with, for
> example, gdal2tiles or MapTiler. The result could actually be quite
> speedy, and perhaps using 126x126 tiles could make it a bit faster
> still. I hope that they can create tiles as 16-bit TIFFs.
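
To make the "index built into the directory structure" point concrete: with a tile pyramid the client can compute the file name itself, so a single request is enough. The sketch below assumes the common OSM/Google XYZ scheme (Web Mercator, tile origin at the top left) with a zoom/x/y directory layout and WGS84 longitude/latitude input. The function names and the .tif extension are made up for illustration, and the projected test coordinates used above would have to be reprojected to longitude/latitude first.

```python
import math


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return (x, y) of the XYZ tile that contains a WGS84 point.

    Assumes the OSM/Google tiling scheme: 2**zoom tiles per axis,
    x growing eastwards and y growing southwards from the top left.
    """
    n = 2 ** zoom
    lat_rad = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the outer edge still map to a valid tile.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


def tile_path(lon_deg, lat_deg, zoom, ext="tif"):
    """Build a relative zoom/x/y.ext path (the layout is an assumption)."""
    x, y = lonlat_to_tile(lon_deg, lat_deg, zoom)
    return f"{zoom}/{x}/{y}.{ext}"
```

No directory listing and no header read is needed to find the right file, which would be consistent with the single 10 kB request seen in test 3).
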
>
> I am sure that these results are not scientifically sound, but I am
> also sure that the difference between 6.4 MB / 4.3 MB / 10 kB is
> something to think about, especially if you dream about a mobile
> service.
>
> I placed the requests which gdallocationinfo made during these tests
> at http://latuviitta.org/documents/gdallocationinfo_requests.txt
>
> -Jukka Rahkonen-
>
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev@lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev