Hi,

Am I right that .idx is an attribute index? Going by the documentation, it
seems somewhat odd:
" Currently the OGR Shapefile driver only supports attribute indexes for 
looking up specific values in a unique key column. To create an attribute index 
for a column issue an SQL command of the form "CREATE INDEX ON tablename USING 
fieldname". To drop the attribute indexes issue a command of the form "DROP 
INDEX ON tablename". The attribute index will accelerate WHERE clause searches 
of the form "fieldname = value". The attribute index is actually stored as a 
mapinfo format index and is not compatible with any other shapefile 
applications."

Restoring .SHX is not related at all.

-Jukka Rahkonen-

-----Original message-----
From: gdal-dev <gdal-dev-boun...@lists.osgeo.org> On behalf of Andrea 
Giudiceandrea via gdal-dev
Sent: Monday, 15 May 2023 16:34
To: gdal-dev@lists.osgeo.org
Subject: [gdal-dev] Shapefile with corrupted index: SHAPE_RESTORE_SHX=YES doesn't 
correctly repairs it.

Hi devs,
in a recent QGIS issue report at
https://github.com/qgis/QGIS/issues/53058 , a user complains about an ESRI 
Shapefile layer that was corrupted after an attribute value was changed and the 
edit was saved. QGIS opens the corrupted layer without reporting any error or 
warning, but it shows only a subset of the original feature geometries: 
many records now have a null geometry associated, so they cannot be 
displayed.

After some investigation, although I don't know why or how the layer was 
corrupted, it seems to me that the issue is mostly due to corruption of the 
.idx file: for various records it contains incorrect values for the index 
and length of the record. This causes those records and the following ones 
to be read incorrectly, until the index in the .idx file and the data in 
the .shp file line up again.

When the QGIS "Repair Shapefile" processing algorithm is run against this 
layer, the algorithm fails: the .idx file is actually updated, but the layer 
becomes totally invalid and can no longer be loaded in QGIS. The same 
happens when using ogrinfo directly after deleting the .idx file and setting 
the SHAPE_RESTORE_SHX variable to YES: the .idx file is recreated, but the 
layer becomes unreadable by both QGIS and ogrinfo.

Inspecting the .idx file created by ogrinfo with SHAPE_RESTORE_SHX=YES (which 
is the same as the one created by the QGIS tool "Repair Shapefile"), it seems 
to me that ogr fails to create the .idx file properly:
in the index file header it incorrectly stores the total length in 16-bit 
words of the .shp file instead of the total length in 16-bit words of the .idx 
file itself.
In this particular case,
it stores the incorrect value 00 29 2A C2 = 2697922 16-bit words =
5395844 bytes
instead of the correct value 00 02 1D 26 = 138534 16-bit words = 277068 bytes
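
The arithmetic above can be checked with a short Python sketch. It assumes the
index file follows the standard shapefile index layout (a 100-byte header, the
file length stored as a big-endian 32-bit integer counted in 16-bit words, and
one 8-byte entry per record). The helper names are made up for illustration,
and the record count is derived from the figures above, not taken from the
report.

```python
import struct

HEADER_BYTES = 100  # fixed header size of a shapefile index file (assumed layout)
RECORD_BYTES = 8    # one index entry: 4-byte offset + 4-byte content length


def decode_length_words(raw: bytes) -> int:
    """Decode the 4-byte big-endian file-length field (unit: 16-bit words)."""
    return struct.unpack(">i", raw)[0]


def expected_index_words(record_count: int) -> int:
    """Length an index file with record_count entries should declare."""
    return (HEADER_BYTES + RECORD_BYTES * record_count) // 2


stored = decode_length_words(bytes.fromhex("00292AC2"))   # value written by the repair
correct = decode_length_words(bytes.fromhex("00021D26"))  # value that makes the layer load

print(stored, "words =", stored * 2, "bytes")
print(correct, "words =", correct * 2, "bytes")

# Record count implied by the correct length (derived, not stated in the report).
records = (correct * 2 - HEADER_BYTES) // RECORD_BYTES
print("implied records:", records)
print("consistent:", expected_index_words(records) == correct)
```

The correct value leaves a whole number of 8-byte entries after the header,
which is what one would expect if it really is the length of the index file.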

Changing the incorrect value to the correct one in the repaired .idx file 
makes the layer valid again and shows the previously missing feature 
geometries again (with only some glitches and a missing record).

This behaviour seems weird to me, as I remember that the Repair Shapefile tool 
or the SHAPE_RESTORE_SHX=YES setting worked well to repair Shapefiles with 
corrupted index in the past.

Maybe the issue in this particular Shapefile prevents ogr from correctly 
repairing the index?
For comparison, the old "Shape Checker utility" succeeds in repairing the .idx 
file: it creates the same .idx file as the one created by ogr, apart from the 
total file length value, which is correct.

Any clue as to what may have gone wrong during the layer editing in QGIS that 
eventually corrupted the layer?


Best regards.

Andrea Giudiceandrea
_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev