Hi,

Out of curiosity, if you isolate the .shp, .dbf and .shx files (make a copy) in a separate folder, is the data still corrupt?
Rob

On Mon, May 15, 2023 at 6:35 AM Andrea Giudiceandrea via gdal-dev <gdal-dev@lists.osgeo.org> wrote:

> Hi devs,
> in a recent QGIS issue report at
> https://github.com/qgis/QGIS/issues/53058 , a user complains about an
> ESRI Shapefile layer that was corrupted after an attribute value was
> changed and the edit was saved. QGIS opens the corrupted layer without
> reporting any errors or warnings, but it shows only a subset of the
> original feature geometries: many records now have a null geometry
> associated, so they cannot be displayed.
>
> After some investigation, although I don't know why and how the layer
> was corrupted, it seems to me that the issue is mostly due to a
> corruption of the .shx file: for various records it contains incorrect
> values for the offset and length of the record. This causes those
> records and the following ones to be read incorrectly, until the
> index in the .shx file and the data in the .shp file line up again.
>
> Running the QGIS "Repair Shapefile" processing algorithm against this
> layer, the algorithm fails: the .shx file is actually updated, but
> the layer becomes totally invalid and it is not possible to load it in
> QGIS. The same happens when using ogrinfo directly after deleting the
> .shx file and setting the SHAPE_RESTORE_SHX variable to YES: the .shx
> file is recreated, but the layer becomes unreadable by both QGIS and
> ogrinfo.
>
> Inspecting the .shx file created by ogrinfo with SHAPE_RESTORE_SHX=YES
> (which is the same as the one created by the QGIS tool "Repair
> Shapefile"), it seems to me that OGR fails to properly create the .shx
> file: it incorrectly stores, in the index file header, the total length
> in 16-bit words of the .shp file instead of the total length in 16-bit
> words of the .shx file itself.
>
> In this particular case,
> it stores the incorrect value 00 29 2A C2 = 2697922 16-bit words =
> 5395844 bytes
> instead of the correct value 00 02 1D 26 = 138534 16-bit words = 277068
> bytes
>
> Changing this incorrect value to the correct one in the repaired .shx
> file makes the layer valid again and shows the previously missing
> feature geometries (with only some glitches and a missing record).
>
> This behaviour seems weird to me, as I remember that the Repair
> Shapefile tool or the SHAPE_RESTORE_SHX=YES setting worked well to
> repair Shapefiles with a corrupted index in the past.
>
> Maybe the issue in this particular Shapefile prevents OGR from
> correctly repairing the index?
> For comparison, the old "Shape Checker utility" succeeds in repairing
> the .shx file: it creates the same .shx file as the one created by OGR,
> apart from the total file length value, which is correct.
>
> Any clue as to what may have gone wrong during the layer editing in QGIS
> that eventually corrupted the layer?
>
>
> Best regards.
>
> Andrea Giudiceandrea
> _______________________________________________
> gdal-dev mailing list
> gdal-dev@lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev
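
For anyone who wants to check the header field discussed above, here is a minimal Python sketch. It is not taken from GDAL or QGIS. It assumes the layout given in the ESRI Shapefile specification: a 100-byte header whose bytes 24-27 hold the file length as a big-endian 32-bit integer counted in 16-bit words. The function names are hypothetical, and the sketch only corrects the length field; it does not validate or rebuild the per-record offsets.

```python
import os
import struct

LENGTH_FIELD_OFFSET = 24  # big-endian int32: file length in 16-bit words


def read_declared_length_words(path):
    """Return the file length (in 16-bit words) stored in the header."""
    with open(path, "rb") as f:
        f.seek(LENGTH_FIELD_OFFSET)
        (words,) = struct.unpack(">i", f.read(4))
    return words


def fix_shx_length(path):
    """Make the header length field match the real file size.

    Hypothetical helper: returns (declared_words, actual_words) and
    rewrites the field in place only when the two values differ.
    """
    actual_words = os.path.getsize(path) // 2
    declared_words = read_declared_length_words(path)
    if declared_words != actual_words:
        with open(path, "r+b") as f:
            f.seek(LENGTH_FIELD_OFFSET)
            f.write(struct.pack(">i", actual_words))
    return declared_words, actual_words
```

If the repaired index file from the report is 277068 bytes long, as the quoted figures imply, fix_shx_length() on it would be expected to return (2697922, 138534). Work on a copy of the file, since the field is overwritten in place.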