As discussed in http://lists.osgeo.org/pipermail/gdal-dev/2010-May/024619.html and http://lists.osgeo.org/pipermail/gdal-dev/2010-July/025192.html, OGR's shapefile driver does not allow the shapefile's codepage to be set or retrieved using the DBF LDID byte or a *.cpg file.
Recent shapelib releases implement this functionality when creating a new shapefile. Issue #882 ( http://trac.osgeo.org/gdal/ticket/882 ) addresses this, but the discussion there largely predates RFCs 5 and 23 ( http://trac.osgeo.org/gdal/wiki/rfc5_unicode and http://trac.osgeo.org/gdal/wiki/rfc23_ogr_unicode ).

I would be interested in exposing this shapelib feature in OGR. However, there are a number of design decisions to make:

1) Should encoding retrieval and setting be an OGR-wide feature, or one specific to the shapefile driver?

2) Should encodings be specified as a string or as an enumeration of well-known encodings? If encoding retrieval and setting occurs only at the shapefile driver level, then a string that mimics shapelib's API might be sensible (if the codepage is set to "LDID/n" and -1 < n < 255, then the LDID byte of the DBF is set to n; otherwise the whole codepage string is written to the .CPG file). Otherwise, common sense would suggest that a standardised enum of encodings might be the way to go.

3) What should the API be? A patch at issue #882 creates two new OGRLayer member functions, GetEncoding() and SetEncoding(), and a GetEncoding() implementation for shapefiles (although, as far as I can see, it does not allow the encoding to be set). This seems to have some potential problems:

a) It exposes these functions for all layers regardless of driver, which may or may not be desirable.

b) It assumes that the encoding can be set by the layers. Using shapelib, the only way to set the encoding is when the DBF is created.

An alternative to the SetEncoding() function might be to use a dataset or layer creation option. However, given that AFAIK OGR doesn't support metadata in the same way GDAL does, a means of retrieving the encoding would need to be paired with this.

Is this the appropriate place to have this discussion? I would be happy to provide a patch implementing this feature in whatever way is deemed most appropriate.
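
To make the string convention from point 2 concrete, here is a minimal Python sketch of the rule as described in this message. The function name split_codepage and the "ldid"/"cpg" tags are hypothetical and are not part of shapelib or OGR. The bounds are copied from the description above rather than checked against shapelib's source, so the behaviour at the boundaries is an assumption.

```python
def split_codepage(codepage):
    """Decide where a codepage string would be stored.

    Returns ("ldid", n) if the string has the form "LDID/n" with
    -1 < n < 255 (the bounds as stated in this message), meaning the
    value would go into the DBF LDID byte. Otherwise returns
    ("cpg", codepage), meaning the whole string would be written to
    the .CPG file. Hypothetical helper for illustration only.
    """
    prefix = "LDID/"
    if codepage.startswith(prefix):
        digits = codepage[len(prefix):]
        # Accept plain ASCII digits only, so int() cannot fail.
        if digits.isascii() and digits.isdigit():
            n = int(digits)
            if -1 < n < 255:
                return ("ldid", n)
    # Anything else is treated as a free-form codepage name.
    return ("cpg", codepage)


if __name__ == "__main__":
    for value in ("LDID/87", "UTF-8", "LDID/abc"):
        print(value, "->", split_codepage(value))
```

A string-based option like this would keep the shapefile driver close to shapelib's own interface, whereas an enum would need a mapping table maintained inside OGR.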

Kind Regards,
Francis Markham