On 03/05/2023 at 14:22, Moises Calzado via gdal-dev wrote:
Hi Even,

Thanks so much for taking a look into that one!

I have one question regarding the CSVT content: we're not really using it, but it's required when using the GEOMETRY_NAME layer creation option, as stated in the CSV driver documentation:

     * GEOMETRY_NAME=name (Starting with GDAL 2.1): Name of geometry
       column. Only used if GEOMETRY=AS_WKT and CREATE_CSVT=YES.
       Defaults to WKT

We really need this flag as we are processing files that contain geometries with different column names, and we always want the same geometry name in the generated output. Are we losing something when using that flag to avoid this problem?

The reason for requiring CREATE_CSVT=YES is that when reading back a .csv without a .csvt, the geometry column must be named WKT, unless you specify the GEOM_POSSIBLE_NAMES open option (which must have been a later addition). That said, it could be reasonable to relax that coupling and allow GEOMETRY_NAME without CREATE_CSVT=YES, with a warning in the doc about the consequence I just mentioned.

In my humble opinion, generating an invalid CSV when using -lco CREATE_CSVT=YES looks like a bug,

Are you speaking about emitting the .prj and .csvt content when writing to /vsistdout/? Yes, I'd tend to agree they should not be emitted in that mode.

as I can't see the reason why strings containing line breaks can't be quoted.

I'm not following you about the issue with line breaks. In my previous message, I showed that I couldn't reproduce any issue: the CSV driver emits fields with double quotes, even when there are line breaks. Can you be more specific about what's wrong? I don't see the connection with GEOMETRY_NAME.
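
As an illustration of the quoting rule discussed here, the following is a minimal sketch that uses Python's standard csv module rather than GDAL. The sample text is copied from the out.csv listing quoted further down in this thread; the helper name read_rows is hypothetical. It shows that counting physical lines overstates the number of records, while a quote-aware reader returns the multi-line value as a single field.

```python
import csv
import io

# Content of out.csv as shown in the ogr2ogr output quoted in this thread.
SAMPLE = (
    'id,descriptio\n'
    '"1",This is my third row\n'
    '"2","this is\n'
    'my string\n'
    '"\n'
    '"3",This is my third row\n'
)


def read_rows(text):
    """Parse CSV text; a quoted field may span several physical lines."""
    # newline="" keeps embedded line breaks untouched for the csv reader.
    return list(csv.reader(io.StringIO(text, newline="")))


rows = read_rows(SAMPLE)
physical_lines = SAMPLE.splitlines()

# Physical lines (what a line-oriented tool counts) versus parsed rows.
print(len(physical_lines), "physical lines")
print(len(rows), "rows including the header")
print(repr(rows[2][1]))
```

A tool that splits on line breaks without honouring quotes will see the second record as three lines, which may explain the impression that the value was split.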

Could you please shed some light on this?

Looking forward to your reply,
Regards.

On Wed, 3 May 2023 at 14:00, Even Rouault (<even.roua...@spatialys.com>) wrote:

    you didn't post to the list

    On 03/05/2023 at 13:49, Moises Calzado wrote:
    Hi Even,

    Thanks so much for taking a look into that one!

    I have one question regarding the CSVT content: we're not really
    using it, but it's required when using the GEOMETRY_NAME layer
    creation option, as stated in the CSV driver documentation:

         * GEOMETRY_NAME=name (Starting with GDAL 2.1): Name of
           geometry column. Only used if GEOMETRY=AS_WKT and
           CREATE_CSVT=YES. Defaults to WKT

    We really need this flag as we are processing files that contain
    geometries with different column names, and we always want the
    same geometry name in the generated output. Are we losing
    something when using that flag to avoid this problem?
    In my humble opinion, generating an invalid CSV when using
    -lco CREATE_CSVT=YES looks like a bug, as I can't see the
    reason why strings containing line breaks can't be quoted.

    Could you please shed some light on this?

    Looking forward to your reply,
    Regards.

    On Sat, 29 Apr 2023 at 15:44, Even Rouault
    (<even.roua...@spatialys.com>) wrote:

        Moises,

        as far as I can see with your example, the CSV driver behaves
        "properly" in reading and writing of field values with line
        breaks.

        It follows the "Fields with embedded line breaks must be
        quoted" rule of
        https://en.wikipedia.org/wiki/Comma-separated_values

        $ ogr2ogr out.csv /vsizip/dataframe.zip

        $ cat out.csv
        id,descriptio
        "1",This is my third row
        "2","this is
        my string
        "
        "3",This is my third row

        $ ogrinfo out.csv -al
        INFO: Open of `out.csv'
              using driver `CSV' successful.

        Layer name: out
        Geometry: None
        Feature Count: 3
        Layer SRS WKT:
        (unknown)
        id: String (0.0)
        descriptio: String (0.0)
        OGRFeature(out):1
          id (String) = 1
          descriptio (String) = This is my third row

        OGRFeature(out):2
          id (String) = 2
          descriptio (String) = this is
        my string


        OGRFeature(out):3
          id (String) = 3
          descriptio (String) = This is my third row

        But your example, which uses /vsistdout/ and -lco
        CREATE_CSVT=YES, is going to result in an invalid CSV file
        that mixes the .csvt and .csv content

        Even

        On 24/04/2023 at 13:34, Moises Calzado via gdal-dev wrote:
        Hello!

        We're trying to convert a Shapefile into a CSV using ogr2ogr
        and we're having some issues while dealing with some columns
        that contain line breaks inside their values. If we have a
        line with the following string, ogr2ogr detects that the
        line break is a new line and it returns two lines.

            "this is my \n value"


        That's the command that we're executing:

            ogr2ogr -f CSV -skipfailures -makevalid /vsistdout/
            /vsizip/shapefile.zip -simplify 0.00001 -dim XY -t_srs
            EPSG:4326 -lco GEOMETRY=AS_WKT -lco GEOMETRY_NAME=geom
            -lco CREATE_CSVT=YES > result.csv


        Is this an expected behaviour, or is there any way to avoid
        this?
        Sharing an example Shapefile so that you can try to
        reproduce that behaviour:
        
        https://drive.google.com/file/d/1gFqfTP02KTFoavJyyO-Ix05YwZB2tS24/view?usp=sharing

        Thanks so much in advance,
        Regards.

-- *Moises Calzado*

        Support Engineer

        +34671264286 | mcalz...@carto.com | CARTO
        <https://www.carto.com/>

        <https://spatial-data-science-conference.com/2023/london/>

        _______________________________________________
        gdal-dev mailing list
        gdal-dev@lists.osgeo.org
        https://lists.osgeo.org/mailman/listinfo/gdal-dev

-- http://www.spatialys.com
        My software is free, but my time generally not.


