Thanks for the discussion. In addition to these compressed files, the repo contains others, such as: src/bin/gpfdist/regress/data/gpfdist2/lineitem.tbl.bz2.
I suggest keeping them. These files are specifically used to test the ability of gpfdist to read compressed files, and the original test is located at: src/bin/gpfdist/regress/data/gpfdist2/regress/input/gpfdist2.source.

Best regards,
Max Yang

On Fri, Nov 14, 2025 at 2:03 PM Dianjin Wang <[email protected]> wrote:

> Hi all,
>
> During the recent review, I found several `*.gz` files in the repo
> that are actually plain-text test data:
>
> ```
> ./gpMgmt/demo/gppkg/sample-sources.tar.gz
> ./contrib/formatter_fixedwidth/data/fixedwidth_small_correct.tbl.gz
> ./src/bin/gpfdist/regress/data/gpfdist2/gz_multi_chunk_2.tbl.gz
> ./src/bin/gpfdist/regress/data/gpfdist2/lineitem.tbl.gz
> ./src/bin/gpfdist/regress/data/gpfdist2/gz_multi_chunk.tbl.gz
> ./src/bin/gpfdist/regress/data/exttab1/nation.tbl.gz
> ```
>
> These are not binary artifacts, but compressed text files used for
> tests. For ASF compliance, should we remove them, or keep them and add
> a note in README.apache.md?
>
> I personally prefer to keep them and provide a short explanation in
> README.apache.md.
>
> Any thoughts?
>
>
> Best,
> Dianjin Wang
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
