This is an automated email from the ASF dual-hosted git repository.
apitrou pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/main by this push:
new 00439d1048 GH-47655: [C++][Parquet][CI] Fix failure to generate seed corpus (#47656)
00439d1048 is described below
commit 00439d104804fca889016d054c03ff6ba9d5560f
Author: Antoine Pitrou <[email protected]>
AuthorDate: Thu Sep 25 22:24:53 2025 +0200
GH-47655: [C++][Parquet][CI] Fix failure to generate seed corpus (#47656)
### Rationale for this change
On OSS-Fuzz, generating the Parquet seed corpus triggered a
multiplication overflow when converting an Arrow seconds timestamp column to a
Parquet milliseconds timestamp column.
### What changes are included in this PR?
Reduce the range of input values used when writing timestamps to the Parquet
seed corpus, so that the seconds-to-milliseconds conversion cannot overflow.
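
As a rough illustration of why the narrower range helps, the sketch below checks whether a seconds value survives multiplication by 1000 in a signed 64-bit integer. `SecondsToMillis` and `kMillisPerSecond` are hypothetical names chosen for this example and are not part of Arrow or Parquet; the actual conversion in the Parquet writer may be implemented differently.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical constant for this sketch: milliseconds per second.
constexpr int64_t kMillisPerSecond = 1000;

// Hypothetical helper (not an Arrow API): convert a seconds timestamp to
// milliseconds, returning std::nullopt if the product does not fit in int64_t.
std::optional<int64_t> SecondsToMillis(int64_t seconds) {
  constexpr int64_t kMax = std::numeric_limits<int64_t>::max();
  constexpr int64_t kMin = std::numeric_limits<int64_t>::min();
  // Compare against the limits divided by the factor, so the check itself
  // never performs an overflowing multiplication.
  if (seconds > kMax / kMillisPerSecond || seconds < kMin / kMillisPerSecond) {
    return std::nullopt;
  }
  return seconds * kMillisPerSecond;
}
```

Under these assumptions, the new bounds of ±9000000000000000 convert to ±9000000000000000000, which fits in an int64, whereas values near the int64 limits do not.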
### Are these changes tested?
Manually.
### Are there any user-facing changes?
No.
* GitHub Issue: #47655
Authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
---
cpp/src/parquet/arrow/generate_fuzz_corpus.cc | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/cpp/src/parquet/arrow/generate_fuzz_corpus.cc b/cpp/src/parquet/arrow/generate_fuzz_corpus.cc
index acee0d0ff9..2be025471c 100644
--- a/cpp/src/parquet/arrow/generate_fuzz_corpus.cc
+++ b/cpp/src/parquet/arrow/generate_fuzz_corpus.cc
@@ -147,8 +147,13 @@ Result<std::shared_ptr<RecordBatch>> ExampleBatch1() {
{name_gen(), gen.Decimal32(decimal32(7, 3), kBatchSize,
kNullProbability)});
// Timestamp
+ // (Parquet doesn't have seconds timestamps so the values are going to be
+ // multiplied by 1000)
+ auto int64_timestamps_array =
+ gen.Int64(kBatchSize, -9000000000000000LL, 9000000000000000LL, kNullProbability);
for (auto unit : TimeUnit::values()) {
- ARROW_ASSIGN_OR_RAISE(auto timestamps, int64_array->View(timestamp(unit, "UTC")));
+ ARROW_ASSIGN_OR_RAISE(auto timestamps,
+ int64_timestamps_array->View(timestamp(unit, "UTC")));
columns.push_back({name_gen(), timestamps});
}
// Time32, time64