The GitHub Actions job "Benchmarks PR Comment" on texera.git/main has succeeded.
Run started by GitHub user ELin2025 (triggered by ELin2025).

Head commit for run:
8001e4c86e8d60971887ad7509b88c42a9fd1ad5 / Yicong Huang <[email protected]>
feat(bench): add Arrow Flight E2E benchmark + Benchmarks CI workflow (#5557)

### What changes were proposed in this PR?

A benchmark-agnostic CI lifecycle that future suites (e.g. JMH
micro-benchmarks for `ArrowUtils`) plug into by appending one line to
`bin/run-benchmarks.sh`, plus the first concrete suite: an end-to-end
Arrow Flight + `PythonWorkflowWorker` benchmark.

**Lifecycle**

| Trigger | Mode | PR comment | Publish to gh-pages |
|---|---|---|---|
| `pull_request` (label-gated, mirrors `amber-integration`'s set) | `pr` — 3 configs × 20 batches (~5 min) | ✓ | — |
| `push` to `main` | `pr` (post-merge fast signal) | — | ✓ |
| `schedule` Sundays 08:00 UTC | `full` — 36 configs × 200 batches (~50-60 min) | — | ✓ |
| `workflow_dispatch` | `full` | — | — |
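
As a reading aid, the table above can be expressed as a small lookup. This is a minimal sketch, not the workflow's actual logic: the names `Plan`, `LIFECYCLE`, and `plan_for` are hypothetical, and the label gate on `pull_request` and the `main`-only filter on `push` are not modeled.

```python
from typing import NamedTuple


class Plan(NamedTuple):
    mode: str         # "pr" (short sweep) or "full" (long sweep)
    pr_comment: bool  # whether a PR comment is posted
    publish: bool     # whether results are published to gh-pages


# Hypothetical lookup keyed by GitHub event name; values copied from the table.
LIFECYCLE = {
    "pull_request": Plan("pr", True, False),
    "push": Plan("pr", False, True),
    "schedule": Plan("full", False, True),
    "workflow_dispatch": Plan("full", False, False),
}


def plan_for(event_name: str) -> Plan:
    """Return the benchmark plan for a trigger, or fail on unknown triggers."""
    try:
        return LIFECYCLE[event_name]
    except KeyError:
        raise ValueError(f"unsupported trigger: {event_name}") from None
```

For example, `plan_for("schedule")` yields the `full` mode with publishing enabled and no PR comment.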

PR runs upload the benchmark results as an artifact and render a markdown
summary table on the workflow page. The `workflow_run`-triggered
`Benchmarks PR Comment` listener lives in a separate file because
`pull_request` runs from forks get a read-only token and no access to
secrets; it downloads the artifact, sanitizes the CSV, and upserts a
single marker-tagged PR comment. The workflow is non-blocking: it is not
part of `required-checks.yml`'s aggregator.
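
The sanitize-then-upsert step can be sketched roughly as follows. This is an illustration under assumptions, not the listener's real code: the marker string, the allow-list regex, and the function names are all hypothetical, and the comments are modeled as plain dicts with `id` and `body` keys (the shape the GitHub REST API returns for issue comments).

```python
import re
from typing import Optional

# Hypothetical hidden marker that identifies the bot's own comment.
MARKER = "<!-- benchmarks-pr-comment -->"

# Hypothetical allow-list: the artifact comes from an untrusted fork run,
# so drop every character that is not plainly numeric or identifier-like.
_UNSAFE = re.compile(r"[^A-Za-z0-9_.+\- ]")


def sanitize_cell(cell: str) -> str:
    """Strip characters that could inject markdown or HTML into the comment."""
    return _UNSAFE.sub("", cell)


def find_marked_comment(comments: list, marker: str = MARKER) -> Optional[int]:
    """Return the id of the first comment whose body contains the marker."""
    for comment in comments:
        if marker in (comment.get("body") or ""):
            return comment["id"]
    return None


def plan_upsert(comments: list, table_md: str, marker: str = MARKER) -> tuple:
    """Decide whether to create a new comment or update the existing one."""
    body = f"{marker}\n{table_md}"
    existing_id = find_marked_comment(comments, marker)
    if existing_id is None:
        return ("create", None, body)
    return ("update", existing_id, body)
```

Because the decision keys on the marker rather than on the comment author or position, repeated pushes to the same PR keep rewriting one comment instead of appending new ones.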

**First benchmark: Arrow Flight E2E (`ArrowFlightActorBench`)**

Spawns a real `PythonWorkflowWorker` actor (real Pekko mailbox + real
`texera_run_python_worker.py` subprocess + real Arrow Flight gRPC) wired
to an identity Python UDF, then times per-batch send→echo round-trip
across a sweep of `batch_size × schema_width × string_len`. Per-config
output: throughput (tuples/s, MB/s), latency p50/p95/p99, total ms. Each
config writes incrementally so a killed sweep still leaves usable
artifacts.
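
To make the per-config metrics concrete, here is a minimal sketch of how they could be derived from per-batch round-trip latencies. It is not taken from `ArrowFlightActorBench`: the function names are hypothetical, percentiles use the nearest-rank method, and total time is assumed to be the sum of sequential per-batch round-trips, which may differ from how the real benchmark measures it.

```python
import math


def percentile(sorted_vals: list, q: int) -> float:
    """Nearest-rank percentile of an ascending list, for integer q in 1..100."""
    if not sorted_vals:
        raise ValueError("no samples")
    rank = max(1, math.ceil(q * len(sorted_vals) / 100))
    return sorted_vals[rank - 1]


def summarize(latencies_ms: list, batch_size: int, bytes_per_batch: int) -> dict:
    """Summarize one config from its per-batch round-trip latencies."""
    ordered = sorted(latencies_ms)
    total_ms = sum(latencies_ms)  # assumes batches are sent one after another
    seconds = total_ms / 1000.0
    batches = len(latencies_ms)
    return {
        "tuples_per_s": batch_size * batches / seconds,
        "mb_per_s": bytes_per_batch * batches / 1e6 / seconds,
        "p50_ms": percentile(ordered, 50),
        "p95_ms": percentile(ordered, 95),
        "p99_ms": percentile(ordered, 99),
        "total_ms": total_ms,
    }
```

With few samples the upper percentiles collapse onto the maximum, which is one reason the `full` mode's 200 batches per config give more meaningful p95/p99 values than the 20 batches of `pr` mode.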

ASF: `benchmark-action/github-action-benchmark` is SHA-pinned to
`52576c92bccf6ac60c8223ec7eb2565637cae9ba` (v1.22.1) per the
apache-infrastructure-actions allow-list.

### Any related issues, documentation, discussions?

Closes #5556

### How was this PR tested?

End-to-end validated on a fork-internal PR —
[Yicong-Huang/texera#17](https://github.com/Yicong-Huang/texera/pull/17)
ran the full `Benchmarks` workflow, the `workflow_run` listener fired,
and a marker-tagged comment landed and upserted across two push cycles
([rendered
example](https://github.com/Yicong-Huang/texera/pull/17#issuecomment-4645589605)).
`workflow_run` only fires for workflow files on the default branch, so
the loop can't be tested from a non-default branch. That is why the dry
run lived on a fork; after merge, the same flow takes effect on
`apache/texera:main` automatically.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.7)

Report URL: https://github.com/apache/texera/actions/runs/27382772758

With regards,
GitHub Actions via GitBox
