The GitHub Actions job "Required Checks" on texera.git/main has succeeded. Run started by GitHub user github-merge-queue[bot] (triggered by github-merge-queue[bot]).
Head commit for run:
439ea72e46b78aec7f71e8889225f2c90942a2c2 / Anish Shivamurthy <[email protected]>
feat(huggingface): add audio and media generation tasks (#5570)

## What changes were proposed in this PR?

Adds the audio and media-generation task families (5 HF pipeline tasks) as new `TaskCodegen`s plugged into the dispatcher established by the text-generation PR:

- audio tasks: `automatic-speech-recognition`, `audio-classification`, `text-to-speech`
- media-generation tasks: `text-to-image`, `text-to-video`

`codegen/AudioTaskCodegen.scala` supplies the per-task payload + parse Python branches for the 3 audio tasks.

`codegen/MediaGenCodegen.scala` supplies the per-task payload + parse Python branches for the 2 media-generation tasks.

`CodegenContext` is extended with `audioInput` + `inputAudioColumn` (`EncodableString`).

`HuggingFaceInferenceOpDesc.scala` gains 2 new `@JsonProperty` fields and registers `AudioTaskCodegen` + `MediaGenCodegen` in the dispatcher.

`PythonCodegenBase.scala` grows to host the shared audio/media infrastructure:

- Audio task-family tuple (`audio_only_tasks`) in `process_table`.
- Per-row audio-byte resolution from upload or column input.
- Raw binary request handling for `automatic-speech-recognition` and `audio-classification`.
- JSON payload handling for `text-to-speech`.
- Provider-specific routing for media generation and audio generation through `_call_provider`, including OpenAI-compatible image/audio endpoints where supported.
- Response parsing for audio/media outputs, including data-URL conversion for generated media URLs.
- Media helper support for converting remote URLs into `data:image/...`, `data:audio/...`, or `data:video/...` URLs where needed.
- Hardened audio input loading to match the image-input path: uploaded audio is accepted as a data URL, remote audio is fetched through the existing HTTPS-only `_fetch_remote_url` helper, and arbitrary worker-local file paths are no longer read.
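
The hardened audio-input rule and the data-URL conversion described above can be illustrated with a minimal Python sketch. This is not the code generated by `PythonCodegenBase.scala`: the function names `to_data_url` and `resolve_audio_bytes` are hypothetical, and `fetch_remote` is an injected callable standing in for the HTTPS-only `_fetch_remote_url` helper, whose real signature is not shown in this message.

```python
import base64
from typing import Callable
from urllib.parse import urlparse


def to_data_url(raw: bytes, mime_type: str) -> str:
    """Encode raw media bytes as a base64 data URL (hypothetical helper)."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def resolve_audio_bytes(audio_input: str, fetch_remote: Callable[[str], bytes]) -> bytes:
    """Return audio bytes from a data URL or an HTTPS URL; refuse anything else."""
    if audio_input.startswith("data:"):
        # Split "data:audio/wav;base64" from the encoded payload.
        header, sep, payload = audio_input.partition(",")
        if not sep or not header.endswith(";base64"):
            raise ValueError("expected a base64-encoded data URL")
        return base64.b64decode(payload)
    if urlparse(audio_input).scheme == "https":
        # Assumption: fetch_remote plays the role of the HTTPS-only fetch helper.
        return fetch_remote(audio_input)
    # Local paths, file:// URLs, and plain http:// URLs are rejected, not read.
    raise ValueError("audio input must be a data URL or an https:// URL")
```

Passing the fetcher in as a parameter keeps the sketch self-contained and makes the "no local file reads" rule easy to exercise in isolation: any input that is neither a data URL nor an `https://` URL raises instead of touching the filesystem.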

User-input strings continue to flow through `pyb"..."` + `EncodableString` so they reach Python as `self.decode_python_template('<base64>')` rather than raw literals. `PythonCodeRawInvalidTextSpec` still passes: 117/117 descriptors `py_compile` cleanly.

## Any related issues, documentation, or discussions?

Tracking issue: Add audio and media-generation task families to HuggingFace operator apache#5288
Closes apache#5288
Stacked on: Add image task family (`ImageTaskCodegen`) to HuggingFace operator / `hf/03-image-tasks`
Parent issue: Add Hugging Face inference operator apache#5041
Closed sibling issue: Add HuggingFaceModelResource REST endpoints for HF operator UI apache#5134

## How was this PR tested?

`sbt "WorkflowOperator/compile; WorkflowOperator/Test/compile"` clean.
`sbt scalafmtCheck` clean.
`sbt "WorkflowOperator/testOnly org.apache.texera.amber.operator.huggingFace.HuggingFaceInferenceOpDescSpec org.apache.texera.amber.util.PythonCodeRawInvalidTextSpec"`: 26 focused tests pass, including HuggingFace audio/media task coverage and the raw Python descriptor scan.
`sbt "WorkflowOperator/testOnly org.apache.texera.amber.util.PythonCodeRawInvalidTextSpec"`: 117/117 descriptors `py_compile` cleanly with the new operator code paths, no marker leaks.

- Added regression coverage that audio remote input routes through `_fetch_remote_url(audio_input)` and no longer uses raw `requests.get(audio_input)` or local file reads.

## Was this PR authored or co-authored using generative AI tooling?

Yes, co-authored with generative AI tooling (Codex).

Report URL: https://github.com/apache/texera/actions/runs/27988100482

With regards,
GitHub Actions via GitBox
