[PR] feat(huggingFace): add image task family via ImageTaskCodegen [texera]

via GitHub Tue, 02 Jun 2026 18:22:28 -0700


PG1204 opened a new pull request, #5320:
URL: https://github.com/apache/texera/pull/5320


   > ⚠️ This PR is stacked on #5278. Until that lands, the diff below also 
includes #5278's operator + codegen + spec changes. The new code in this PR is 
`codegen/ImageTaskCodegen.scala`, the image-related additions to 
`codegen/PythonCodegenBase.scala`, the new image fields on 
`HuggingFaceInferenceOpDesc.scala`, the frontend image-upload component, and 
the image-task tests in `HuggingFaceInferenceOpDescSpec.scala`. Once #5278 
merges, this diff will auto-clean to ~856 lines.
   
   ### What changes were proposed in this PR?
   
   Adds the image task family — 9 HF pipeline tasks — as the second 
`TaskCodegen` plugged into the dispatcher established by #5278:
   
   - image-only: image-classification, object-detection, image-segmentation, 
image-to-text
   - image + prompt: visual-question-answering, document-question-answering, 
zero-shot-image-classification, image-text-to-text, image-to-image
   
   - `codegen/ImageTaskCodegen.scala` supplies the per-task payload + parse 
Python branches for all 9 tasks.
   - `TaskCodegen` trait gains a `tasks: Set[String]` default method (defaults 
to `Set(task)`) so a single codegen can register under multiple task strings; 
`ImageTaskCodegen` is the first multi-task codegen to use it.
   - `CodegenContext` extended with `imageInput` + `inputImageColumn` 
(`EncodableString`).
   - `HuggingFaceInferenceOpDesc.scala` gains 2 new `@JsonProperty` fields and 
registers `ImageTaskCodegen` via the new `tasks` flat-map.
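
The multi-task registration described above can be sketched as follows. The real code is Scala; this is only a Python analogue of the idea, and the `TaskCodegen` dataclass, `extra_tasks` field, and `build_dispatcher` helper are hypothetical names, not the PR's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskCodegen:
    """Hypothetical stand-in for the Scala trait described in the PR."""
    task: str
    extra_tasks: frozenset = frozenset()

    @property
    def tasks(self):
        # Default mirrors the described behaviour: only the primary task,
        # unless the codegen opts in to more task strings.
        return frozenset({self.task}) | self.extra_tasks


def build_dispatcher(codegens):
    # Flat-map every codegen over its task strings into one lookup table.
    table = {}
    for codegen in codegens:
        for task in codegen.tasks:
            if task in table:
                raise ValueError(f"duplicate codegen for task {task!r}")
            table[task] = codegen
    return table


# The nine image tasks listed in the PR description.
IMAGE_TASKS = frozenset({
    "image-classification", "object-detection", "image-segmentation",
    "image-to-text", "visual-question-answering",
    "document-question-answering", "zero-shot-image-classification",
    "image-text-to-text", "image-to-image",
})

text_codegen = TaskCodegen("text-generation")
image_codegen = TaskCodegen("image-classification", IMAGE_TASKS)
dispatcher = build_dispatcher([text_codegen, image_codegen])
```

A single-task codegen needs no changes under this scheme, because its `tasks` set falls back to the one task it already declares.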
   
   `PythonCodegenBase.scala` grows to host the shared image infrastructure:
   - Task-family tuples (`image_only_tasks`, `image_prompt_tasks`, 
`image_tasks`) + `image_headers` in `process_table`.
   - Per-row image-bytes resolution from upload or column with 
`_read_image_input` / `_read_binary_value` / `_compress_image_bytes`.
   - `_post_with_fallback` extended with `raw_binary_headers` + 
`use_raw_binary_body`; adds image-text-to-text chat-completions and 
model-author vision branches.
   - `_call_provider` gains zai-org, Replicate (predictions + polling), Fal-ai, 
and Wavespeed submit+poll branches, plus image embedding for the 
OpenAI-compatible and unknown-provider fallbacks.
   - Image content-type response handling returns `data:image/...;base64,...` 
URLs.
   - Image helpers added: `_read_image_input`, `_compress_image_bytes`, 
`_image_input_as_base64`, `_read_binary_value`, `_looks_like_html`, 
`_html_to_image_bytes`, `_extract_json_arg`, `_url_to_data_url`.
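
As a rough illustration of the task-family tuples and the data-URL response handling listed above, here is a minimal standalone sketch. It is not the generated code from `PythonCodegenBase.scala`; the function names and the HTML sniffing rule are assumptions made for the example.

```python
import base64

# Task families as listed in this PR description.
IMAGE_ONLY_TASKS = (
    "image-classification",
    "object-detection",
    "image-segmentation",
    "image-to-text",
)
IMAGE_PROMPT_TASKS = (
    "visual-question-answering",
    "document-question-answering",
    "zero-shot-image-classification",
    "image-text-to-text",
    "image-to-image",
)
IMAGE_TASKS = IMAGE_ONLY_TASKS + IMAGE_PROMPT_TASKS


def looks_like_html(body: bytes) -> bool:
    # Assumed heuristic: an HTML page starts with a doctype or <html> tag.
    head = body[:256].lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


def image_bytes_to_data_url(body: bytes, content_type: str) -> str:
    # Drop parameters such as "; charset=binary" from the header value.
    mime = content_type.split(";", 1)[0].strip().lower()
    if not mime.startswith("image/"):
        raise ValueError(f"not an image content type: {content_type!r}")
    encoded = base64.b64encode(body).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Returning a data URL keeps the image result a plain string, so it can sit in the same output column type as the text results of other tasks.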
   
   Frontend integration (HF lines only — no agent / dataset noise):
   `HuggingFaceImageUploadComponent` is declared in `app.module.ts`, the 
`huggingface-image-upload` formly type is registered, and the image upload 
component's .ts/.html/.scss files plus the `HuggingFace.png` and 
`sample-image.png` assets are added.
   
   User-input strings continue to flow through `pyb"..."` + `EncodableString` 
so they reach Python as `self.decode_python_template('<base64>')` rather than 
raw literals. `PythonCodeRawInvalidTextSpec` still passes (117/117 
descriptors `py_compile` cleanly).
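
To illustrate why routing user input through base64 keeps it from leaking into the generated source, here is a small sketch. Only the `decode_python_template` name comes from the description above; its body, `encode_user_string`, and the `GeneratedOperator` class are assumptions for the example, not Texera's implementation.

```python
import base64


def encode_user_string(value: str) -> str:
    # The base64 alphabet has no quotes, backslashes, or newlines, so the
    # payload cannot terminate the string literal it is embedded in.
    payload = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return f"self.decode_python_template('{payload}')"


class GeneratedOperator:
    def decode_python_template(self, payload: str) -> str:
        # Assumed runtime counterpart: recover the original text.
        return base64.b64decode(payload).decode("utf-8")


# A hostile-looking prompt that would break a naive '...' literal.
hostile = "'); import os  #\nnext line"
source = (
    "class Op(GeneratedOperator):\n"
    "    def prompt(self):\n"
    f"        return {encode_user_string(hostile)}\n"
)

namespace = {"GeneratedOperator": GeneratedOperator}
exec(compile(source, "<generated>", "exec"), namespace)
result = namespace["Op"]().prompt()
```

Here `result` equals the original `hostile` string, and the generated source compiles even though the input contains a quote, a parenthesis, and a newline.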
   
   ### Any related issues, documentation, or discussions?
   
   - Tracking issue: #5319 
   - Closes: #5319 
   - Stacked on: #5278 (operator + text-generation — issue #5277)
   - Parent issue: #5041
   - Closed sibling issue: #5134 (REST resource — landed via #5124)
   
   ### How was this PR tested?
   
   - `sbt "WorkflowOperator/compile; WorkflowOperator/Test/compile"` clean.
   - `sbt scalafmtCheck` clean.
   - `sbt "WorkflowOperator/testOnly 
org.apache.texera.amber.operator.huggingFace.HuggingFaceInferenceOpDescSpec"` — 
18/18 pass (PR 2's 13 spec tests + 5 new image-task tests: image-only routing, 
VQA / document-QA payload, image-text-to-text chat-completions, image-to-image 
data-URL parse, all-9-tasks dispatcher coverage).
   - `sbt "WorkflowOperator/testOnly 
org.apache.texera.amber.util.PythonCodeRawInvalidTextSpec"` — 117/117 
descriptors `py_compile` cleanly with the new operator code paths, no marker 
leaks.
   - Generated Python verified via `python3 -m py_compile` on sample image-task 
outputs.
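
A check along the lines of the `py_compile` verification above might look like this. It is a sketch only; the `compiles_cleanly` helper and the file name are hypothetical and not part of the PR's test suite.

```python
import os
import py_compile
import tempfile


def compiles_cleanly(source: str) -> bool:
    # Write the generated code into a throwaway directory so the compiled
    # bytecode cache is removed together with the source file.
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "generated_operator.py")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(source)
        try:
            # doraise=True turns syntax errors into PyCompileError
            # instead of printing them to stderr.
            py_compile.compile(path, doraise=True)
        except py_compile.PyCompileError:
            return False
        return True
```

This only proves the output is syntactically valid Python; it does not execute the operator or call any provider.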
   
   ### Was this PR authored or co-authored using generative AI tooling?
   
   Yes, co-authored with Claude Opus 4.7.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
