The GitHub Actions job "Required Checks" on texera.git/gh-readonly-queue/main/pr-5768-0eb7baa6e5bb9b4b27cb6e0d7d03704e0c3a6786 has succeeded.
Run started by GitHub user mengw15 (triggered by mengw15).

Head commit for run:
eba14a67f15837ab4803d656e5948c28357dc9c6 / Xinyuan Lin <[email protected]>
test(workflow-operator): add unit test coverage for SklearnAdvanced trainer descriptors (#5768)

### What changes were proposed in this PR?

Pin behavior of four previously-uncovered sklearn-trainer descriptors in
`common/workflow-operator/operator/machineLearning/sklearnAdvanced/`.
Each is a 30-line override of `SklearnMLOperatorDescriptor` that
contributes just two values: the Python `import` statement and the
operator-info label. Drift in either silently breaks generated Python
code or the UI label. No production-code changes.

| Spec | Source class | Tests |
| --- | --- | --- |
| `SklearnAdvancedKNNClassifierTrainerOpDescSpec` | `SklearnAdvancedKNNClassifierTrainerOpDesc` | 5 |
| `SklearnAdvancedKNNRegressorTrainerOpDescSpec` | `SklearnAdvancedKNNRegressorTrainerOpDesc` | 6 |
| `SklearnAdvancedSVCTrainerOpDescSpec` | `SklearnAdvancedSVCTrainerOpDesc` | 5 |
| `SklearnAdvancedSVRTrainerOpDescSpec` | `SklearnAdvancedSVRTrainerOpDesc` | 6 |

All four spec files follow the `<srcClassName>Spec.scala` one-to-one
convention.

**Behavior pinned (per descriptor)**

| Surface | Contract |
| --- | --- |
| `getImportStatements` | exact canonical Python import (`KNeighborsClassifier` / `KNeighborsRegressor` / `SVC` / `SVR` from the appropriate sklearn module) |
| `getOperatorInfo` | exact canonical label (`"KNN Classifier"` / `"KNN Regressor"` / `"SVM Classifier"` / `"SVM Regressor"`) |
| Stability across two instances | both methods return the same string regardless of which instance is queried |
| Type assignability | extends `SklearnMLOperatorDescriptor[ParamsT]` (compile-time enforced through a typed `val` binding) |
| Type-pattern matching | `case _: SklearnMLOperatorDescriptor[_]` matches a concrete instance |
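
The contracts in the table above can be sketched roughly as follows. This is an illustrative stand-in, not the spec code from this PR: `MLDescriptorStandIn`, `KnnParams`, and `KnnClassifierDesc` are hypothetical names that model only the two pinned methods, and the exact import string is an assumption based on where scikit-learn exposes `KNeighborsClassifier`.

```scala
// Hypothetical stand-ins: the real SklearnMLOperatorDescriptor lives in Texera
// and has more members; only the two pinned methods are modeled here.
object DescriptorContractSketch {
  abstract class MLDescriptorStandIn[P] {
    def getImportStatements: String
    def getOperatorInfo: String
  }

  // Placeholder for the real parameter type (ParamsT in the table above).
  final class KnnParams

  class KnnClassifierDesc extends MLDescriptorStandIn[KnnParams] {
    // Assumed string; the exact text the real descriptor returns is not shown here.
    override def getImportStatements: String =
      "from sklearn.neighbors import KNeighborsClassifier"
    override def getOperatorInfo: String = "KNN Classifier"
  }

  def checkContracts(): Boolean = {
    val a = new KnnClassifierDesc
    val b = new KnnClassifierDesc

    // Type assignability: this binding only compiles if the subclass relation holds.
    val typed: MLDescriptorStandIn[KnnParams] = a

    // Type-pattern matching against the parent type with a wildcard parameter.
    val matched = (a: Any) match {
      case _: MLDescriptorStandIn[_] => true
      case _                         => false
    }

    // Exact-string pins plus stability across two independent instances.
    typed.getImportStatements == "from sklearn.neighbors import KNeighborsClassifier" &&
    typed.getOperatorInfo == "KNN Classifier" &&
    a.getImportStatements == b.getImportStatements &&
    a.getOperatorInfo == b.getOperatorInfo &&
    matched
  }
}
```

Comparing against exact string literals, rather than checking for a substring, is what makes drift in either value fail the test immediately.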

Each Regressor spec additionally cross-checks its strings against its
Classifier sibling (KNN Regressor against KNN Classifier, SVR against SVC).
This catches copy-paste regressions in which one subclass accidentally
returns the other's strings.
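
A minimal sketch of such a sibling cross-check, under stated assumptions: `DescriptorStandIn`, `SvcDesc`, `SvrDesc`, and `CopyPastedSvrDesc` are hypothetical classes written for illustration, and the import strings are assumed rather than copied from the Texera sources. Only the labels come from the table above.

```scala
object SiblingCrossCheckSketch {
  // Hypothetical minimal descriptor shape, not Texera's actual class.
  trait DescriptorStandIn {
    def getImportStatements: String
    def getOperatorInfo: String
  }

  class SvcDesc extends DescriptorStandIn {
    def getImportStatements: String = "from sklearn.svm import SVC" // assumed text
    def getOperatorInfo: String = "SVM Classifier"
  }

  class SvrDesc extends DescriptorStandIn {
    def getImportStatements: String = "from sklearn.svm import SVR" // assumed text
    def getOperatorInfo: String = "SVM Regressor"
  }

  // Simulates the copy-paste bug: an "SVR" descriptor that kept the SVC strings.
  class CopyPastedSvrDesc extends SvcDesc

  // True only when both pinned strings differ from the sibling's.
  def differsFromSibling(d: DescriptorStandIn, sibling: DescriptorStandIn): Boolean =
    d.getImportStatements != sibling.getImportStatements &&
      d.getOperatorInfo != sibling.getOperatorInfo
}
```

With these stand-ins, `differsFromSibling(new SvrDesc, new SvcDesc)` holds, while the copy-pasted variant fails the same check, which is the kind of regression the extra Regressor test is meant to surface.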

### Any related issues, documentation, discussions?

Closes #5765.

### How was this PR tested?

Pure unit-test additions; verified locally with:

- `sbt "WorkflowOperator/testOnly
org.apache.texera.amber.operator.machineLearning.sklearnAdvanced.KNNTrainer.SklearnAdvancedKNNClassifierTrainerOpDescSpec
org.apache.texera.amber.operator.machineLearning.sklearnAdvanced.KNNTrainer.SklearnAdvancedKNNRegressorTrainerOpDescSpec
org.apache.texera.amber.operator.machineLearning.sklearnAdvanced.SVCTrainer.SklearnAdvancedSVCTrainerOpDescSpec
org.apache.texera.amber.operator.machineLearning.sklearnAdvanced.SVRTrainer.SklearnAdvancedSVRTrainerOpDescSpec"`
— 22 tests, all green
- `sbt scalafmtCheckAll` — clean
- CI to confirm

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.7 [1M context])

Report URL: https://github.com/apache/texera/actions/runs/27855995101

With regards,
GitHub Actions via GitBox
