zyratlo opened a new pull request, #5260:
URL: https://github.com/apache/texera/pull/5260

   ### What changes were proposed in this PR?
   Introduces the frontend LLM session class that converts a Jupyter notebook 
into a Texera workflow JSON plus a bidirectional cell-to-operator mapping, 
along with the prompt library it uses. The change adds two files under 
`frontend/src/app/workspace/service/notebook-migration/`, totalling ~700 lines 
(~410 of which are prompt text).
   
   **`migration-llm.ts`** — defines `NotebookMigrationLLM`, an `@Injectable` 
class wrapping a Vercel AI SDK chat session against the LiteLLM proxy already 
exposed on `main` at `/api/chat/completion`.
     - `initialize(modelType, apiKey)` — builds an OpenAI-compatible chat 
client via `createOpenAI({ baseURL: AppSettings.getApiEndpoint() })` and seeds the 
message history with Texera documentation as `system` messages.
     - `verifyConnection()` — does a 10-token `ping` call to validate that the 
API key works against the configured model.
     - `convertNotebookToWorkflow(notebook)` — extracts code cells (each tagged 
with a UUID in `metadata.uuid`), sends `WORKFLOW_PROMPT` + the notebook to get 
a JSON of UDF operators / edges, then sends `MAPPING_PROMPT` to get the 
cell↔operator mapping. Assembles a complete Texera workflow JSON (`PythonUDFV2` 
operators with stub input/output ports, links derived from the LLM's edge list, 
default settings) plus a bidirectional `operator_to_cell` / `cell_to_operator` 
mapping. Returns both as a JSON string.
     - `close()` — clears the message history and the model reference.
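
As a rough illustration of two steps in `convertNotebookToWorkflow`, the sketch below extracts UUID-tagged code cells and inverts an operator-to-cell map into the bidirectional mapping. The type names, the helper names `extractCodeCells` and `buildBidirectionalMapping`, and the assumed shape of the LLM's mapping reply are hypothetical and not taken from the PR; only `metadata.uuid`, `operator_to_cell`, and `cell_to_operator` come from the description above.

```typescript
// Hypothetical shapes for illustration; the PR's actual types may differ.
interface NotebookCell {
  cell_type: string;
  source: string | string[];
  metadata: { uuid?: string };
}

interface CodeCell {
  uuid: string;
  source: string;
}

interface BidirectionalMapping {
  operator_to_cell: Record<string, string[]>;
  cell_to_operator: Record<string, string[]>;
}

// Keep only code cells that carry a UUID. Notebook sources may be stored as
// a list of lines (each already ending in a newline), so join without a separator.
function extractCodeCells(cells: NotebookCell[]): CodeCell[] {
  const result: CodeCell[] = [];
  for (const cell of cells) {
    const uuid = cell.metadata.uuid;
    if (cell.cell_type !== "code" || uuid === undefined) continue;
    const source = Array.isArray(cell.source) ? cell.source.join("") : cell.source;
    result.push({ uuid, source });
  }
  return result;
}

// Invert an operator -> cell UUIDs map so both lookup directions are available.
// Assumes the LLM reply has already been parsed into this shape.
function buildBidirectionalMapping(
  operatorToCell: Record<string, string[]>
): BidirectionalMapping {
  const cellToOperator: Record<string, string[]> = {};
  for (const operatorId of Object.keys(operatorToCell)) {
    const cellUuids = operatorToCell[operatorId] ?? [];
    for (const cellUuid of cellUuids) {
      const operators = cellToOperator[cellUuid] ?? [];
      operators.push(operatorId);
      cellToOperator[cellUuid] = operators;
    }
  }
  return { operator_to_cell: operatorToCell, cell_to_operator: cellToOperator };
}
```

Storing both directions means a lookup from either a cell or an operator does not have to scan the other map.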
   
   **`migration-prompts.ts`** — string constants used by `migration-llm.ts`: 
`TEXERA_OVERVIEW`, `TUPLE_DOCUMENTATION`, `TABLE_DOCUMENTATION`, 
`OPERATOR_DOCUMENTATION`, `UDF_INPUT_PORT_DOCUMENTATION`, 
`EXAMPLE_OF_GOOD_CONVERSION`, `VISUALIZER_DOCUMENTATION`, 
`EXAMPLE_OF_MULTIPLE_UDF_CONVERSION`, `WORKFLOW_PROMPT`, `MAPPING_PROMPT`.
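
To illustrate how these constants could seed the message history as `system` messages, here is a minimal sketch. The `ChatMessage` type, the `seedHistory` helper, and the placeholder strings are assumptions for illustration only; the real constants hold the full documentation text, and the PR may structure this differently.

```typescript
// Hypothetical message shape, following the common chat-completion convention.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Placeholder values; the real constants in migration-prompts.ts hold full prompt text.
const TEXERA_OVERVIEW = "<overview text>";
const TUPLE_DOCUMENTATION = "<tuple documentation text>";

// Turn each documentation string into one system message, preserving order.
function seedHistory(docs: string[]): ChatMessage[] {
  return docs.map((content): ChatMessage => ({ role: "system", content }));
}

const history: ChatMessage[] = seedHistory([TEXERA_OVERVIEW, TUPLE_DOCUMENTATION]);
```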
   
   ### Any related issues, documentation, discussions?
   Closes #5259 
   Parent issue #4301 
   
   
   ### How was this PR tested?
   No unit tests were included for these reasons:
   - A large portion of the changes is prompt text, which cannot be unit-tested, 
only reviewed by reading. However, the prompt text can be revised later to improve the 
performance of the LLM.
   - Testing would require mocking a significant amount of logic that will be 
introduced in later PRs, since the logic in `migration-llm.ts` consists of parsing 
the LLM's response.
   
   However, I am open to writing tests based on review feedback.
   
   
   ### Was this PR authored or co-authored using generative AI tooling?
   Generated-by: Claude Code (Claude Opus 4.7)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
