timsaucer opened a new pull request, #1505:
URL: https://github.com/apache/datafusion-python/pull/1505

   # Which issue does this PR close?
   
   Part of #1394 (implements PR 4 of the plan).
   
   # Rationale for this change
   
   #1394 tracks making datafusion-python legible to AI coding assistants 
without breaking the experience for humans browsing the docs. Earlier PRs 
shipped the repo-root `SKILL.md` (#1497), enriched module docstrings and 
doctests (#1498), added a README section pointing agents at the skill (#1503), 
and rewrote the TPC-H examples in idiomatic DataFrame form (#1504). This PR 
fills in the docs-site layer: a machine-readable entry point for LLM tooling, a 
short human-written page explaining how to wire up an AI assistant, and two 
contributor-facing skills that agents working on this repo can pick up.
   
   It also relocates the pattern demos that #1504 removed from the TPC-H 
queries (CASE filtering, array-based membership, UDF-vs-expression predicates, 
`array_agg` with filter) into the common-operations docs, so those teaching 
examples still live somewhere concrete.
   
   # What changes are included in this PR?
   
   - `docs/source/llms.txt` — an [llmstxt.org](https://llmstxt.org) entry 
point, copied verbatim to the site root via `html_extra_path`. Categorized 
links to the skill, user guide, DataFrame API reference, and TPC-H examples.
   - `docs/source/ai-coding-assistants.rst` — a short human-written page 
mirroring the README section added in #1503. Explains what the skill is, how to 
install it (`npx skills add apache/datafusion-python` or a manual `AGENTS.md` / 
`CLAUDE.md` pointer), and what it covers. Wired into the User Guide toctree.
   - `.ai/skills/write-dataframe-code/SKILL.md` — a contributor skill layered 
on top of the repo-root `SKILL.md`. Adds a TPC-H pattern index (which query 
demonstrates which API), the plan-comparison diagnostic workflow for 
translating SQL to DataFrame form, and the project-specific docstring 
conventions.
   - `.ai/skills/audit-skill-md/SKILL.md` — a contributor skill that 
cross-references `SKILL.md` against the current public Python surface 
(functions module, `DataFrame`, `Expr`, `SessionContext`, package-root 
re-exports) and reports new APIs needing coverage and stale mentions. 
Diff-only; does not auto-edit.
   - `AGENTS.md` (symlinked as `CLAUDE.md`) — lists the three contributor 
skills and documents the plan-comparison diagnostic workflow.
   - `docs/source/user-guide/common-operations/expressions.rst` — adds a 
"Testing membership in a list" section comparing `|`-compound filters, 
`in_list`, and `array_position` / `make_array`, plus a "Conditional 
expressions" section contrasting switched and searched `case`.
   - `docs/source/user-guide/common-operations/udf-and-udfa.rst` — adds a "When 
not to use a UDF" subsection showing the compound-OR predicate that replaces a 
Python-side UDF for disjunctive bucket filters (the Q19 case).
   - `docs/source/user-guide/common-operations/aggregations.rst` — adds a 
"Building per-group arrays" subsection covering `array_agg(filter=..., 
distinct=True)` with `array_length` and `array_element` for the 
single-value-per-group pattern (the Q21 case).
   - `examples/array-operations.py` — a runnable end-to-end walkthrough of the 
membership and `array_agg` patterns. Linked from `examples/README.md`.
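
As a rough illustration of the report-only audit idea behind `audit-skill-md` (this is not the skill's actual implementation), the sketch below compares backticked identifiers in a piece of text against a module's public names and reports both directions of drift. The stdlib `json` module stands in for the audited package, and the `public_names` and `audit` helpers and the sample text are hypothetical.

```python
import json  # stand-in for the audited package (assumption for this sketch)
import re


def public_names(module) -> set[str]:
    """Return the public surface: __all__ if defined, else non-underscore names."""
    declared = getattr(module, "__all__", None)
    if declared is not None:
        return set(declared)
    return {name for name in dir(module) if not name.startswith("_")}


def audit(skill_text: str, module) -> tuple[list[str], list[str]]:
    """Return (missing, stale) without modifying anything.

    missing: public names never mentioned in the text.
    stale:   backticked identifiers that are not part of the public surface.
    """
    mentioned = set(re.findall(r"`([A-Za-z_][A-Za-z0-9_]*)`", skill_text))
    surface = public_names(module)
    return sorted(surface - mentioned), sorted(mentioned - surface)


# Hypothetical skill text: covers two real names and one that does not exist.
skill_text = "Use `dumps` and `loads`; the old `read_json` helper is gone."
missing, stale = audit(skill_text, json)
print("needs coverage:", missing)
print("stale mentions:", stale)
```

A real audit would need to tell API names apart from other backticked words (parameter names, file paths), which this sketch does not attempt.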
   
   Verified with `pre-commit run --all-files` and `sphinx-build -W 
--keep-going` against the full docs tree.
   
   # Are there any user-facing changes?
   
   Yes, docs-only:
   
   - New docs-site page: `ai-coding-assistants.html`, reachable from the User 
Guide sidebar.
   - New docs-site asset: `llms.txt` served at the site root 
(`datafusion.apache.org/python/llms.txt`).
   - New common-operations content (membership tests, conditional expressions, 
UDF guidance, `array_agg` patterns).
   - New example file `examples/array-operations.py`.
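
For readers unfamiliar with the `llms.txt` asset listed above: the llmstxt.org format is a Markdown file with an H1 title, a blockquote summary, and H2 sections containing link lists. The sketch below renders such a file from categorized links. The title, summary, notes, and `example.org` URLs are placeholders chosen for illustration, not the contents of the file added in this PR, and `render_llms_txt` is a hypothetical helper.

```python
def render_llms_txt(title: str, summary: str, sections: dict) -> str:
    """Render an llms.txt-style Markdown document.

    sections maps an H2 heading to a list of (name, url, note) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    # Collapse trailing blank lines to a single final newline.
    return "\n".join(lines).rstrip() + "\n"


# Placeholder data; the categories mirror those named in the PR description.
sections = {
    "Skill": [("SKILL.md", "https://example.org/SKILL.md", "agent-facing guide")],
    "User Guide": [("Basics", "https://example.org/user-guide/basics", "getting started")],
    "API Reference": [("DataFrame", "https://example.org/api/dataframe", "DataFrame API")],
    "Examples": [("TPC-H", "https://example.org/examples/tpch", "idiomatic queries")],
}
text = render_llms_txt("Example Project", "Placeholder summary line.", sections)
print(text)
```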
   
   No public Python API is added, changed, or removed.
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

