Re: Project-wide LLM budget for helping people (was: Re: Complete and unified documentation for new maintainers

Lucas Nussbaum Fri, 06 Jun 2025 03:04:17 -0700

On 05/06/25 at 11:38 +0000, Holger Levsen wrote:
> On Wed, Jun 04, 2025 at 05:04:34PM +0200, Lucas Nussbaum wrote:
> > OpenAI has an Open Source fund. Maybe Debian should apply[1] for a grant
> > so that Debian contributors could get hands-on experience on how this
> > could help their Debian activities?
>  
> or maybe Debian should not.


Maybe. Honestly, I don't know.

Based on my limited experiments on this (everyone, feel free to correct
me), in the specific context of AI-assisted Debian development, the
question is threefold. As a Debian contributor, you would needed:

(A) an "agent", that is the client software running on your machine that
you talk with, and that interacts with your codebase (read files, make
changes to files, run commands, create git commits, etc.). Ideally in
some kind of sandbox and/or with permissions management.
Examples include 'Claude Code' (works in CLI, proprietary, interacts
only with Anthropic models), Cursor(.com) (VS Code fork, proprietary,
interacts with either Claude* (Anthropic) or Gemini* (Google)), Codex
CLI (free software, developed by OpenAI and focused on their models, but
supposed to work with other providers). DebGPT fits here too (but is
less advanced for coding tasks than Claude Code or Cursor).

(B) a way to query models:
- either subscription-based from commercial services such as OpenAI,
  Anthropic, Gemini, ... or brokers like OpenRouter.
- or using "open" models that you can download and run yourself,
  typically with ollama (free software)

(C) good matching between (A) and (B): it helps if the client side (A)
knows how to tune to queries ("prompt engineering") for the specific
provider/model in use (B). Typically, in my tests, trying to use Codex
CLI with ollama fails there (or I could not find a model that produced
reasonable results). (Also the OpenAI API has variants that are not
supported by all models ("tools support").)


As Debian, we should probably care if (A) is free software, and
ultimately package those in Debian. Currently the ecosystem is not quite
there yet, but that doesn't sound like a super-hard problem (there are
attempts at free alternatives and no real blocker). Those could go in
Debian main: it's not very different from packaging an OpenAI API client
(python-openai) or a GitHub client.

For Debian, the real question is about (B). Should we just accept that
the best services are the commercial ones currently, and if they can
help us make Debian better, embrace them? After all, we:
- use commercial CDNs to distribute packages
- rely on cloud providers for hosting various services
- fix issues detect by tools that are not free software
(I don't have an answer for that question, but I think that it's useful
for us to investigate how AI-assisted coding could help us)


Note that I don't buy the 'AI produces crap anyway' argument: after
experimenting with Claude Code and Cursor, I'm fully convinced that they
can help with some of the tasks involved in Debian development (and
their ability to help us will probably only increase over time).

Also, I don't think that the question is "should we trust AI-generated
code" at this point (and if it was asked at this point, the answer
should be a clear "no" in our context). We should continue to require
that contributions are linked to a developer who made reasonable
efforts to ensure that no errors were made, and are able to act
responsibly if that fails somehow.

Lucas

Re: Project-wide LLM budget for helping people (was: Re: Complete and unified documentation for new maintainers

Reply via email to