On Wed, 2025-03-26 at 08:50 +0000, Gianfranco Costamagna wrote:
> > And another bonus question. How do you feel about the general concept of
> > free software going forward? Is it something that is growing / embraced
> > by the world (big corporations, software companies, etc), or is the
> > trend to nerf it and trend towards models such as open-core and exploit
> > it as far as possible?
>
> it is growing, but we can do better. In my opinion we might want to start
> thinking about providing some cloud infrastructure, or even some AI based
> on our code. Bonus point, let AI analyze the code to find potential bugs
> and report upstream/developers if anything is found
At the current state, commercial AIs handle high-complexity tasks better,
and there is always a bit of lag for open-weight models. That means even
if I implement said features in DebGPT, they are likely to perform best
with paid LLM services.

* Do you think Debian can use some budget to buy commercial AI quota to
  help developers at scale (given that we have figured out the exact use
  case for it)?

If Debian isn't happy with commercial services, we still have the
fallback of hosting GPU servers, running open-weight LLMs, and exposing
the API to Debian members. However, based on years of communication with
leader@ and the infra team, IIRC, hosting GPU servers as official Debian
infra is not easy at all. If the GPUs are from Nvidia, they clearly need
lots of packages from the non-free and non-free-firmware sections to
function properly. If the GPUs are from AMD, they can work with packages
in main, but the AMD ROCm effort has not yet reached the pytorch-rocm
milestone. In contrast, I made pytorch-cuda (contrib) usable quite a
while ago.

Anyway, I'm hosting an LLM inference library packaging project for GSoC
2025, and we will push forward the effort of making vLLM/SGLang usable
on Debian out of the box. I think that is one of the most promising use
cases of the packages maintained by the Debian Deep Learning Team.

* If we were to host GPU servers to self-host LLMs, what is your opinion
  on the infra issue?

> > And another because I'm supposed to be driving to work and it's more fun
> > to type questions than to sit in traffic... how do you feel about how AI
> > is going? Massive corporations are scraping and processing vast amounts
> > of work in the commons that gets regurgitated as new and original code.
> > Where do you stand on this, both ethically and in the context of the
> > future of projects like Debian?
>
> We shouldn't loose the AI embracing opportunity.
> We can have some sort of open source AI, and we should pursue it,
> otherwise the risk will be of Debian being kicked out from the market
> even more.

I feel the original question is aimed more at the Debian Deep Learning
Team than at DPL candidates. Anyway, we have several lines of effort
towards the "AI embracing opportunity":

1. DebGPT, for exploring to what extent LLMs can help our work.

2. ROCm (open source) as a major competitor to Nvidia's proprietary
   CUDA. The milestone for this line of effort is pytorch-rocm.

3. Essential deep learning frameworks. pytorch (CPU) and pytorch-cuda
   (Nvidia GPU) are already in shape; pytorch-rocm (AMD GPU) is still
   work in progress. Google's libraries like tensorflow and jax are
   extremely difficult to package due to blockers around the Bazel
   build system.

4. LLM inference libraries like vLLM/SGLang. They depend on
   pytorch-cuda (contrib), which is already in the archive. I'm hosting
   this part as a GSoC project this year.

Let me shamelessly advertise the team here to attract potential
contributors: https://wiki.debian.org/Teams/DebianAI

> Also salsa might want to add some AI checks for patches sanity, or
> upload sanity to help developers not do usual mistakes (I'm staring at
> myself for adding patches and forgetting to update series file)

I have some similar features planned for DebGPT. However, to properly
leverage AIs at scale, one major obstacle you will encounter is public
acceptance of this new technology. The use of LLMs is a bit
controversial in the community due to the unreliability of their
outputs. But I'd say that, as an "expert" in the area, it is fully
possible to "patiently" learn the upsides and downsides of LLMs and
learn to make proper use of them. Such a "capability of making proper
use of LLMs without getting into trouble from LLM hallucination" is, in
my opinion, one of the most important survival skills, like driving, in
the next century, but so far it may exist only among domain experts and
some active users.
It is also a fact that a random user may immediately form a negative
opinion once an LLM makes a mistake. If I were a DPL candidate, I would
simply treat efforts on "embracing AI" in Debian as specific
user/developer/team experiments at the current stage, instead of
escalating them to the project scale. This needs time to mature, and
escalating it too early is destined to fail due to the lack of a
promising use case and of community recognition of how useful it is.
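P.S. To make the "forgetting to update the series file" example concrete:
that particular mistake does not even need an LLM to catch. A minimal
sketch of such a sanity check (the function name and layout are
illustrative, not an existing salsa or DebGPT feature):

```python
#!/usr/bin/env python3
# Hypothetical sanity check: report patch files that exist under
# debian/patches/ but are not listed in the quilt "series" file.
from pathlib import Path


def unlisted_patches(patches_dir: str = "debian/patches") -> list[str]:
    """Return patch files present on disk but missing from 'series'."""
    d = Path(patches_dir)
    series = d / "series"
    if not series.exists():
        return []
    # A series line may carry options after the patch name; keep field 0.
    listed = {
        line.split()[0]
        for line in series.read_text().splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }
    on_disk = {
        p.name
        for p in d.iterdir()
        if p.is_file() and p.name != "series" and not p.name.startswith(".")
    }
    return sorted(on_disk - listed)


if __name__ == "__main__":
    for name in unlisted_patches():
        print(f"warning: {name} is not listed in debian/patches/series")
```

Something this cheap could run as a CI job long before any AI-based
review is layered on top of it.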
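P.P.S. On the self-hosting side: vLLM serves an OpenAI-compatible HTTP
API, so if Debian ever hosted an open-weight model, members could query
it with plain standard-library code. A hedged sketch (the host and model
name below are placeholders, not real Debian infrastructure):

```python
# Sketch of querying a self-hosted open-weight LLM through an
# OpenAI-compatible endpoint such as the one vLLM provides.
import json
import urllib.request

BASE_URL = "http://llm.example.debian.net/v1"  # placeholder host
MODEL = "open-weight-model"                    # placeholder model name


def build_request(prompt: str) -> urllib.request.Request:
    """Build a chat-completion POST request for an OpenAI-compatible server."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Against a live server one would then do:
# resp = urllib.request.urlopen(build_request("Review this diff: ..."))
# print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Because the API shape is the same as commercial services, switching
between paid LLM quota and self-hosted GPUs would mostly be a matter of
changing the base URL and model name.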

