* Russell Adams <[email protected]> [2026-03-16 16:57]:
> On Sun, Mar 15, 2026 at 09:35:06PM +0300, Jean Louis wrote:
> > * Dr. Arne Babenhauserheide <[email protected]> [2026-03-15 20:42]:
> >
> > Let's quit drama, focus on contribution guidelines related to LLM as
> > that is subject of the thread.
>
> My opinion is simple: zero LLM usage tolerated. LLM created and
> assisted patches should be rejected.
It's a modern form of blood and soil ideology: purity of thought,
rejecting any tool that isn't "native born" to the human mind. Any
outside assistance is seen as contamination. It's a shibboleth, a way
to say "I am a real programmer" by burning the books of the future.
The ritual requires suffering, so any shortcut is heresy. It creates a
new inquisition: now we must prove our thoughts are "organically
grown", forcing developers to hide their process or face
excommunication.

Here are five ways LLMs can assist in coding without generating or
patching code directly:

1. Code Review Preparation

   An LLM can analyze your code for potential edge cases, style
   inconsistencies, or common pitfalls before you submit it for human
   review. This helps clean up minor issues so reviewers can focus on
   architecture and logic. It would be difficult to even say that the
   user used an LLM to produce the code. The planning mode of the tool
   `opencode' (https://opencode.ai/) is of that type: it can tell the
   programmer what to do, instead of generating code.

2. Writing Documentation and Comments

   LLMs can draft docstrings, inline comments, or high-level README
   sections based on your code. This saves time and encourages better
   documentation practices without changing the code itself.
   Docstrings may not be "code"...

3. Planning and Pseudocode

   You can describe a problem in plain English, and the LLM can help
   outline steps, data structures, or algorithms in pseudocode. This
   clarifies your approach before you write actual code.

4. Explaining Legacy or Complex Code

   When you inherit unfamiliar code, an LLM can provide a plain-English
   summary of what it does, how its parts interact, and potential side
   effects, helping you understand it without modifying it.

5. Generating Test Cases

   LLMs can suggest edge cases or example inputs/outputs for unit
   tests based on your function's purpose. You still write the tests,
   but the brainstorming is assisted. The tests themselves may not
   even be submitted.
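To make point 5 concrete, here is a minimal sketch. The function and
the case table are hypothetical examples, not from any real project:
an LLM only brainstorms inputs worth covering, while the tests
themselves are still written and verified by the human.

```python
# Hypothetical function you wrote yourself, without LLM assistance.
def normalize_whitespace(s: str) -> str:
    """Collapse runs of whitespace into single spaces and strip ends."""
    return " ".join(s.split())

# Edge cases an LLM brainstorming session might surface; you decide
# which ones matter, and you write and run the assertions yourself.
cases = {
    "": "",                  # empty string
    "   ": "",               # whitespace only
    "a  b\tc\n": "a b c",    # mixed whitespace runs
    "already clean": "already clean",
}
for raw, expected in cases.items():
    assert normalize_whitespace(raw) == expected
```

Nothing LLM-generated reaches the patch here; the model's contribution
is a list of inputs, which is indistinguishable from a colleague's
suggestion during review.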
These uses keep the human firmly in the driver's seat while using the
LLM as a thinking partner or documentation aid.

Russell, I don't think you have enough experience with LLMs to know
their benefits. It is surprising to see such a hard and strict stance
as "zero LLM tolerance." You can't know this, Russell. Most likely,
you already use LLM-generated code in your Emacs without realizing it.
One big LOL!

> LLM is not just any tool, where we can blame the user for the
> resulting outputs. LLMs are everything free software has stood
> against for decades.

So the argument is: proprietary LLMs bad, therefore all LLM assistance
bad. No nuance. No distinction between running llama.cpp locally and
feeding code to OpenAI. Just a clean, simple ban that requires zero
thought to enforce. A perfect thought-stopper. "LLMs are everything
free software has stood against" -- say it enough times and you never
have to ask which LLMs, how they're used, or who controls them. Just a
comfortable, absolute rejection.

Quite the opposite. Free software is about sharing -- sharing code,
sharing knowledge, sharing how-to. LLMs are distributed sharing
machines. They take the collective wisdom of thousands and make it
available to anyone. That's not against free software; that's its
logical endpoint. Free software's genius was always the gift economy:
I share my code, you share yours, we all improve together. LLMs are
just that principle scaled -- thousands of developers, millions of
decisions, distilled into a tool that shares it all back with you.
> LLMs are proprietary, locked behind paywall services.

Free software ran on proprietary hardware for decades. We didn't ban
coding because Intel kept microcode secret. We built tools that ran on
it anyway. Same here: open models running on closed GPUs is just the
next iteration of that same struggle. Unix was proprietary once. So
was C. So was the internet itself. The pattern is always: corporate
labs invent, then free software democratizes. We're watching that
happen in real time with LLMs.

Why don't you go to Swiss AI and tell them how "LLMs are proprietary"?
Reference: https://www.swiss-ai.org/apertus

> They are not open source.

There are free software drivers and proprietary ones for GPUs,
depending on the GPU. Then again, to get inference one doesn't even
need to run it on a GPU; a CPU can produce a good number of tokens.
This means fully free software can run inference of fully free
language models. The majority of software running language models is
free software already.

A language model is not software. It is a distribution of statistical
probabilities. An LLM is like a recording of a master pianist
improvising. It contains the patterns, the style, the "essence" of
that performance. You can't point to a single note and say "that's the
software," but you can feed that recording into a player piano (the
inference code) to create new music that sounds like the master. You
keep calling it "open source software" -- but an LLM isn't software.
It's a statistical model, a matrix of numbers. You're judging a fish
by how well it climbs a tree.

> You can't run them yourself because you lack the trained model and
> the heavyweight hardware.

What? That is the point of my frustration, seeing such statements.
What year is this statement from? 2022? Because in 2026, millions of
people run LLMs locally every single day. The statement isn't just
wrong -- it's aggressively outdated.
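Running a model locally also means talking to it with plain free
software. As a minimal sketch, assuming a llama-server instance is
listening locally (llama-server exposes an OpenAI-compatible
/v1/chat/completions route; the URL below is illustrative, not a real
service), a client needs nothing beyond the Python standard library:

```python
import json
import urllib.request

# Illustrative address of a local llama-server; adjust host/port to
# wherever your own instance is listening.
URL = "http://127.0.0.1:8080/v1/chat/completions"

def build_request(prompt: str) -> dict:
    """Build the JSON body for a local chat completion request."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,  # low temperature for review-style tasks
        "max_tokens": 256,
    }

body = build_request("List potential edge cases in this function.")
req = urllib.request.Request(
    URL,
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Uncomment when a server is actually listening:
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)["choices"][0]["message"]["content"]
```

No API key, no vendor account, no data leaving the machine: the whole
stack, from the inference server to the client, is free software.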
Let's see what is running locally over here:

48672 /usr/local/bin/llama-server -ngl 0 --device none --rerank -m /mnt/nvme0n1/LLM/quantized/bge-reranker-v2-m3-q8_0.gguf -v -c 8192 -ub 1024 --log-timestamps --host 192.168.1.68 --port 7676

49138 /usr/local/bin/llama-server --jinja -fa on -c 131072 -ngl 64 -v --log-timestamps --host 192.168.1.68 -ub 1024 --threads 16 --embeddings --reasoning-format deepseek-legacy --reasoning-budget 0 -m /mnt/nvme0n1/LLM/quantized/Qwen3.5-9B-UD-Q4_K_XL.gguf --mmproj /mnt/nvme0n1/LLM/quantized/Qwen3.5-9B-mmproj-F16.gguf

49772 /usr/local/bin/llama-server -ngl 999 -v -c 8192 -ub 1024 --embedding --log-timestamps --host 192.168.1.68 --port 9999 -m /mnt/nvme0n1/LLM/nomic-ai/quantized/nomic-embed-text-v1.5-Q8_0.gguf

79549 llama-server --jinja -fa on -c 131072 -ngl 0 --device none -v --log-timestamps --host 192.168.1.68 --port 9991 -ub 1024 --threads 16 --embeddings --reasoning-format deepseek-legacy --reasoning-budget 0 -m /mnt/nvme0n1/LLM/quantized/Qwen3.5-2B-UD-Q8_K_XL.gguf --mmproj /mnt/nvme0n1/LLM/quantized/Qwen3.5-2B-mmproj-F16.gguf

79820 llama-server --jinja -fa on -c 131072 -ngl 0 --device none -v --log-timestamps --host 192.168.1.68 --port 9992 -ub 1024 --threads 16 --embeddings --reasoning-format deepseek-legacy --reasoning-budget 0 -m /mnt/nvme0n1/LLM/quantized/Qwen3.5-0.8B-UD-Q8_K_XL.gguf --mmproj /mnt/nvme0n1/LLM/quantized/Qwen3.5-0.8B-mmproj-F16.gguf

79978 llama-server --jinja -fa on -c 131072 -ngl 0 --device none -v --log-timestamps --host 192.168.1.68 --port 9993 -ub 1024 --threads 16 --embeddings --reasoning-format deepseek-legacy --reasoning-budget 0 -m /mnt/nvme0n1/LLM/quantized/Qwen3.5-4B-UD-Q5_K_XL.gguf --mmproj /mnt/nvme0n1/LLM/quantized/Qwen3.5-4B-mmproj-F16.gguf

I have six (6) language models running simultaneously on a single
computer in my home. Not in the cloud. Not behind a paywall. Not
through someone else's API.

Process 48672: A reranker model, filtering and sorting information
locally.
Process 49138: A 9-billion parameter vision-language model (Qwen 3.5)
with 64 GPU layers, handling both text and images. Context window of
131,072 tokens -- long enough to digest entire books.

Process 49772: A dedicated embedding model creating vector
representations of text, running on port 9999. There is a matching
image model in the same embedding space as well.

Processes 79549, 79820, 79978: Three additional Qwen models of varying
sizes (2B, 0.8B, 4B), each serving different purposes, each running on
CPU with zero GPU layers.

> They do vendor lock in with their APIs.

Friend, you're confusing two completely different things. Vendor
lock-in is what OpenAI and Anthropic do: proprietary APIs, black
boxes, your data training their models. I'm talking about Hugging
Face: open weights, downloadable models, run-anywhere formats. One is
a cage. The other is a library. Please go to https://huggingface.co
and do the research.

> They can change at any time. You have no rights while using
> them. Their output is not your own, and they could claim partial
> ownership.

Change what? The model sitting on my hard drive? Go ahead -- try to
change it remotely. I'll wait. Oh right, you can't, because it's mine
now. Downloaded, quantized, running on my machine with no internet
connection. Good luck "changing" that.

"Their output is not your own" -- whose output? The model I'm running
locally, on my hardware, with my prompts, generating text that never
touches their servers? That output is mine. Every byte of it. There's
no "they" to claim anything.

> On the side of software ethics LLMs fail every litmus test. They are
> trained without permission on the free work of others. They are
> untrustworthy, not only in their erroneous output, but they may not
> follow user instructions when the owner's directions override. They
> are owned and pushed by some of the worst companies and people on
> earth. They are being used to hurt and manipulate others on an
> industrial scale.
- "Trained without permission on free work of others" -- You were
  trained on the free work of others too: your parents, your teachers,
  every book you ever read, every conversation you ever overheard.
  That's called learning. Humans don't pay royalties for every idea
  they absorb, and neither should machines.

- "Untrustworthy in erroneous output" -- All human-written code has
  bugs. All human experts make mistakes. The standard isn't
  perfection; it's utility. Local LLMs are tools, not oracles, and
  responsible users verify outputs just as they verify human
  contributions.

- "May not follow user instructions when owner's directions override"
  -- This describes proprietary APIs with hidden system prompts. It
  does not describe locally-run open models, where you control the
  instructions, the system prompt, and every line of inference code.
  There are so-called abliterated and uncensored models; it is the
  user's free choice to use what they want. Your statement is an
  overgeneralization.

- "Owned and pushed by some of the worst companies and people on
  earth" -- The same companies build the hardware you're using, the
  operating system you're running, and the compilers you trust. The
  tool is not the crime. Free software communities now produce and
  distribute their own models, independent of corporate agendas.

- "Being used to hurt and manipulate others on an industrial scale" --
  So are social media algorithms. So are targeted advertising systems.
  So are search engines. The existence of misuse does not invalidate
  the tool; it calls for ownership and control, so that individuals
  can use these tools for good rather than ceding them entirely to bad
  actors.

> "It wurked fer me!" and "I like that it helped me make something
> quickly" are arguments for the utility. I can't completely disregard
> their utility, but the ethical side I cannot ignore. It's easier to
> use commercial vendor's lock-in software too! Yet for ethical
> reasons here we are using and building FREE software.
> > LLMs have no place in free software.

You're posing this as utility vs. ethics, as if we have to choose.
But that's a false choice. The ethical path isn't rejecting LLMs; it's
owning them. Running local models, on your hardware, with open
weights, accountable to no corporation. That's not choosing utility
over ethics. That's choosing both.

Perfect ethical purity is easy when you're not the one being harmed.
But the people who benefit most from LLMs -- students, hobbyists,
non-native speakers, developers in developing countries -- don't have
the luxury of rejecting powerful tools. The ethical choice is to give
them free alternatives, not take away the only ones they have.

You've correctly identified what's wrong with (some) corporate
language models. But your solution -- rejecting the entire category --
throws out the baby with the bathwater. The fight isn't against LLMs.
It's for free LLMs.

The evidence is overwhelming, and it's all on Hugging Face right now.
Let's walk through the specific examples that prove LLMs have a place
in free software:

- Microsoft Phi -- Released under the MIT license. The Phi-4 model
  with 14B parameters is explicitly "ready for commercial and
  non-commercial use" and carries an MIT License Agreement. You can
  download it from Hugging Face, run it locally, modify it, and
  integrate it into your own projects without asking Microsoft for
  permission.

- OLMo (Allen Institute for AI) -- This is not just open weights.
  It's everything open: the base model weights, the training code, the
  fine-tuning code, the architecture documentation, and even the
  pre-print papers detailing the data and training process. All
  released under Apache 2.0 by a non-profit research institute
  committed to open science.

- Qwen (Alibaba) -- One of the most widely used open LLM families
  globally, with over 40 million downloads.
  The Qwen organization on Hugging Face hosts models ranging from 0.5B
  to 235B parameters, including dense and MoE architectures,
  multimodal versions, and quantized variants. Most are released under
  Apache 2.0. You can run them locally with llama.cpp, Ollama, or LM
  Studio.

- Apertus (Swiss ETH/EPFL) -- Developed by public Swiss universities,
  trained on the Alps supercomputer. This is a fully open model with a
  crucial distinction: when collecting training data, the developers
  explicitly observed Swiss and EU legal regulations related to data
  privacy, copyright, and transparency rules. The training data is
  disclosed and reproducible, the base model is freely available on
  Hugging Face under Apache 2.0, and the project even provides email
  addresses for privacy and copyright requests. This is what ethical
  language model development looks like in practice.

- Granite (IBM) -- A family of encoder-based embedding models for
  retrieval tasks, spanning dense and sparse architectures. IBM
  publicly releases all Granite models under the Apache 2.0 license,
  allowing both research and commercial use with full transparency
  into their training data. The Granite Vision models are designed for
  enterprise document understanding and achieve top ranks on industry
  benchmarks.

So when someone claims that LLMs have "no place in free software,"
what are they telling these projects?

- That Microsoft's MIT-licensed Phi, which you can download and run
  without ever phoning home to Microsoft, doesn't count?

- That Allen Institute's OLMo, with everything -- weights, code, data,
  logs -- released under Apache 2.0, is somehow not free software?

- That IBM's Apache-2.0 Granite models, used in enterprise retrieval
  pipelines, are ethically equivalent to OpenAI's black-box API?

The category is not monolithic. There is a clear spectrum from fully
closed (GPT-4) to fully open (OLMo, Apertus).
And the fully open end of that spectrum is growing every day, built by
universities, nonprofits, and even corporations who understand that
open wins.

The question isn't whether LLMs belong in free software. The question
is whether free software will rise to meet them -- or cede the field
entirely.

--
Jean Louis

---
via emacs-tangents mailing list
(https://lists.gnu.org/mailman/listinfo/emacs-tangents)
