Paul Eggert wrote:
> > They are also not defined on AIX 7.3 (cfarm119.cfarm.net).
> 
> Thanks for fixing all that. I was lazily relying on Google searches and 
> Gemini, not the real thing.

Welcome to the new technologies :)

IMO there are two considerations when using a prior knowledge summarization
engine (a.k.a. LLM):

  * Distinguishing facts from hallucinations.

    ChatGPT's window has a footnote: 
    "ChatGPT can make mistakes. Check important info."
    Gemini's window has a footnote:
    "Gemini can make mistakes, including about people, so double-check it."

    In my experience, answers are reasonably reliable if they are general,
    that is, if you can assume that more than 100 programmers have asked the
    same or a similar question on the web in the past.
    The more specific the question gets, the more hallucinations the answer
    will contain.

    For example, a question like
      In C, is there a way to determine whether the currently executing thread
      is inside a signal handler?
    or
      Does C++ have a typeof operator, like the C language?
    is generic enough that the answer is likely correct.

    Whereas a question like
      Which operating systems have the C function `strchrnul` predefined?
    is so specific that it's nearly guaranteed that the LLM will produce
    incorrect information. And indeed it does:
      - ChatGPT guesses wrong about the BSDs and macOS, but forgets about
        Solaris 11.
      - Gemini knows that it was added in FreeBSD 10 and NetBSD 8 (impressive!)
        and even macOS 15.4, but forgets about OpenBSD, Solaris 11, and Cygwin.
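    (As an aside: on platforms that lack it, `strchrnul` is trivial to
    emulate on top of standard functions. A minimal sketch — gnulib's
    actual replacement module is the proper solution for real code:)

    ```c
    #include <string.h>

    /* Fallback for the GNU extension strchrnul: return a pointer to the
       first occurrence of C in S, or to the terminating NUL byte if C
       does not occur in S.  */
    static char *
    my_strchrnul (const char *s, int c)
    {
      char *p = strchr (s, c);
      return p != NULL ? p : (char *) s + strlen (s);
    }
    ```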

  * How to make use of generated/hallucinated code.

    There was a thread "copyrightability of "AI" output code" in g.p.d. in
    Nov-Dec 2024, and Eli Zaretskii summarized it in
    <https://lists.gnu.org/mailman/private/gnu-prog-discuss/2024q4/019735.html>
    (mail from 2024-12-03).
    The most important recommendation in there is:

     "We believe the conclusion is that if a GNU program is developed
      with "AI" assistance, the developers should keep records of all the
      prompts they fed to the "AI" to generate significant blocks of
      code, and indicate the code produced by each prompt.  Current IDEs
      seem to lack the ability to help the developers in this matter, so
      until they do, this will have to be done manually.  For example,
      the prompt could be saved with the program source file as
      specially-formatted comments, with separate comments indicating the
      beginning and end of the generated code.  Sophisticated editors
      such as Emacs make this a simple matter of copy/yank between
      buffers, but it is still manual work."
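    To make the "specially-formatted comments" idea concrete, here is one
    possible convention — the marker names are hypothetical, as the thread
    did not standardize any:

    ```c
    #include <string.h>

    /* AI-PROMPT-BEGIN
       "In C, write a function that trims trailing whitespace from a
        string in place."
       AI-PROMPT-END */
    /* AI-GENERATED-BEGIN */
    void
    trim_trailing (char *s)
    {
      size_t n = strlen (s);
      while (n > 0
             && (s[n - 1] == ' ' || s[n - 1] == '\t' || s[n - 1] == '\n'))
        s[--n] = '\0';
    }
    /* AI-GENERATED-END */
    ```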

    Personally, I prefer to use the generated code only as an inspiration,
    so that all potential bugs are really mine :).

Bruno



