Christopher Dimech writes:
>> > Then I can see extensive tests being done, flowing faster on screen
>> > than I can verify it, but I can see when there is success or failure.
>> 
>> So you never read the tests, you often just look at the outputs?
>> 
>> As a consequence: you don’t know what it actually tests (in the code)?
>> 
>> If so, you cannot know whether you actually have exhaustive tests. All
>> you know is that you have many functions that do something.
>
> He considers his tests exhaustive - sufficient to confirm code presence and 
> modifiability without needing functional details. If tests pass, he trusts 
> the design; if issues arise, he investigates. "Exhaustive" means the designer 
> confidently believes no experiential edge cases were missed.

> I find his approach satisfactory.

I was quite side-tracked by him saying that he generates test code.

If I understand him correctly now (that it doesn’t actually matter
what that code is, or whether it is code at all), that makes it a bit
easier for me to grasp.

What he’s describing sounds like the manual testing needed in areas that
are hard to test well -- like many graphical user interfaces.

And the test code just serves to reduce the likelihood that the LLM
changes the behavior of the program: "when I saw this problem, this line
showed a failure, now the problem is gone and this line is a success, so
don’t change whatever this tested".
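
To make that concrete, here is a rough sketch of what I imagine such
"pinning" test code boils down to. This is my own guess, not his actual
setup: format_price, PINNED and failures are hypothetical names, and the
pinned values stand in for outputs someone looked at once and accepted.

```python
def format_price(cents):
    """Hypothetical function under test: stands in for code the LLM may edit."""
    return f"{cents // 100}.{cents % 100:02d} EUR"


# Hypothetical outputs that were checked by eye once and then frozen.
PINNED = {
    0: "0.00 EUR",
    5: "0.05 EUR",
    1999: "19.99 EUR",
}


def failures(func, pinned):
    """Return (input, expected, actual) for every pinned output that changed."""
    return [
        (arg, expected, func(arg))
        for arg, expected in pinned.items()
        if func(arg) != expected
    ]


# One line per pinned case, so success or failure is visible at a glance.
changed = {arg for arg, _, _ in failures(format_price, PINNED)}
for arg in PINNED:
    print("FAIL" if arg in changed else "ok", arg)
```

Such a check only says "the output is still what I accepted earlier". It
says nothing about whether that output was right, or about inputs nobody
pinned.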

I don’t find his approach satisfactory, but I think by now I understand
a bit better why understanding was so hard: we’re speaking completely
different languages.

When he writes "extensive testing" he’s speaking from a QA perspective.

When he writes "test code", that’s not actually about the code, but
about limiting the changes the LLM is likely to make to things that
probably have no impact on the outputs he already checked.

I don’t find his approach satisfactory, but I do see that it matches how
many managers look at code: hoping that the engineers won’t just
bullshit stuff together, because they want to keep their jobs in the
long run, and requiring them to jump through hoops because the manager
has had the experience that when the engineers report {insert buzzwords},
the rate of bugs is lower.

Which means that something about those hoops works, no matter what
it is.

Best wishes,
Arne
-- 
To be apolitical
means to be political
without noticing it.
https://www.draketo.de

---
via emacs-tangents mailing list 
(https://lists.gnu.org/mailman/listinfo/emacs-tangents)