[Logica-l] Re: Demonstração por IA?

Joao Marcos Fri, 11 Jul 2025 05:58:50 -0700

Para contrabalançar a experiência positiva relatada na mensagem
anterior por um expert, acrescento dois causos pessoais:

(1) Neste semestre tive uma aluna que decidiu delegar para uma IA a
solução de exercícios relacionados a demonstrar ou refutar conjecturas
proposicionais, usando o sistema ProofWeb
(https://carol.dimap.ufrn.br/proofweb/) como interface para teorias em
Coq.  Para cada conjectura recebida, a aluna apresentou tanto uma
"demonstração" (usando códigos que ela não é capaz de explicar) quanto
uma "refutação" --- sim, ela demonstrou *e* refutou cada um dos
exercícios!  Olhando para o código apresentado pela discente, demo-nos
conta de que a maior parte das respostas apresentadas envolviam duas
táticas Coq: "abort" (isto é, "desisto") e "admit" (isto é, "assuma
como axioma").  Obviamente, as táticas usadas resultam
"bem-sucedidas"...  Como tem sido amplamente verificado, usuários
despreparados costumam tirar bem menos proveito da IA do que
gostariam.

(2) Outro aluno da mesma turma decidiu consultar um oráculo artificial
para "saber um pouco mais sobre as ferramentas computacionais que
estávamos usando".  Ou seja, ao invés de simplesmente nos perguntar a
respeito, em sala de aula ou no fórum, assistir aos vídeos
instrucionais que fizemos, e consultar os artigos que publicamos sobre
o assunto (por exemplo:
https://dimap.ufrn.br/~jmarcos/papers/JM/15-TM-Try.pdf), e ao invés de
simplesmente consultar o material do curso ou fazer uma consulta no
Google, o aluno optou por consultar uma IA qualquer, que respondeu com
fantasias e delírios, a partir das quais o aluno tirou suas próprias
conclusões e enviou seus "questionamentos" para o fórum da disciplina.
O pior: quando apontei ao aluno as incorreções flagrantes nas
alucinações que ele copiou-e-colou no fórum, ele se chateou comigo, e
insistiu que as "fontes" dele estavam corretas!

Que tipos de experiências recentes têm tido os colegas com o (mau) uso
destas ferramentas?

[]s, Joao Marcos

On Fri, Jul 11, 2025 at 9:55 AM Joao Marcos <[email protected]> wrote:
>
> A mensagem abaixo pode ser de interesse para membros desta lista.
>
> []s, JM
>
>
> ---------- Forwarded message ---------
> From: Lawrence Paulson
> Subject: [isabelle] DeepSeek experiment
>
> Hello all, you may be interested in a little experiment I tried using the
> DeepSeek LLM yesterday. I asked whether it could translate a theorem header
> from n-dimensional real vectors to abstract Euclidean spaces. My query:
>
> > The following theorem header fixes variable "a" to type "real^'n". Changing
> this type to "'a::euclidean_space" requires modifying the header, replacing $
> and χ by their analogues for Euclidean spaces.
> >
> > proposition
> >   fixes a :: "real^'n"
> >   assumes "m ≠ n" and ab_ne: "cbox a b ≠ {}" and an: "0 ≤ a$n"
> >   shows measurable_shear_interval: "(λx. χ i. if i = m then x$m + x$n else x
> $i) ` (cbox a b) ∈ lmeasurable"
> >    and measure_shear_interval: "measure lebesgue ((λx. χ i. if i = m then x
> $m + x$n else x$i) ` cbox a b)
> >                = measure lebesgue (cbox a b)"
>
> On its first attempt it produced something that seemed to be almost right but
> clunky and referred to something non-standard. Pushing it a little further, I
> got the following, which is far as I can tell is correct. (Isabelle accepts
> it.)
>
> > proposition
> >   fixes a :: "'a::euclidean_space"
> >   assumes "m ≠ n"      and "m ∈ Basis" and "n ∈ Basis"
> >     and ab_ne: "cbox a b ≠ {}"      and an: "0 ≤ a ∙ n"
> >   shows measurable_shear_interval:          "(λx. ∑i∈Basis. (if i = m then
> (x ∙ i) + (x ∙ n)                            else (x ∙ i)) *⇩R i) ` (cbox a b)
> ∈ lmeasurable"
> >    and measure_shear_interval:          "measure lebesgue ((λx. ∑i∈Basis.
> (if i = m then (x ∙ i) + (x ∙ n)                                          else
> (x ∙ i)) *⇩R i) ` cbox a b)
> >          = measure lebesgue (cbox a b)”
>
>
> Everything was accompanied by quantities of plausible sounding narrative, e.g.
> "this version is more abstract and elegant”. A sample:
>
> > This version is more type-safe because:
> >     • No need to worry about index bounds
> >     • The basis vectors are first-class entities
> >     • The notation more clearly reflects the mathematical intent
> > Note that in this formulation:
> >     • Basis is the canonical basis of the Euclidean space
> >     • ∙ is the inner product (giving the component in a particular basis
> direction)
> >     • *⇩R is the scalar multiplication
> >     • The sum reconstructs the vector from its components
> > Would you like me to explain any part of this formulation in more detail? Or
> would you like assistance with proving this version of the theorem?
>
> My impression is that it was grabbing material from a variety of places, in
> particular the AFP entry Ordinary_Differential_Equations. But the word
> "elegant" doesn't appear there and in fact appears in the entire distribution
> + AFP only 42 times.
>
> I recall doing the precise task above in the past and needing the better part
> of an hour to get it right. I didn't try to port the proof (which was long and
> complicated) and I would be surprised if it could get that right.
>
> I'm not sure how much more can be accomplished with the free version of
> DeepSeek. I have also seen Claude Sonnet 4 swallow entire Isabelle codebases
> and summarise them. These tools may already be useful for tasks such as
> identifying duplicate material.
>
> Larry

-- 
https://sites.google.com/site/sequiturquodlibet/

-- 
LOGICA-L
Lista acadêmica brasileira dos profissionais e estudantes da área de Lógica 
<[email protected]>
--- 
Você está recebendo esta mensagem porque se inscreveu no grupo "LOGICA-L" dos 
Grupos do Google.
Para cancelar inscrição nesse grupo e parar de receber e-mails dele, envie um 
e-mail para [email protected].
Para ver esta conversa, acesse 
https://groups.google.com/a/dimap.ufrn.br/d/msgid/logica-l/CAO6j_Li1s5EWgtjce8B%3D3ggJn4NxJ0a7S1md3YTz2KYOkFxXVA%40mail.gmail.com.

[Logica-l] Re: Demonstração por IA?

Responder a