A good summary. At the level of the summary, none of it is specific to R.
However, asking a chatbot "I want a program to analyze my data" will not
produce anything useful. You have to phrase the question so the chatbot knows
which language to use, what the data look like, and what you want to do. So I
might write: I have a data frame with three variables where V1 is time. I want
an R program to calculate the mean of V3 for each unique value in V1. The
answer will be R specific. The chatbot may guess at missing elements, including
generating fake data to act as an example. On the question "does the generated
code work", the quality of the end result depends on the quality of the
question asked. The chatbot works best with tightly focused, simple questions.
Complicated problems require complicated prompts, and chatbots often ignore
parts of long prompts. Developing good prompts is also a learned skill.
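To make the example prompt concrete, here is a minimal sketch of the kind of
answer it might produce. The data are made up, and the names df and
means_by_time are my own assumptions, not chatbot output. It uses base R only.

```r
# Made-up example data; df and its values are assumptions for illustration.
df <- data.frame(
  V1 = c(1, 1, 2, 2, 3, 3),               # time
  V2 = c("a", "b", "a", "b", "a", "b"),   # an unused second variable
  V3 = c(10, 20, 30, 50, 5, 15)           # the variable to average
)

# Mean of V3 for each unique value of V1.
means_by_time <- aggregate(V3 ~ V1, data = df, FUN = mean)
print(means_by_time)
# With the data above, the V3 column is 15, 40, 10 for V1 = 1, 2, 3.
```

Checking the result by hand on a small data set like this is the validation
step: if the chatbot's code disagrees with the hand calculation, the prompt or
the code needs work.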
I would add
- is it worthwhile?
It takes time to write code. It also takes time to develop a good prompt. In
both cases there will be a code validation and debugging step. What is the most
efficient use of your time? What gives the best final product?
I might also ask: will you learn from it? The chatbot gave you an answer that
you could not have written yourself when you asked. Given the chatbot's answer,
do you understand it, and could you now produce that answer yourself?
The chatbot wrote: if(!require(tidyverse)){install.packages('tidyverse')}
and explained the solution. I understand it, and it helps to avoid
installing packages that are already installed. You can take it further and
say: I have a program that requires 30 packages. In R, what is the shortest
code that will check whether they are already installed and install the ones
that I do not have?
I get several answers, one of which is:
pkgs <- c("dplyr", "ggplot2", "lme4")
to_install <- pkgs[!pkgs %in% rownames(installed.packages())]
if (length(to_install)) install.packages(to_install)
I can then copy the code and ask the chatbot to explain the code, and progress
from there.
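As one way to progress from there, here is a commented sketch of the same idea
that only reports what is missing, so it can be run before anything is
installed. The function name missing_pkgs is my own invention (it is not from
the chatbot or from base R), and the package list is just an example. Note that
requireNamespace() loads the namespace of any package it finds, without
attaching it.

```r
# Hypothetical helper (the name is an assumption): return the packages in
# 'pkgs' that cannot be loaded from any library on this machine.
missing_pkgs <- function(pkgs) {
  # requireNamespace() returns TRUE or FALSE and does not attach the package.
  found <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!found]
}

pkgs <- c("stats", "utils", "dplyr", "ggplot2", "lme4")
to_install <- missing_pkgs(pkgs)
print(to_install)

# Uncomment to actually install whatever is missing:
# if (length(to_install)) install.packages(to_install)
```

Keeping the check separate from the install step makes it easier to see what
the code will do before it changes anything on your system.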
If the chatbot is generating black boxes where "a miracle occurs", they will in
the end cause you more problems than they solve. If all the code is generated
by a chatbot and you do not understand it, what will you do when the boss stops
by and asks for a modification or enhancement, or a customer stops by and
states that your code generates errors?
-----Original Message-----
From: R-help <[email protected]> On Behalf Of Richard O'Keefe
Sent: Tuesday, December 9, 2025 8:02 PM
To: Gregg Powell <[email protected]>
Cc: R help project <[email protected]>; Hans W <[email protected]>;
Robert Knight <[email protected]>
Subject: Re: [R] Chatbot -generated R Code
So to summarise, there are three key issues so far:
- does the generated code work
- does it infringe on someone else’s intellectual property rights
- do the AI’s terms of service permit you to use it
What are some other things people who want to use an AI to generate code should
consider? Other than the application domain, is any of this specific to R?
On Wed, 10 Dec 2025 at 8:30 AM, Gregg Powell via R-help < [email protected]>
wrote:
> I did not say blindly trust LLMs nor did I recommend their use. That
> is up to each individual.
> Those who choose not to use LLMs will not be competitive against their
> peers who do - that is my claim.
>
> As for me, I use LLMs. I have no axe to grind against using LLMs or
> those who use them. Honestly, at 58 - I did not think I'd see AI in my
> lifetime. I see LLMs as a tool. A very useful tool. I would not want
> to be a younger person having to compete against AI. I am glad to be
> in a position where AI and its impact on society will have little or
> no financial impact on me personally. I commiserate with those not in
> a similar circumstance. I see many taking a supercilious attitude
> toward those who use AI (as demonstrated in your emails, for instance)
> - particularly among coders. Ironically, coders are among the first
> and hardest hit by AI, along with graphic designers, writers,
> researchers, data scientists... there is a long and growing list.
>
> The genie is out of the bottle. Governments are run by people either
> too greedy or power hungry to curtail the technology. It is the start
> of a new arms race. Some claim it will help society, others claim it will
> destroy it.
> As most things usually go - the truth most probably lies somewhere in
> the middle. Only time will tell.
>
> All the best!
> Gregg
>
>
>
>
>
>
> On Tuesday, December 9th, 2025 at 10:06 AM, Robert Knight <
> [email protected]> wrote:
>
> > Responding with LLM output to a question about risk and the legality
> > of
> something is not comforting. Naked Capitalism reported that
> hallucinations are increasing, not decreasing, in language models.
> > I shall trust my own brain over an LLM output. Are you really
> > suggesting
> that people trust an LLM counterview of the meaning of contracts they sign?
> >
>
> > This kind of thinking, and that guy who did not understand the central
> law of large numbers, both experts in the field, is why people
> like me have to work in other occupations and argue in the public
> sphere until someone like Kennedy can get into place.
> >
>
> > An LLM tells me not to believe my lying eyes and cognitive
> > understanding of
> the contract I am about to sign.... Trust the LLM you say.
> >
>
> > My word.
> >
>
> > On Tuesday, December 9, 2025, Gregg Powell
> > <[email protected]>
> wrote:
> >
>
> > > Let's let Claude respond back itself:
> > >
>
> > > r/
> > > Gregg
> > > On Tuesday, December 9th, 2025 at 8:39 AM, Robert Knight <
> [email protected]> wrote:
> > >
>
> > > > It seems like malpractice to recommend Claude to someone using R
> > > > or
> big data since what they would use it for is *explicitly* against the
> terms of service. Machine learning predates the microchip.
> > > > See below.
> > > >
>
> > > > Also, quality control will make a comeback. Expert systems
> > > > cannot be
> replaced with something akin to Bayes probability charts indefinitely.
> > > >
>
> > > >
>
> > > > > you may not use the service to “develop any products or
> > > > > services
> that compete with our Services, including to develop or train any
> artificial intelligence or machine learning algorithms or models.”
> > > >
>
> > > > Claude’s terms further state
> > > >
>
> > > > > “Equitable relief. You agree that (a) no adequate remedy
> > > > > exists at
> law if you breach Section 3 (Use of Our Services); (b) it would be
> difficult to determine the damages resulting from such breach, and any
> such breach would cause irreparable harm; and (c) a grant of
> injunctive relief provides the best remedy for any such breach. You
> waive any opposition to such injunctive relief, as well as any demand
> that we prove actual damage or post a bond or other security in connection
> with such injunctive relief.”
> > > >
>
> > > > Machine learning includes linear regression. Other Machine
> > > > Learning
> algorithms include Logistic Regression, decision trees, random
> forests, support vector machines, K-Nearest Neighbors, & Bayes
> Algorithms. It seems to me, that as of 14 October 2024, no one seeking
> to handle any data science can legitimately use Claude
> > > >
>
> > > >
>
> > > >
>
> > > > On Tuesday, December 9, 2025, Gregg Powell via R-help <
> [email protected]> wrote:
> > > >
>
> > > > > Humans who don't adapt to LLMs, or whatever form AI takes as
> > > > > it
> evolves, will be left in the dust.
> > > > >
>
> > > > > People may just now be waking up to the fact that we're three
> years into a tremendous revolution, one of the greatest in human history.
> It follows the Bronze Age, the Iron Age, the Industrial Revolution,
> the computer revolution, the Information Age, and now... AI.
> > > > >
>
> > > > > AGI is approaching. How quickly? Who can say. Whether AI can
> > > > > ever
> be truly sentient remains a mystery. But once it can adequately
> replicate sentience, some will ask: what's the difference?
> > > > >
>
> > > > > As to the question of who judges what's acceptable from a
> > > > > coding
> standpoint: capitalism will. Corporations will. And the question of
> whether this is the future of coding is already behind us. It is
> coding now, and it will only continue to improve in capability.
> > > > >
>
> > > > > Try Replit, Cursor, Claude Code. Humans are incapable of
> > > > > keeping
> up. AI still struggles with some of the most complex tasks, and it
> does poorly at orchestrating across large repositories, but it's
> improving rapidly.
> > > > > Just my observations.
> > > > >
>
> > > > >
>
> > > > > Those who look down their noses at all this will be left behind.
> > > > >
>
> > > > > All the best!
> > > > > Gregg
> > > > >
>
> > > > >
>
> > > > >
>
> > > > >
>
> > > > > On Tuesday, December 9th, 2025 at 6:32 AM, Hans W <
> [email protected]> wrote:
> > > > >
>
> > > > > >
> > > > >
>
> > > > > >
> > > > >
>
> > > > > > SORRY if I missed such a discussion somewhere on R-HELP
> > > > > >
> > > > >
>
> > > > > > For many years I wanted to write an R function that finds
> > > > > > the
> closest pair of
> > > > > > points among a, maybe huge, set of points on the
> > > > > > 2-dimensional
> plane. I never
> > > > > > did, perhaps considering the possible complexity of this task.
> > > > > >
> > > > >
>
> > > > > > Now I found a book, among others describing the "sweeping
> algorithm", perfectly
> > > > > > suited for the problem. And as a test, I questioned chatbots
> like DeepSeek and
> > > > > > ChatGPT about such a function - and mentioned the sweeping
> algorithm.
> > > > > >
> > > > >
>
> > > > > > DeepSeek, for instance, came immediately up with a complete,
> efficient solution
> > > > > > and test cases that I checked with brute force. I can see
> > > > > > that
> it utilized the
> > > > > > sweeping algorithm, documented the code, and set up a help file.
> I made some
> > > > > > changes, improved the code a bit, but still it is code
> > > > > > generated
> by a clever
> > > > > > chatbot, whatever I do.
> > > > > >
> > > > >
>
> > > > > > Now I ask myself: Is this a correct and lawful way to write
> > > > > > code
> in the future?
> > > > > > I am not even sure DeepSeek may not have used an
> > > > > > implementation
> of the sweeping
> > > > > > algorithm that is under ACM license and would not be allowed
> > > > > > on
> CRAN.
> > > > > >
> > > > >
>
> > > > > > I wonder how one handles this matter? Will this be the
> > > > > > future of
> code writing
> > > > > > (for R and other languages)? I would really appreciate to
> > > > > > hear
> your opinion or
> > > > > > a hint to a discussion about it.
> > > > > >
> > > > >
>
> > > > > > Hans Werner
> > > > > >
> > > > >
>
> > > > > > > ______________________________________________
> > > > > > > [email protected] mailing list -- To UNSUBSCRIBE and more, see
> > > > > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > > > > PLEASE do read the posting guide
> > > > > > > https://www.r-project.org/posting-guide.html
> > > > > > > and provide commented, minimal, self-contained, reproducible code.
>
[[alternative HTML version deleted]]
______________________________________________
[email protected] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.r-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.