On Wed, 24 Apr 2024, Hadley Wickham wrote:




That is not true at all - presence in a header does not constitute
declaration of something as the R API. There are cases where internal
functions are in the headers for historical or other reasons since the
headers are used both for the internal implementation and packages.
That's why this is in R-exts under "The R API: entry points for C code":

If I understand your point correctly, does this mean that
Rf_allocVector() is not part of the "official" R API? It does not
appear to be documented in the "The R API: entry points for C code"
section.


It does, obviously:
https://cran.r-project.org/doc/manuals/R-exts.html#Allocating-storage-1


I'm just trying to understand the precise definition of the official API
here. So it's any function mentioned in R-exts, regardless of which
section it appears in?

Does this sentence imply that all functions starting with alloc* are part
of the official API?


Again, I can only quote the R-exts (few lines below the previous "The R
API" quote):


We can classify the entry points as
API
Entry points which are documented in this manual and declared in an
installed header file. These can be used in distributed packages and will
only be changed after deprecation.


It says "in this manual" - I don't see any restriction anywhere to a
particular section of the manual, so I really don't see why you would think
that allocation is not part of the API.


Because you mentioned that section explicitly earlier in the thread. This
obviously seems clear to you, but it's not at all clear to me and I suspect
many of the wider community. It's frustrating because we are trying
our best to do what y'all want us to do, but it feels like we keep getting
the rug pulled out from under us with very little notice, and then have to
spend a large amount of time figuring out workarounds.

Please try to keep this discussion non-adversarial.

That is at least
feasible for my team since we have multiple talented folks who are paid
full-time to work on R, but it's a huge struggle for most people who are
generally maintaining packages in their spare time.

As you well know, almost all R-core members are also trying to
maintain and improve R in their spare time. Good for folks to keep in
mind before demanding R-core do X, Y, or Z for you.

For the purposes of this discussion, could you please clarify what
"documented in the manual" means? For example, this line mentions
allocXxx functions: "There
are quite a few allocXxx functions defined in Rinternals.h—you may want to
explore them.". Does that imply that they are documented and free to use?

Where we are now in terms of what package authors can use to write R
extensions has evolved organically over many years. The current state
is certainly not ideal:

    There are entry points in installed headers that might be
    available;

    but to find out if they are in fact available requires reading
    prose text in the header files and in WRE.

Trying to fine-tune wording in WRE, or add a lot of additional entries
is not really a good or realistic way forward: WRE is both
documentation and tutorial and more legalistic language/more complete
coverage would make it less readable and still not guarantee
completeness or clarity.

We would be better off (in my view, not necessarily shared by others
in R-core) if we could get to a point where:

    all entry points listed in installed header files can be used in
    packages, at least with some caveats;

    the caveats are expressed in a standard way that is searchable,
    e.g. with a standardized comment syntax at the header file or
    individual declaration level.

In principle this is achievable, but getting there from where we are
now is a lot of work. There are some 500 entry points in the R shared
library that are in the installed headers but not mentioned in WRE.
These would need to be reviewed and adjusted. My guess is about a
third are fine and intended to be API-stable, another third are not
used in packages and don't need to be in public headers. The remainder
are things that may be used in current packages but really should not
be, for example because they expose internal data in ways that can
cause segfaults or they make it difficult to implement performance
improvements in the base engine. Sorting through these and working
with package authors to find alternate, safer options takes a lot of
time (see 'spare time' above) and energy (some package authors are
easier to work with than others). Several of us have taken cracks at
moving this forward from time to time, but it rarely gets to the top
of anyone's priority list.

And in general, I'd urge R Core to make an explicit list of functions that
you consider to be part of the exported API, and then grandfather in
packages that used those functions prior to learning that we weren't
supposed to.

Making a list and hoping that it will remain up to date is not
realistic.  The only way that would work reliably is if the list could
be programmatically generated, for example by parsing installed
headers for declarations and caveats as above. Which would be possible
with changes like the ones listed above.
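
As a rough illustration of what such programmatic generation might look
like, here is a minimal Python sketch. It assumes a purely hypothetical
annotation comment, `/* API-STATUS: <status> */`, placed on the line
before a declaration; no such convention exists in the R headers today.
The prototype matching is deliberately naive (one-line declarations
only), and `hypothetical_internal_helper` and
`hypothetical_unreviewed_entry` are made-up names. The statuses in the
sample are illustrative and say nothing about the real status of any
entry point.

```python
import re

# Hypothetical annotation syntax (NOT an existing R convention):
#   /* API-STATUS: <status> */   on the line before a declaration.
ANNOTATION = re.compile(r"/\*\s*API-STATUS:\s*(\w+)\s*\*/")

# Deliberately naive: only matches one-line prototypes such as
#   SEXP Rf_allocVector(SEXPTYPE, R_xlen_t);
# A real tool would need a proper C parser (macros, typedefs, etc.).
PROTOTYPE = re.compile(
    r"^\s*[A-Za-z_][\w\s\*]*?\b([A-Za-z_]\w*)\s*\([^;{]*\)\s*;"
)


def list_entry_points(header_text):
    """Map each declared function name to its annotated status.

    Declarations with no annotation on the preceding line are
    reported as "unreviewed".
    """
    entries = {}
    pending = None  # status taken from an annotation on the previous line
    for line in header_text.splitlines():
        note = ANNOTATION.search(line)
        if note:
            pending = note.group(1)
            continue
        proto = PROTOTYPE.match(line)
        if proto:
            entries[proto.group(1)] = pending or "unreviewed"
        pending = None  # an annotation only applies to the very next line
    return entries


# Illustrative header fragment; the annotations and the two
# "hypothetical_*" declarations are invented for this sketch.
SAMPLE_HEADER = """\
/* API-STATUS: stable */
SEXP Rf_allocVector(SEXPTYPE, R_xlen_t);
/* API-STATUS: internal */
SEXP hypothetical_internal_helper(SEXP);
int hypothetical_unreviewed_entry(void);
"""

if __name__ == "__main__":
    for name, status in sorted(list_entry_points(SAMPLE_HEADER).items()):
        print(f"{name}\t{status}")
```

Run over the installed headers, a script along these lines would give a
searchable list that stays in sync with the headers themselves, since
the headers would be the single source of both declarations and caveats.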

Best,

luke


Hadley




--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu/
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
