Re: [Rd] some questions about R internal SEXP types

Tomas Kalibera Tue, 08 Sep 2020 03:23:42 -0700


On 9/8/20 11:47 AM, Dan Kortschak wrote:

Thanks, Tomas.

This is unfortunate. Calling between Go and C is not cheap; the gc
implementation of the Go compiler (as opposed to gccgo) uses different
calling conventions from C and there are checks to ensure that Go
allocated memory pointers do not leak into C code. For this reason I
wanted to avoid these if at all possible (I cannot for allocations
since I don't want to keep tracking changes in how R implements its GC
and allocation).

However, if SEXP type behaviour of the standard types, and how
attributes are handled are not highly mobile, I think that what I'm
doing will be OK - at worst the Go code will panic and result in an R
error. The necessary interface to R for allocations is only eight
functions[1].

I am not sure if I understand correctly, but if you were accessing
directly the memory of SEXPs from Go implementation instead of calling
through exported access functions documented in WRE, that would be a
really bad idea. Of course fine for research and experimentation, but
the internal structure can and does change at any time, otherwise we
would not be able to develop nor maintain R. Such direct access
bypassing WRE would likely be a clear case for rejection in CRAN for
this interface and any packages using it, and I hope in other package
repositories as well.

However, I believe the overhead of calling the C-level access functions
R exports should be minimal compared to other overheads. You can't hope,
anyway, for being able to efficiently call tiny functions frequently
between Go and R. This can only work for bigger functions, anyway, and
then the Go-C overhead should not be important.
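
As a rough illustration of the point about call granularity, the
following pure-Go sketch models the language boundary as an interface
and counts how often it is crossed. Accessor, fakeVector, sumPerElement
and sumBulk are hypothetical names made up for this example; no cgo or
R API is involved, and the counter only stands in for the real Go-C
call cost.

```go
package main

import "fmt"

// Accessor is a hypothetical stand-in for C-level access functions;
// every method call models one crossing of the Go-C boundary.
type Accessor interface {
	Len() int
	At(i int) float64         // one crossing per element
	CopyTo(dst []float64) int // one crossing for the whole vector
}

// fakeVector simulates a numeric vector and counts crossings.
type fakeVector struct {
	data      []float64
	crossings int
}

func (v *fakeVector) Len() int                 { v.crossings++; return len(v.data) }
func (v *fakeVector) At(i int) float64         { v.crossings++; return v.data[i] }
func (v *fakeVector) CopyTo(dst []float64) int { v.crossings++; return copy(dst, v.data) }

// sumPerElement crosses the boundary once for the length and once
// per element.
func sumPerElement(a Accessor) float64 {
	n := a.Len()
	var s float64
	for i := 0; i < n; i++ {
		s += a.At(i)
	}
	return s
}

// sumBulk crosses the boundary twice in total, then works on Go memory.
func sumBulk(a Accessor) float64 {
	buf := make([]float64, a.Len())
	a.CopyTo(buf)
	var s float64
	for _, x := range buf {
		s += x
	}
	return s
}

func main() {
	v := &fakeVector{data: []float64{1, 2, 3, 4}}
	fmt.Println(sumPerElement(v), v.crossings)
	v.crossings = 0
	fmt.Println(sumBulk(v), v.crossings)
}
```

In this model the per-element version crosses the boundary once per
element plus once for the length, while the bulk version crosses it
twice regardless of the vector length, which is the sense in which
larger units of work make the per-call overhead unimportant.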

Note that there is a lot in WRE that's beyond what I want rgo to be
able to do (calling in to R from Go for example). In fact, there's just
a lot in WRE (it's almost 3 times the length of the Go language spec
and memory model reference combined). The issues around weak references
and external pointers are not something that I want to deal with;
working with that kind of object is not idiomatic for Go (in fact
without using C.malloc, R external pointers from Go would be forbidden
by the Go runtime) and I would not expect that they are likely to be
used by people writing extensions for R in Go.

Sure, I think it is perfectly fine to cover only a subset, if that is
already useful to write some extensions in Go. Maintenance would be
easiest if Go programs didn't call back into the R runtime at all, so
fewer calls the better for maintenance.


Best
Tomas


Dan

[1] https://github.com/rgonomic/rgo/blob/2ce7717c85516bbfb94d0b5c7ef1d9749dd1f817/sexp/r_internal.go#L86-L118

On Tue, 2020-09-08 at 11:07 +0200, Tomas Kalibera wrote:

The general principle is that R packages are only allowed to use what
is documented in the R help (? command) and in Writing R Extensions.
The former covers what is allowed from R code in extensions, the latter
mostly what is allowed from C code in extensions (with some references
to Fortran).

If you are implementing a Go interface for writing R packages, such Go
interface should thus only use what is in the R help and in Writing R
Extensions. Otherwise, packages would not be able to use such
interface.

What is described in R Internals is for understanding the internal
structure of the R implementation itself, so for development of R
itself; it could indeed also help with debugging of R itself and in
some cases with debugging or performance analysis of extensions. R
Internals can help in giving an intuition, but when people are
implementing R itself, they also need to check the code. R Internals
does not describe any interface for external code. If it states any
constraints about, say, pairlists, take it as an intuition for what has
been intended and probably holds or held at some level of abstraction,
but you need to check the source code for the details anyway (e.g., at
some very low level CAR and CDR can be any SEXP or R_NilValue, locally
in some functions even C NULL). Internally, some C code uses C NULL
SEXPs, but it is rare and local, and again, only the interface
described in Writing R Extensions is for external use.

WRE speaks about "R NULL", "R NULL object" or "C NULL" in some cases to
avoid confusion, e.g. for values typed as "void *". SEXPs that packages
obtain using the interface in WRE should not be C NULL, only R NULL
(R_NilValue). External pointers can become C NULL and this is
documented in WRE 5.13.

Best
Tomas

On 9/6/20 3:44 AM, Dan Kortschak via R-devel wrote:

Hello,

I am writing an R/Go interoperability tool[1] that works similarly to
Rcpp; the tool takes packages written in Go and performs the necessary
Go type analysis to wrap the Go code with C and R shims that allow the
Go code to then be called from R. The system is largely complete (with
the exception of having a clean approach to handling generalised
attributes in the easy case[2] - the less hand holding case does handle
these). Testing of some of the code is unfortunately lacking because of
the difficulties of testing across environments.

To make the system flexible I have provided an (intentionally
incomplete) Go API into the R internals which allows reasonably Go
type-safe interaction with SEXP values (Go does not have unions, so
this is uglier than it might be otherwise and unions are faked with Go
interface values). For efficiency reasons I've avoided using R internal
calls where possible (accessors are done with Go code directly, but
allocations are done in R's C code to avoid having to duplicate the
garbage collection mechanics in Go with the obvious risks of error and
possible behaviour skew in the future).

In doing this work I have some questions that I have not been able to
find answers for in the R-ints doc or hadley/r-internals.

     1. In R-ints, the LISTSXP SEXP type CDR is said to hold "usually"
        LISTSXP or NULL. What does the "usually" mean here? Is it
        possible for the CDR to hold values other than LISTSXP or NULL,
        and is this NULL NILSXP or C NULL? I assume that the CAR can
        hold any type of SEXP, is this correct?
     2. The LANGSXP and DOTSXP types are lists, but the R-ints comments
        on them do not say whether the CDR of one of these lists is the
        same as the head of the list or devolves to a LISTSXP. Looking
        through the code suggests to me that functions that allocate
        these two types allocate a LISTSXP and then change only the
        head of the list to be the LANGSXP or DOTSXP that's required,
        meaning that the tail of the list is all LISTSXP. Is this
        correct?
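
The reading described in question 2 can be modelled in a few lines of
pure Go. This is only a sketch of that reading, not a confirmed
description of R's allocator: Kind, Node, newTagged and tailIsPlainList
are hypothetical names, and the tag values are arbitrary rather than
R's SEXPTYPE numbers.

```go
package main

import "fmt"

// Kind is a hypothetical type tag; the values are arbitrary and are
// not R's SEXPTYPE numbers.
type Kind int

const (
	ListKind Kind = iota // stands in for LISTSXP
	LangKind             // stands in for LANGSXP
	DotsKind             // stands in for DOTSXP
)

// Node models a cons cell with a type tag; a nil *Node ends the list.
type Node struct {
	kind Kind
	car  interface{}
	cdr  *Node
}

// newTagged builds a plain list from elems and then retags only the
// head, mirroring the allocation pattern described in question 2.
func newTagged(head Kind, elems ...interface{}) *Node {
	var list *Node
	for i := len(elems) - 1; i >= 0; i-- {
		list = &Node{kind: ListKind, car: elems[i], cdr: list}
	}
	if list != nil {
		list.kind = head
	}
	return list
}

// tailIsPlainList reports whether every node after the head is ListKind.
func tailIsPlainList(head *Node) bool {
	if head == nil {
		return true
	}
	for n := head.cdr; n != nil; n = n.cdr {
		if n.kind != ListKind {
			return false
		}
	}
	return true
}

func main() {
	call := newTagged(LangKind, "sum", 1, 2)
	fmt.Println(call.kind == LangKind, tailIsPlainList(call))
}
```

A check like tailIsPlainList is the kind of assertion a wrapper could
run in tests to detect lists that do not follow the assumed shape.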

The last question is more a question of interest in design strategy,
and the answer may have been lost to time. In order to reduce the need
to go through Go's interface assertions in a number of cases I have
decided to reinterpret R_NilValue to an untyped Go nil (this is
important for example in list traversal where the CDR can (hopefully)
be only one of two types LISTSXP or NILSXP; in Go this would require a
generalised SEXP return, but by doing this reinterpretation I can
return a *List pointer which may be nil, greatly simplifying the code
and improving the performance). My question here is why a singleton
null value was chosen to be represented as a fully allocated SEXP value
rather than just a C NULL. Also, whether C NULL is used to any great
extent within the internal code. Note that the Go API provides a
mechanism to easily reconvert the nils used back to an R_NilValue when
returning from a Go function[3].
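
The nil reinterpretation described above can be sketched in pure Go as
follows. This is a minimal model and not rgo's actual API: Value, List,
Next, Len and cons are hypothetical names, and a nil *List simply plays
the role of R_NilValue.

```go
package main

import "fmt"

// Value is a hypothetical stand-in for any R value held in a CAR.
type Value interface{}

// List models a pairlist node. A nil *List plays the role of
// R_NilValue, so the tail never needs an interface type assertion.
type List struct {
	car Value
	cdr *List
}

// Next returns the tail, or nil at the end of the list. It is safe
// to call on a nil receiver.
func (l *List) Next() *List {
	if l == nil {
		return nil
	}
	return l.cdr
}

// Len walks the list using a plain nil check on a typed pointer.
func (l *List) Len() int {
	n := 0
	for ; l != nil; l = l.Next() {
		n++
	}
	return n
}

// cons prepends car to the list cdr, which may be nil.
func cons(car Value, cdr *List) *List {
	return &List{car: car, cdr: cdr}
}

func main() {
	l := cons("a", cons(2, cons(3.0, nil)))
	fmt.Println(l.Len())

	var empty *List // models R_NilValue
	fmt.Println(empty.Len())
}
```

Because the tail is a typed pointer, traversal is a loop over *List
values with a nil check and needs no type switch on a general SEXP
interface.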

thanks
Dan Kortschak

[1] https://github.com/rgonomic/rgo
[2] https://github.com/rgonomic/rgo/issues/1
[3] https://pkg.go.dev/github.com/rgonomic/rgo/sexp?tab=doc#Value.Export

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

