[Rd] undefined symbol errors when compiling package using ALTREP API

2019-06-04 Thread Mark Klik
Hello,

I'm developing a package (lazyvec) that makes full use of the ALTREP
framework (R >= 3.6.0).
One application of the package is to wrap existing ALTREP vectors in a new
ALTREP vector and pass all calls from R to the contained object. The
purpose of this is to provide a diagnostic framework for working with
ALTREP vectors and show information about internal calls.

The package builds on Windows and OSX but fails to build on Linux as can be
seen from the link to the Travis build:
https://travis-ci.org/fstpackage/lazyvec/jobs/539442806

The build fails because many ALTREP methods generate 'undefined symbol'
errors when the package is built on Linux. I've checked the R source code
and the undefined symbols seem to be related to the 'attribute_hidden'
qualifier on the function definitions. For example, the method
'ALTVEC_EXTRACT_SUBSET' is defined as:

SEXP attribute_hidden ALTVEC_EXTRACT_SUBSET(SEXP x, SEXP indx, SEXP call)

My question is why these differences between Windows / OSX and Linux exist
and whether they are intentional.
Do I need special build parameters to make sure my package builds correctly
on Linux?
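
For context, the effect of a hidden-visibility qualifier can be sketched in
plain C. This is a minimal illustration, not R's source: the macro mirrors
how 'attribute_hidden' expands to a GCC-style visibility attribute when the
compiler supports it, while `internal_extract` and `public_elt` are
hypothetical names invented for the example. When such code is built into a
shared library on an ELF platform like Linux, the hidden function is left
out of the dynamic symbol table, so code outside the library that refers to
it ends up with an undefined symbol. Where the attribute expands to nothing
or is not enforced, the same reference can resolve, which would be
consistent with the platform difference described above.

```c
#include <assert.h>

/* Expands to a hidden-visibility attribute on GCC-compatible compilers
   (outside Windows) and to nothing elsewhere. */
#if defined(__GNUC__) && !defined(_WIN32)
# define attribute_hidden __attribute__((visibility("hidden")))
#else
# define attribute_hidden
#endif

/* Hypothetical internal helper. In a shared library this symbol would not
   be visible to other libraries or packages on ELF platforms. */
int attribute_hidden internal_extract(const int *x, int n, int i)
{
    assert(x != 0 && i >= 0 && i < n);
    return x[i];
}

/* Hypothetical public entry point with default visibility. Callers
   outside the library are expected to go through this function. */
int public_elt(const int *x, int n, int i)
{
    return internal_extract(x, n, i);
}
```

Within a single library both functions can call each other freely; only
references from outside the library are affected by the qualifier.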

thanks for all the hard work!

best,
Mark

PS: some additional info:

package github repository: https://github.com/fstpackage/lazyvec
AppVeyor package build logs:
https://ci.appveyor.com/project/fstpackage/lazyvec
Travis package build logs: https://travis-ci.org/fstpackage/lazyvec/builds

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [External] undefined symbol errors when compiling package using ALTREP API

2019-06-04 Thread Mark Klik
Thanks for clearing that up. So these methods are actually not meant to be
exported on Windows and OSX?
Some of the ALTREP methods that now use 'attribute_hidden' would be very
useful to packages that aim to be ALTREP aware. Should the currently
exported API be considered final?

Thanks for your time & best,
Mark

On Tue, Jun 4, 2019 at 6:52 PM Tierney, Luke wrote:

> On Tue, 4 Jun 2019, Mark Klik wrote:
>
> > My question is why these differences between Windows / OSX and Linux
> > exist and if they are intentional?
>
> It is intentional that this not be part of the public API. This is
> true of almost all functions with an ALTREP prefix. You need a
> different approach that avoids using these directly.
>
> Best,
>
> luke
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone:  319-335-3386
> Department of Statistics and        Fax:    319-335-3017
>    Actuarial Science
> 241 Schaeffer Hall                  email:  luke-tier...@uiowa.edu
> Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu
>


Re: [Rd] [External] undefined symbol errors when compiling package using ALTREP API

2019-06-04 Thread Mark Klik
Hi Gabriel,

Thanks for your detailed explanation. It definitely clarifies the design
choices that were made in setting up the ALTREP framework, and I can see how
those choices make sure existing code won't break.

My specific use case for wanting to check whether a vector is an ALTREP is
the following: the fst package wraps an external C++ library (fstlib,
independent of R) that was made for high-speed serialization of
data frames. Sequences are fairly common in data frames and I'm planning to
add the concept of a sequence to the (R-agnostic) fst format. If I can
detect, e.g., a 'compact_intseq' ALTREP vector and just retrieve its
three-integer internal representation, serialization could be very fast.
The alternative, as you describe, is that the vector needs to be expanded
before serialization, which will actually be slower than using an already
expanded vector and can take a lot of RAM for large datasets.

So being able to make use of the internal representation of (a few of the)
base ALTREP vectors can be very interesting for (non-R) serialization
schemes.

thanks for your time!
Mark
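
The idea of storing a sequence as three integers instead of its expanded
values can be sketched as follows. This is a minimal illustration under
stated assumptions: the `compact_seq` struct and the `seq_pack`,
`seq_unpack` and `seq_elt` functions are hypothetical names, and the layout
(length, first value, step, each as a 64-bit integer in native byte order)
is invented for the example. It is neither the fst format nor R's internal
representation of 'compact_intseq'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical compact description of an arithmetic sequence. */
typedef struct {
    int64_t length;  /* number of elements */
    int64_t first;   /* value of element 0 */
    int64_t step;    /* difference between consecutive elements */
} compact_seq;

/* Size of the serialized form: three 64-bit fields. */
enum { COMPACT_SEQ_BYTES = 3 * sizeof(int64_t) };

/* Write the three fields to buf, which must hold COMPACT_SEQ_BYTES bytes.
   Native byte order is used; a real file format would fix the endianness.
   Returns the number of bytes written. */
size_t seq_pack(const compact_seq *s, unsigned char *buf)
{
    int64_t fields[3];
    assert(s != NULL && buf != NULL && s->length >= 0);
    fields[0] = s->length;
    fields[1] = s->first;
    fields[2] = s->step;
    memcpy(buf, fields, sizeof fields);
    return sizeof fields;
}

/* Read the three fields back from a buffer written by seq_pack. */
compact_seq seq_unpack(const unsigned char *buf)
{
    int64_t fields[3];
    compact_seq s;
    assert(buf != NULL);
    memcpy(fields, buf, sizeof fields);
    s.length = fields[0];
    s.first = fields[1];
    s.step = fields[2];
    return s;
}

/* Compute element i (0-based) on demand, without expanding the sequence. */
int64_t seq_elt(const compact_seq *s, int64_t i)
{
    assert(s != NULL && i >= 0 && i < s->length);
    return s->first + i * s->step;
}
```

With this layout a sequence of any length occupies the same 24 bytes, so
the cost of writing it does not grow with the number of elements.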


On Tue, Jun 4, 2019 at 11:50 PM Gabriel Becker wrote:

> Hi Mark,
>
> So depending pretty strongly on what you mean by "ALTREP aware", packages
> aren't necessarily supposed to be ALTREP aware. What I mean by this is that
> as of right now, ALTREP objects are designed to be interacted with by
> non-ALTREP-implementing package code *more or less* exactly as standard
> (non-ALTREP) SEXPs are: via the published C API. The "more or less" comes
> from the fact that in some cases, doing things that are good ideas on
> standard SEXPs will work, but may not be a good idea for ALTREPs.
>
> The most "low-hanging-fruit" example of something that was best practice
> for standard vectors but is not a good idea for ALTREP vectors is grabbing
> a DATAPTR and iterating over the values without modification in a tight
> loop. This will work (absent allocation failure or, I suppose, the ALTREP
> being specifically designed to refuse to give you a full DATAPTR), but with
> ALTREP in place it's no longer what you want to do.
>
> That said, you don't want to check whether something is an ALTREP yourself
> and branch your code. What you want to do is use the ITERATE_BY_REGION
> macro in R_ext/Itermacros.h for ALL SEXPs, which will be nearly as fast for
> standard vectors and work safely for ALTREP vectors.
>
> Basically any time you find yourself wanting to check if something is an
> ALTREP and if so, call a specific ALT*_BLAH method, the intention is that
> there should be a universal API point you can call which will work for both
> types.
>
> This is true, e.g., of INTEGER_IS_SORTED (which will always work and just
> returns UNKNOWN_SORTEDNESS, i.e. INT_MIN, i.e. NA_INTEGER, for non-ALTREPs),
> of REAL_GET_REGION (which populates a double* with the requested values
> for both standard and ALTREP REALSXPs), etc.
>
> Does the above make sense?
>
> If you feel a universal API point is missing, you can raise that here,
> though I can't promise that will ultimately result in the method being
> added.
>
> Best,
> ~G
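
The region-wise access pattern described in the message above can be
sketched in plain C. This is a minimal stand-in, not R's API: `int_vec`,
`vec_get_region` and `vec_sum` are hypothetical names, and
`vec_get_region` only mimics the role that a call such as
INTEGER_GET_REGION plays, namely copying a bounded chunk of values into a
caller-supplied buffer regardless of how the vector is stored. The consumer
loop is the same for a memory-backed vector and for a sequence whose values
are computed on demand, so no branching on the representation is needed and
the full vector is never expanded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vector abstraction: either backed by memory, or a compact
   sequence first, first + 1, ... whose elements are computed on demand. */
typedef struct {
    size_t length;
    const int *data;  /* NULL when elements are computed on demand */
    int first;        /* used only when data is NULL */
} int_vec;

/* Copy up to n elements starting at index i into buf and return the
   number of elements copied (0 once i is past the end). */
size_t vec_get_region(const int_vec *v, size_t i, size_t n, int *buf)
{
    size_t k, count;
    assert(v != NULL && buf != NULL);
    if (i >= v->length)
        return 0;
    count = (v->length - i < n) ? v->length - i : n;
    for (k = 0; k < count; k++)
        buf[k] = v->data ? v->data[i + k] : v->first + (int)(i + k);
    return count;
}

#define REGION_SIZE 512

/* Sum all elements by pulling fixed-size regions into a small buffer.
   The loop never asks for a pointer to the whole vector. */
long long vec_sum(const int_vec *v)
{
    int buf[REGION_SIZE];
    long long total = 0;
    size_t i = 0, got, k;
    while ((got = vec_get_region(v, i, REGION_SIZE, buf)) > 0) {
        for (k = 0; k < got; k++)
            total += buf[k];
        i += got;
    }
    return total;
}
```

The buffer size trades a little copying overhead for bounded memory use;
512 elements is an arbitrary choice for the sketch.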

Re: [Rd] [External] undefined symbol errors when compiling package using ALTREP API

2019-06-05 Thread Mark Klik
thanks Luke, I can work with that and will watch out for changes and new
developments in the ALTREP code with great interest.

all the best,
Mark



On Wed, Jun 5, 2019 at 6:02 PM Tierney, Luke wrote:

> For now you can use
>
> R_altrep_inherits(x, R_compact_intseq_class)
>
> The variable R_compact_intseq_class should currently be visible to
> packages on all platforms, though that may change if we eventually
> provide a string-based lookup mechanism, e.g. something like
>
> R_find_altrep_class("compact_intseq", "base")
>
> Best,
>
> luke