[Rd] Small Fix: Greatly Increase Clarity/Utility of R Package Help/Manual Overview Pages

2021-06-29 Thread Marsh, Samuel
Hi,


I would like to suggest a single-line (2-character) fix that I feel would 
greatly improve the readability and usefulness of the overview R package 
help/manual pages.  Currently the overall help/manual page for a package is 
organized into an alphabetized table of contents with linked headers by letter 
only if the package contains more than 100 functions; otherwise the functions 
are simply listed with no line breaks.  This makes for a more difficult user 
experience with moderately sized packages (50-99 functions) that could be 
improved.  I would suggest changing the threshold for creating this 
alphabetized table of contents to 50 instead of 100.


I've provided info on the current code that specifies this parameter below:

It would appear that all that needs to be changed is the "> 100" parameter on 
this line:
https://github.com/wch/r-source/blob/80a7ca3b605b34d207ed3465c942f39a37e89f6e/src/library/tools/R/install.R#L2770


Or to list the code directly:

In the .writePkgIndices function the line is:
use_alpha <- (nrow(M) > 100)


It appears to me the only change needed would be to set "nrow(M) > 50".  I 
believe this very small fix would greatly improve user experience for a growing 
number of moderately-sized packages whose manual/help pages would still greatly 
benefit from greater readability/organization.


Thank you!

Sam


--
Samuel E. Marsh, Ph.D.
Postdoctoral Fellow
Laboratory of Dr. Beth Stevens
F.M. Kirby Neurobiology Research Center
Boston Children's Hospital
Harvard Medical School
samuel.ma...@childrens.harvard.edu


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Small Fix: Greatly Increase Clarity/Utility of R Package Help/Manual Overview Pages

2021-06-29 Thread Sebastian Meyer
Just in case others (like me) don't instantly know what this is about.

This only affects the html help.
Compare the HTML index page for the base package "graphics"

https://stat.ethz.ch/R-manual/R-patched/library/graphics/html/00Index.html

with the index page for the base package "grDevices"

https://stat.ethz.ch/R-manual/R-patched/library/grDevices/html/00Index.html

The latter is split by first letter, the former isn't as it only lists
98 <= 100 help pages.

I don't really have a preference for either (but I also rarely use the
html help system). Using a threshold seems reasonable to avoid blowing
up the index of a small package like "splines" but provide some anchors
for a large package like "stats". 50 may be too low as a threshold.
Looking at package "parallel" with its 45 entries

https://stat.ethz.ch/R-manual/R-patched/library/parallel/html/00Index.html

the listing doesn't seem long enough to benefit from alphabetic
sectioning. Probably a matter of taste.
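
The rule being discussed can be modelled in a few lines. The sketch below is plain C rather than the R code in install.R, and the names index_use_alpha, CURRENT_THRESHOLD and PROPOSED_THRESHOLD are made up for illustration; only the strict "> 100" comparison and the entry counts 98 and 45 come from this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the real check is `nrow(M) > 100` in .writePkgIndices. */
enum { CURRENT_THRESHOLD = 100, PROPOSED_THRESHOLD = 50 };

/* True when the index page would be split by first letter. */
bool index_use_alpha(int n_entries, int threshold)
{
    assert(n_entries >= 0);
    return n_entries > threshold;   /* strict comparison, as in the source line */
}
```

Under this model, the 98 entries of "graphics" stay unsplit with the current threshold and would be split with the proposed one, while the 45 entries of "parallel" stay unsplit in both cases.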

Best regards,

Sebastian




[Rd] ALTREP ALTINTEGER_SUM/MIN/MAX Return Value and Behavior

2021-06-29 Thread Sebastian Martin Krantz
Hello all, I'm working on some custom (grouped, weighted) sum, min and
max functions and I want them to support the special case of plain integer
sequences using ALTREP. In doing so I encountered some behavior I cannot
explain to myself. The head of my fsum C function looks like this (g is an
optional grouping vector, w is an optional weights vector):

SEXP fsumC(SEXP x, SEXP Rng, SEXP g, SEXP w, SEXP Rnarm) {
  int l = length(x), tx = TYPEOF(x), ng = asInteger(Rng),
      narm = asLogical(Rnarm), nprotect = 1, nwl = isNull(w);
  if(ALTREP(x) && ng == 0 && nwl) {
    switch(tx) {
    case INTSXP: return ALTINTEGER_SUM(x, (Rboolean)narm);
    case LGLSXP: return ALTLOGICAL_SUM(x, (Rboolean)narm);
    case REALSXP: return ALTREAL_SUM(x, (Rboolean)narm);
    default: error("ALTREP object must be integer or real typed");
    }
  }
  // ...
}

When I let x <- 1:1e8, fsum(x) works fine and returns the correct value. If
I now make this a matrix with dim(x) <- c(1e2, 1e6) and subsequently turn it
into a vector again with dim(x) <- NULL, fsum(x) gives NULL and the warning
message 'converting NULL pointer to R NULL'. For the functions fmin and fmax
(similarly defined using ALTINTEGER_MIN/MAX), I get this right away,
e.g. fmin(1:1e8) gives NULL and the warning 'converting NULL pointer to R
NULL'. So what is going on here? What do these functions return? And how do
I make this a robust implementation?

Best regards,

Sebastian Krantz



Re: [Rd] [External] ALTREP ALTINTEGER_SUM/MIN/MAX Return Value and Behavior

2021-06-29 Thread luke-tierney

ALTINTEGER_SUM and friends are _not_ intended for use in package code.
Once we get some time to clean up headers they will no longer be
visible to packages.

Best,

luke


--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu



Re: [Rd] ALTREP ALTINTEGER_SUM/MIN/MAX Return Value and Behavior

2021-06-29 Thread Bill Dunlap
Adding the dimensions attribute takes away the altrep-ness.  Removing
dimensions does not make it altrep.  E.g.,

> a <- 1:10
> am <- a ; dim(am) <- c(2L,5L)
> amn <- am ; dim(amn) <- NULL
> .Call("is_altrep", a)
[1] TRUE
> .Call("is_altrep", am)
[1] FALSE
> .Call("is_altrep", amn)
[1] FALSE

where is_altrep() is defined by the following C code:

#include <R.h>          /* header names assumed; the archive dropped them */
#include <Rinternals.h>

SEXP is_altrep(SEXP x)
{
    return Rf_ScalarLogical(ALTREP(x));
}

-Bill



Re: [Rd] ALTREP ALTINTEGER_SUM/MIN/MAX Return Value and Behavior

2021-06-29 Thread Sebastian Martin Krantz
Thanks both. Is there a suggested way I can get this speedup in a package?
Or just leave it for now?

Thanks also for the clarification Bill. The issue I have with that is that
in my C code ALTREP(x) evaluates to true even after adding and removing
dimensions (otherwise it would be handled by the normal sum method and I’d
be fine). Also .Internal(inspect(x)) still shows the compact
representation.

-Sebastian



Re: [Rd] [External] Re: ALTREP ALTINTEGER_SUM/MIN/MAX Return Value and Behavior

2021-06-29 Thread luke-tierney

It depends on the size. For a larger vector adding dim will create a
wrapper ALTREP.

Currently the wrapper does not try to use the payload's sum method;
this could be added.

Best,

luke



Re: [Rd] [External] Re: ALTREP ALTINTEGER_SUM/MIN/MAX Return Value and Behavior

2021-06-29 Thread luke-tierney

Call the R sum() function, either before going to C code or by calling
back into R. You may only want to do this if the vector is long enough
for the possible savings to be worthwhile.


On Tue, 29 Jun 2021, Sebastian Martin Krantz wrote:

> Thanks both. Is there a suggested way I can get this speedup in a package?
> Or just leave it for now?
>
> Thanks also for the clarification Bill. The issue I have with that is that
> in my C code ALTREP(x) evaluates to true even after adding and removing
> dimensions (otherwise it would be handled by the normal sum method and I’d
> be fine).

When you use a longer vector.

> Also .Internal(inspect(x)) still shows the compact
> representation.

A different representation (wrapper around a compact sequence).

Best,

luke





Re: [Rd] ALTREP ALTINTEGER_SUM/MIN/MAX Return Value and Behavior

2021-06-29 Thread Gabriel Becker
Hi Sebastian,

So the way that it is currently factored, there isn't a good way of getting
what you want under the constraints of what Luke said (ALTINTEGER_SUM is
not part of the API).

I don't know what his reasons are for saying that per se and would not want
to speak for him, but off the top of my head, I suspect it is because ALTREP
sum methods are allowed to return NULL (the C version) to say "I don't have
a sum method that is applicable here, please continue with the normal
code". So, just as an example, your exact code is likely to segfault, I
think, if you hit an ALTREP that chooses not to implement a sum method,
because you'll be running around with a SEXP that has the value NULL (the C
one, not the R one).

One thing you could do is check for altrepness and then construct and
evaluate a call to the R sum function in that case, but that probably isn't
quite what you want either, as this will hit the code you're trying to
bypass/speed up in the case where the ALTREP class doesn't implement a sum
method. I see that Luke just mentioned this as well but I'll leave it in
since I had already typed it.
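
The "return NULL to decline" convention described above can be illustrated without any R headers. What follows is a minimal plain-C model, not R's ALTREP API: the type vec and the functions seq_sum, declining_sum and vec_sum are hypothetical names invented for this sketch. It only shows why a caller has to check for a C NULL before using the result and fall back to the ordinary loop otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an ALTREP-like vector; this is not R's SEXP. */
typedef struct vec vec;

/* A fast-path method stores its result in *out and returns out,
   or returns NULL to say "not applicable, use the default code". */
typedef long long *(*sum_method)(const vec *v, long long *out);

struct vec {
    const int *data;     /* elements, used by the default loop */
    size_t     n;        /* number of elements */
    sum_method try_sum;  /* may be NULL when the class has no method */
};

/* Method for a vector known to hold 1..n: closed-form sum. */
long long *seq_sum(const vec *v, long long *out)
{
    *out = (long long)v->n * ((long long)v->n + 1) / 2;
    return out;
}

/* Method that always declines, like a class without a usable sum. */
long long *declining_sum(const vec *v, long long *out)
{
    (void)v;
    (void)out;
    return NULL;
}

/* Caller: try the fast path and check for NULL before using the result. */
long long vec_sum(const vec *v)
{
    long long fast;
    assert(v != NULL);
    if (v->try_sum != NULL && v->try_sum(v, &fast) != NULL)
        return fast;

    long long total = 0;             /* default path */
    for (size_t i = 0; i < v->n; i++)
        total += v->data[i];
    return total;
}
```

Returning the method's result directly, as the fsumC code in this thread does, corresponds to skipping the NULL check in vec_sum.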

I hope that helps clarify some things.

Best,
~G




Re: [Rd] ALTREP ALTINTEGER_SUM/MIN/MAX Return Value and Behavior

2021-06-29 Thread Gabriel Becker
Also, @Luke Tierney, I can prepare a patch that
has wrappers delegate to the payload's ALTREP class methods for things like
sum, min, max, etc. once conference season calms down a bit.

Best,
~G


Re: [Rd] ALTREP ALTINTEGER_SUM/MIN/MAX Return Value and Behavior

2021-06-29 Thread Gabriel Becker
Hi Sebastian,

min/max do not materialize the vector; you will see it as compact afterwards,
same as before. It *does*, however, do a pass over the data chunked by
region, which is much more expensive than it need be for compact sequences,
that is true.

I think in some version of the code that never made it out of the branch, I had
default min/max methods which took sortedness into account if it was known.
One thing that significantly complicated that code was that you have to
find the edge of the NAs (/NaNs for the real case) if narm is TRUE, which
involves a binary search using ELT (or a linear one using
ITERATE_BY_REGION, I suppose).
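
As a rough illustration of the binary search mentioned here, the sketch below works on a plain C array instead of going through ELT. It assumes an ascending integer vector whose NAs all sit at the end; the names NA_INT, first_na_index and sorted_max_narm are hypothetical, and NA is represented by INT_MIN, the value R uses for NA_integer_.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NA_INT INT_MIN   /* same value R uses for NA_integer_ */

/* Index of the first NA in x[0..n), assuming every NA sits at the end.
   Returns n when there is no NA. Uses O(log n) element reads. */
size_t first_na_index(const int *x, size_t n)
{
    size_t lo = 0, hi = n;   /* x[i] is non-NA for i < lo, NA for i >= hi */
    assert(x != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (x[mid] == NA_INT)
            hi = mid;
        else
            lo = mid + 1;
    }
    return lo;
}

/* Maximum of an ascending vector with NAs last, NAs removed.
   Returns 1 and stores the maximum in *out, or returns 0 and leaves
   *out untouched when there is no non-NA element. */
int sorted_max_narm(const int *x, size_t n, int *out)
{
    size_t k = first_na_index(x, n);
    if (k == 0)
        return 0;
    *out = x[k - 1];         /* last non-NA element is the maximum */
    return 1;
}
```

With NAs sorted first instead of last, the same search would locate the first non-NA element, and the roles of minimum and maximum would swap.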

That said, a newer version of the NA-counting code did get in from a later
patch to update it, so it is available in r-devel and could be used to
revisit that approach.

That aside, it is true that compact sequences in particular never have NAs
so the min and max altrep methods for those classes would be trivial. I
kind of doubt people are creating compact sequences and then asking for the
min/max/mean of them very often in practice.
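
To make "trivial" concrete, here is a small plain-C sketch of constant-time min and max for a compact sequence. It assumes the sequence is fully described by its first value, its length, and an increment of +1 or -1; the struct compact_seq and the function names are hypothetical and do not correspond to R's internal layout or API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a compact sequence: first, first+inc, ... */
typedef struct {
    int    first;   /* first element */
    size_t length;  /* number of elements, must be > 0 here */
    int    inc;     /* +1 or -1 */
} compact_seq;

/* Last element, computed without touching any data. */
int compact_seq_last(const compact_seq *s)
{
    assert(s != NULL && s->length > 0);
    assert(s->inc == 1 || s->inc == -1);
    return s->first + s->inc * (int)(s->length - 1);
}

/* The extremes are always the two end points, and there are no NAs. */
int compact_seq_min(const compact_seq *s)
{
    int last = compact_seq_last(s);
    return s->first < last ? s->first : last;
}

int compact_seq_max(const compact_seq *s)
{
    int last = compact_seq_last(s);
    return s->first > last ? s->first : last;
}
```

For a sequence like 1:1e8 this reads two numbers instead of making a pass over 1e8 elements.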

Best,
~G

On Tue, Jun 29, 2021 at 11:26 AM Sebastian Martin Krantz <
sebastian.kra...@graduateinstitute.ch> wrote:

> Thanks Gabriel and Luke,
>
> I understand now the functions return NULL if no method is applicable. I
> wonder though why do ALTINTEGER_MIN and MAX return NULL on a plain integer
> sequence? I also see that min() and max() are not optimized i.e. min(1:1e8)
> appears to materialize the vector.
>
> In general I expect my functions to mostly be applied to real data so this
> is not a huge issue for me (I’d rather get rid of it again than calling
> sum() or risking that the macros are removed from the API), but it could be
> nice to have this speedup available to packages. If these macros have
> matured and it can be made explicit that they return NULL if no method is
> applicable, or, better, they internally dispatch to a normal sum method if
> this is the case, they could become very manageable and useful.
>
> Best,
>
> Sebastian
>
>
>
> On Tue 29. Jun 2021 at 21:09, Gabriel Becker 
> wrote:
>
>> Also, @Luke Tierney, I can prepare a patch that
>> has wrappers delegate to payload's ALTREP class methods for things like
>> sum, min, max, etc once conference season calms down a bit.
>>
>> Best,
>> ~G
>>
>> On Tue, Jun 29, 2021 at 11:07 AM Gabriel Becker 
>> wrote:
>>
>>> Hi Sebastian,
>>>
>>> So the way that it is currently factored, there isn't a good way of
>>> getting what you want under the constraints of what Luke said
>>> (ALTINTEGER_SUM is not part of the API).
>>>
>>> I don't know what his reasons are for saying that per se and would not
>>> want to speak for him, but off the top of my head, I suspect it is because
>>> ALTREP sum methods are allowed to return NULL (the C version) to say "I
>>> don't have a sum method that is applicable here, please continue with the
>>> normal code". So, just as an example, your exact code is likely to
>>> segfault, I think, if you hit an ALTREP that chooses not to implement a sum
>>> method because you'll be running around with a SEXP that has the value NULL
>>> (the C one, not the R one).
>>>
>>> One thing you could do is check for altrepness and then construct and
>>> evaluate a call to the R sum function in that case, but that probably isn't
>>> quite what you want either, as this will hit the code you're trying to
>>> bypass/speed up in the case where the ALTREP class doesn't implement a sum
>>> method. I see that Luke just mentioned this as well but I'll leave it in
>>> since I had already typed it.
>>>
>>> I hope that helps clarify some things.
>>>
>>> Best,
>>> ~G
>>>
>>>
>>> On Tue, Jun 29, 2021 at 10:13 AM Sebastian Martin Krantz <
>>> sebastian.kra...@graduateinstitute.ch> wrote:
>>>
>>>> Thanks both. Is there a suggested way I can get this speedup in a
>>>> package? Or just leave it for now?
>>>>
>>>> Thanks also for the clarification Bill. The issue I have with that is that
>>>> in my C code ALTREP(x) evaluates to true even after adding and removing
>>>> dimensions (otherwise it would be handled by the normal sum method and I’d
>>>> be fine). Also .Internal(inspect(x)) still shows the compact
>>>> representation.
>>>>
>>>> -Sebastian
>>>>
>>>> On Tue 29. Jun 2021 at 19:43, Bill Dunlap
>>>> wrote:
>>>>
>>>> > Adding the dimensions attribute takes away the altrep-ness.  Removing
>>>> > dimensions does not make it altrep.  E.g.,
>>>> >
>>>> > > a <- 1:10
>>>> > > am <- a ; dim(am) <- c(2L,5L)
>>>> > > amn <- am ; dim(amn) <- NULL
>>>> > > .Call("is_altrep", a)
>>>> > [1] TRUE
>>>> > > .Call("is_altrep", am)
>>>> > [1] FALSE
>>>> > > .Call("is_altrep", amn)
>>>> > [1] FALSE
>>>> >
>>>> > where is_altrep() is defined by the following C code:
>>>> >
>>>> > #include <R.h>
>>>> > #include <Rinternals.h>
>>>> >
>>>> > SEXP is_altrep(SEXP x)
>>>> > {
>>>> >     return Rf_ScalarLogical(ALTREP(x));
>>>> > }
>>>> >
>>>> >
>>>> > -Bill
>>>> >
>>>> > On Tue, Jun 29, 2021 at 8:03 AM Sebastian Martin

Re: [Rd] ALTREP ALTINTEGER_SUM/MIN/MAX Return Value and Behavior

2021-06-29 Thread Sebastian Martin Krantz
Thanks Gabriel and Luke,

I understand now that the functions return NULL if no method is applicable. I
wonder, though, why ALTINTEGER_MIN and ALTINTEGER_MAX return NULL on a plain
integer sequence. I also see that min() and max() are not optimized, i.e.
min(1:1e8) appears to materialize the vector.

In general I expect my functions to mostly be applied to real data, so this
is not a huge issue for me (I’d rather get rid of it again than call
sum() or risk that the macros are removed from the API), but it could be
nice to have this speedup available to packages. If these macros have
matured and it can be made explicit that they return NULL if no method is
applicable, or, better, if they internally dispatch to a normal sum method
in that case, they could become very manageable and useful.
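
To illustrate the kind of dispatch meant here, a minimal standalone sketch in
plain C follows. It is a model only, not R's API: the vec struct and the
functions plain_sum, vec_sum, declining_sum and seq_sum are hypothetical. A
"fast" method may return NULL to say it is not applicable, and the wrapper then
falls back to the ordinary loop instead of handing the NULL to the caller.

```c
#include <stddef.h>

/* Hypothetical model, not R's API: a vector whose class may supply a
   fast sum method. The method returns NULL to mean "not applicable". */
typedef struct vec {
    const int *data;   /* may be NULL for a purely implicit sequence */
    size_t n;
    long long *(*fast_sum)(const struct vec *v, long long *result);
} vec;

/* Ordinary element-by-element sum, used as the fallback. */
static long long plain_sum(const vec *v)
{
    long long s = 0;
    for (size_t i = 0; i < v->n; i++)
        s += v->data[i];
    return s;
}

/* Wrapper: try the fast method and fall back when it declines, so the
   caller never sees the NULL. */
static long long vec_sum(const vec *v)
{
    long long result;
    if (v->fast_sum != NULL && v->fast_sum(v, &result) != NULL)
        return result;
    return plain_sum(v);
}

/* Example method that always declines. */
static long long *declining_sum(const vec *v, long long *result)
{
    (void)v;
    (void)result;
    return NULL;
}

/* Example method for the implicit sequence 1..n (closed form). */
static long long *seq_sum(const vec *v, long long *result)
{
    *result = (long long)v->n * ((long long)v->n + 1) / 2;
    return result;
}
```

In this model the unsafe pattern from my fsumC corresponds to returning the
result of fast_sum directly without checking it for NULL first.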

Best,

Sebastian



On Tue 29. Jun 2021 at 21:09, Gabriel Becker  wrote:

> Also, @Luke Tierney, I can prepare a patch that
> has wrappers delegate to payload's ALTREP class methods for things like
> sum, min, max, etc once conference season calms down a bit.
>
> Best,
> ~G
>
> On Tue, Jun 29, 2021 at 11:07 AM Gabriel Becker 
> wrote:
>
>> Hi Sebastian,
>>
>> So the way that it is currently factored, there isn't a good way of
>> getting what you want under the constraints of what Luke said (ALTINTEGER_SUM
>> is not part of the API).
>>
>> I don't know what his reasons are for saying that per se and would not
>> want to speak for him, but off the top of my head, I suspect it is because
>> ALTREP sum methods are allowed to return NULL (the C version) to say "I
>> don't have a sum method that is applicable here, please continue with the
>> normal code". So, just as an example, your exact code is likely to
>> segfault, I think, if you hit an ALTREP that chooses not to implement a sum
>> method because you'll be running around with a SEXP that has the value NULL
>> (the C one, not the R one).
>>
>> One thing you could do is check for altrepness and then construct and
>> evaluate a call to the R sum function in that case, but that probably isn't
>> quite what you want either, as this will hit the code you're trying to
>> bypass/speed up in the case where the ALTREP class doesn't implement a sum
>> method. I see that Luke just mentioned this as well but I'll leave it in
>> since I had already typed it.
>>
>> I hope that helps clarify some things.
>>
>> Best,
>> ~G
>>
>>
>> On Tue, Jun 29, 2021 at 10:13 AM Sebastian Martin Krantz <
>> sebastian.kra...@graduateinstitute.ch> wrote:
>>
>>> Thanks both. Is there a suggested way I can get this speedup in a
>>> package? Or just leave it for now?
>>>
>>> Thanks also for the clarification Bill. The issue I have with that is that
>>> in my C code ALTREP(x) evaluates to true even after adding and removing
>>> dimensions (otherwise it would be handled by the normal sum method and I’d
>>> be fine). Also .Internal(inspect(x)) still shows the compact
>>> representation.
>>>
>>> -Sebastian
>>>
>>> On Tue 29. Jun 2021 at 19:43, Bill Dunlap 
>>> wrote:
>>>
>>> > Adding the dimensions attribute takes away the altrep-ness.  Removing
>>> > dimensions does not make it altrep.  E.g.,
>>> >
>>> > > a <- 1:10
>>> > > am <- a ; dim(am) <- c(2L,5L)
>>> > > amn <- am ; dim(amn) <- NULL
>>> > > .Call("is_altrep", a)
>>> > [1] TRUE
>>> > > .Call("is_altrep", am)
>>> > [1] FALSE
>>> > > .Call("is_altrep", amn)
>>> > [1] FALSE
>>> >
>>> > where is_altrep() is defined by the following C code:
>>> >
>>> > #include <R.h>
>>> > #include <Rinternals.h>
>>> >
>>> > SEXP is_altrep(SEXP x)
>>> > {
>>> > return Rf_ScalarLogical(ALTREP(x));
>>> > }
>>> >
>>> >
>>> > -Bill
>>> >
>>> > On Tue, Jun 29, 2021 at 8:03 AM Sebastian Martin Krantz <
>>> > sebastian.kra...@graduateinstitute.ch> wrote:
>>> >
>>> >> Hello all, I'm working on some custom (grouped, weighted) sum, min and
>>> >> max functions and I want them to support the special case of plain
>>> >> integer sequences using ALTREP. In doing so I encountered some behavior I
>>> >> cannot explain to myself. The head of my fsum C function looks like this
>>> >> (g is an optional grouping vector, w is an optional weights vector):
>>> >>
>>> >> SEXP fsumC(SEXP x, SEXP Rng, SEXP g, SEXP w, SEXP Rnarm) {
>>> >>   int l = length(x), tx = TYPEOF(x), ng = asInteger(Rng),
>>> >>       narm = asLogical(Rnarm), nprotect = 1, nwl = isNull(w);
>>> >>   if(ALTREP(x) && ng == 0 && nwl) {
>>> >>     switch(tx) {
>>> >>     case INTSXP: return ALTINTEGER_SUM(x, (Rboolean)narm);
>>> >>     case LGLSXP: return ALTLOGICAL_SUM(x, (Rboolean)narm);
>>> >>     case REALSXP: return ALTREAL_SUM(x, (Rboolean)narm);
>>> >>     default: error("ALTREP object must be integer, logical or real typed");
>>> >>     }
>>> >>   }
>>> >>   // ...
>>> >> }
>>> >>
>>> >> When I let x <- 1:1e8, fsum(x) works fine and returns the correct value.
>>> >> If I now make this a matrix, dim(x) <- c(1e2, 1e6), and subsequently turn
>>> >> this into a vector again, dim(x) <- NULL, fsum(x) gives NULL and