[Rd] Two ALTREP questions

2020-11-21 Thread Jiefei Wang
Hello,

I have two related ALTREP questions. First, there seems to be no way to
assign attributes to an ALTREP vector without using C++ code. More
specifically, I want to make an ALTREP matrix. I have tried the following R
code, but none of these attempts work.
```
.Internal(inspect(1:6))
.Internal(inspect(matrix(1:6, 2,3)))
.Internal(inspect(as.matrix(1:6)))
.Internal(inspect(structure(1:6, dim = c(2L,3L))))
.Internal(inspect({x <- 1:6;attr(x, "dim") <- c(2L,3L);x}))
.Internal(inspect({x <- 1:6;attributes(x)<- list(dim = c(2L,3L));x}))
```

The only way I have found to make an ALTREP matrix is to use a C-level function:
```
attachAttrib <- inline::cxxfunction( signature(x = "SEXP", attr = "SEXP" )
, '
SET_ATTRIB(x,attr);
return(R_NilValue);
')
x <- 1:6
attachAttrib(x, pairlist(dim = c(2L, 3L)))
.Internal(inspect(x))
```

Since creating a matrix, or adding attributes more generally, is a common
operation on objects, I wonder whether this limitation is intended. This
brings me to my second question: the ALTREP coercion function does not seem
to handle attributes correctly. After the coercion, the ALTREP object
loses its attributes.
```
coerceFunc <- inline::cxxfunction( signature(x = "SEXP", attr = "SEXP" ) , '
SET_ATTRIB(x,attr);
return(Rf_coerceVector(x, REALSXP));
')
> coerceFunc(1:6, pairlist(dim = c(2L, 3L)))
[1] 1 2 3 4 5 6
> coerceFunc(1:6 + 0L, pairlist(dim = c(2L, 3L)))
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
```
The problem is that coercion is dispatched directly to the user-defined
ALTREP coercion method, so the user is responsible for attaching the
attributes after the coercion. If they forget to do so, the result is a
plain vector. Compare the `Duplicate` and `DuplicateEX` functions, where
the former attaches the attributes by default. I feel that the `Coerce`
function should only return a plain vector and that there should be a
`CoerceEx` function to do the attribute assignment, so that the logic in the
non-EX ALTREP functions is consistent. I do not know how drastic the
change would be, so maybe this is too hard to do.

BTW, is there any way to contribute to the R source? I know R has limited
resources, so if possible, I would be happy to fix the matrix issue myself
and make some minor contributions to the R community.

Best,
Jiefei

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Error in unsplit() with tibbles

2020-11-21 Thread Mario Annau
Hello,

using the `unsplit()` function with tibbles currently leads to the
following error:

> mtcars_tb <- as_tibble(mtcars, rownames = NULL)
> s <- split(mtcars_tb, mtcars_tb$gear)
> unsplit(s, mtcars_tb$gear)
 Error: Must subset rows with a valid subscript vector.
ℹ Logical subscripts must match the size of the indexed input.
x Input has size 15 but subscript `rep(NA, len)` has size 32.
Run `rlang::last_error()` to see where the error occurred.

tibble seems to (rightly) complain that a logical vector has been used for
subsetting which does not have the same length as the data.frame (rows).
Since `NA` is a logical value, the subscript should be changed to
`NA_integer_` in `unsplit()`:

> unsplit
function (value, f, drop = FALSE)
{
    len <- length(if (is.list(f)) f[[1L]] else f)
    if (is.data.frame(value[[1L]])) {
        # proposed change: NA_integer_ instead of NA
        x <- value[[1L]][rep(NA_integer_, len), , drop = FALSE]
        rownames(x) <- unsplit(lapply(value, rownames), f, drop = drop)
    }
    else x <- value[[1L]][rep(NA, len)]
    split(x, f, drop = drop) <- value
    x
}
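
The proposal can be tried without touching base R. The following is a minimal sketch, not the actual patch: the name `unsplit_fixed` is hypothetical (chosen to avoid masking `base::unsplit()`), and applying the same change to the non-data-frame branch is an assumption that goes beyond what the message proposes.

```r
# Sketch of the proposed fix as a standalone function.
# `unsplit_fixed` is a hypothetical name, not part of base R.
unsplit_fixed <- function(value, f, drop = FALSE) {
    len <- length(if (is.list(f)) f[[1L]] else f)
    if (is.data.frame(value[[1L]])) {
        # Integer NA index: exactly `len` all-NA rows, no logical recycling.
        x <- value[[1L]][rep(NA_integer_, len), , drop = FALSE]
        rownames(x) <- unsplit_fixed(lapply(value, rownames), f, drop = drop)
    } else {
        # Assumption: the same change is applied to the vector branch.
        x <- value[[1L]][rep(NA_integer_, len)]
    }
    split(x, f, drop = drop) <- value
    x
}

# Round trip on a plain base data frame.
s <- split(mtcars, mtcars$gear)
res <- unsplit_fixed(s, mtcars$gear)
identical(rownames(res), rownames(mtcars))
```

For a base data frame the result should match what `unsplit()` already returns; the difference only matters for classes, such as tibble, that reject a logical subscript of the wrong length.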

Cheers,
Mario



Re: [Rd] Error in unsplit() with tibbles

2020-11-21 Thread Marc Schwartz via R-devel


Hi,

Perhaps I am missing something, but if you are using objects, like tibbles, 
that are intended to be part of another environment, in this case the 
tidyverse, why would you not use functions to manipulate these objects that 
were specifically created in the other environment?

I don't use the tidyverse, but it seems to me that to expect base R functions 
to work with objects not created in base R, is problematic, even though, 
perhaps by coincidence, they may work without adverse effects, as appears to be 
the case with split(). 

In other words, you should not, in reality, have had an a priori expectation 
that split() would work with a tibble either.

Rather than modifying the base R functions, like unsplit(), as you are 
suggesting, to be compatible with these third party objects, the burden should 
either be on you to use relevant tidyverse functions, or on the authors of the 
tidyverse to provide relevant class methods to provide that functionality.

Regards,

Marc Schwartz



Re: [Rd] Error in unsplit() with tibbles

2020-11-21 Thread Peter Dalgaard
Yes. Never mind tibbles, the [rep(NA, len), ] construction only happens to work
because len will always be >= the number of rows in value[[1L]]. Witness:

> (1:10)[rep(NA, 20)]
 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> (1:20)[rep(NA, 10)]
 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> (1:20)[rep(NA_integer_, 10)]
 [1] NA NA NA NA NA NA NA NA NA NA
> (1:10)[rep(NA_integer_, 20)]
 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
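
The same recycling applies to data-frame row indexing, which is the case `unsplit()` actually hits. A minimal sketch, assuming a small made-up data frame `df` that is not part of the original report:

```r
# Hypothetical 5-row data frame, used only for illustration.
df <- data.frame(a = 1:5, b = letters[1:5])

# Logical NA index shorter than nrow(df): recycled over all 5 rows.
n_lgl_short <- nrow(df[rep(NA, 3), , drop = FALSE])

# Integer NA index: one all-NA row per index element.
n_int_short <- nrow(df[rep(NA_integer_, 3), , drop = FALSE])

# Once the index is at least as long as the data, both forms agree.
n_lgl_long <- nrow(df[rep(NA, 7), , drop = FALSE])
n_int_long <- nrow(df[rep(NA_integer_, 7), , drop = FALSE])

c(n_lgl_short, n_int_short, n_lgl_long, n_int_long)
```

Only the integer form returns a number of rows that depends on the index alone, which is what `unsplit()` needs.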

-pd


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] Error in unsplit() with tibbles

2020-11-21 Thread Peter Dalgaard
I get the sentiment, but this is really just bad coding (on my own part, I 
suspect), so we might as well just fix it...

-pd



Re: [Rd] Error in unsplit() with tibbles

2020-11-21 Thread Mario Annau
Cool - thank you Peter!

@Marc: This is really not a tidyverse vs. base R debate, and I personally
think that the two should work together for the most part. The common
environment is still R. But just to give you the full picture, I also filed
a bug for tibble (https://github.com/tidyverse/tibble/issues/829). With
these two fixes, I think that split/unsplit would work for tibbles, and users
(like me) would not have to care which "environment" they are working
in.

Cheers,
Mario


-- 
Mario Annau
Founder and CEO
Quantargo

Tel: +43 1 348 44 55-11 | mario.an...@quantargo.com
www.quantargo.com



Re: [Rd] Error in unsplit() with tibbles

2020-11-21 Thread Marc Schwartz via R-devel
Hi,

Peter, thanks for the clarification.


Mario, I was not looking to debate the pros and cons of each environment, 
simply to point out that mutual compatibility cannot be expected in 
general, especially when third party authors can make structural changes 
to their objects over time that make them incompatible with base R 
functions, even if they are compatible today.

That is a key basis for third party packages offering specific class methods, 
whether S3 or S4, for object classes that are unique to their packages. That 
approach provides the obvious level of transparency.

For the tidyverse folks to offer a variant of split() and unsplit() that have 
specific methods for tibbles would seem entirely reasonable, presuming that 
they don't have a philosophical barrier to doing so, in deference to other 
approaches that do conform to their preferred function syntax.

Regards,

Marc




[Rd] .Internal(quit(...)): system call failed: Cannot allocate memory

2020-11-21 Thread Jan Gorecki
Dear R-developers,

Some of the larger scripts that I am running (50+ GB of memory used by R)
finish by quitting with q("no", status=0).
Quite often, extra stderr output is produced
at the very end, which looks like this:

Warning message:
In .Internal(quit(save, status, runLast)) :
  system call failed: Cannot allocate memory

Is there any way to avoid this kind of warning? I use stderr
output to detect failures in scripts, and this warning is a false
positive.

Maybe the quit function could wait a little longer while trying to allocate
before it raises this warning?

Best regards,
Jan Gorecki



Re: [Rd] .Internal(quit(...)): system call failed: Cannot allocate memory

2020-11-21 Thread Duncan Murdoch

I don't know what waiting would accomplish.  Generally speaking the 
allocation functions in R will try garbage collection before failing, so 
it looks like you are in a situation where there really is no memory 
available.  (I think code can prevent gc; maybe your code is doing that 
and not re-enabling it?)


Having a reproducible example would help, but I imagine it's not easy to 
put one together.


Duncan Murdoch



Re: [Rd] [External] Two ALTREP questions

2020-11-21 Thread luke-tierney

On Sat, 21 Nov 2020, Jiefei Wang wrote:

> Hello,
>
> I have two related ALTREP questions. First, there seems to be no way to
> assign attributes to an ALTREP vector without using C++ code. More
> specifically, I want to make an ALTREP matrix. I have tried the following R
> code, but none of these attempts work.
> ```
> .Internal(inspect(1:6))
> .Internal(inspect(matrix(1:6, 2,3)))
> .Internal(inspect(as.matrix(1:6)))
> .Internal(inspect(structure(1:6, dim = c(2L,3L))))
> .Internal(inspect({x <- 1:6;attr(x, "dim") <- c(2L,3L);x}))
> .Internal(inspect({x <- 1:6;attributes(x)<- list(dim = c(2L,3L));x}))
> ```


Some things that may help you:

- Try with 1:6 replaced by as.character(1:6), and look at the REF
  values in both cases.

- In particular, look at what this gives you:

x <- as.character(1:6)
attr(x, "dim") <- c(2, 3)

- Things can be a little different with larger vectors; try variants
  of your examples for more than 64 elements.
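
A minimal sketch of these experiments, assuming an R version with ALTREP support (3.5.0 or later). The variable `y` and the length 100 are illustrative choices, and the `inspect` output is deliberately not reproduced here because it varies between R versions and sessions.

```r
# Small compact integer sequence: note the REF field in the output.
.Internal(inspect(1:6))

# Character vector of the same length, before and after adding "dim".
x <- as.character(1:6)
.Internal(inspect(x))
attr(x, "dim") <- c(2, 3)
.Internal(inspect(x))

# A vector with more than 64 elements, as suggested above.
y <- as.character(1:100)
attr(y, "dim") <- c(10, 10)
.Internal(inspect(y))
```

Comparing the REF values and the reported representation across these calls shows where a copy or a wrapper was created.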


> This also brings
> my second question, it seems like the ALTREP coercion function does not
> handle attributes correctly.  After the coercion, the ALTREP object will
> lose its attributes.
> ```
> coerceFunc <- inline::cxxfunction( signature(x = "SEXP", attr = "SEXP" ) , '
> SET_ATTRIB(x,attr);
> return(Rf_coerceVector(x, REALSXP));
> ')
> > coerceFunc(1:6, pairlist(dim = c(2L, 3L)))
> [1] 1 2 3 4 5 6
> > coerceFunc(1:6 + 0L, pairlist(dim = c(2L, 3L)))
>      [,1] [,2] [,3]
> [1,]    1    3    5
> [2,]    2    4    6
> ```
> The problem is that the coercion function is directly dispatched to the
> user-defined ALTREP coercion function, so the user is responsible to attach
> the attributes after the coercion. If he forgets to do so, then the result
> is a plain vector. Similar to the `Duplicate` and `DuplicateEX` functions
> where the former one will attach the attributes by default, I feel that the
> `Coerce` function should only return a plain vector and there should be a
> `CoerceEx` function to do the attribute assignment, so the logic in the
> no-EX ALTREP functions can be consistent. I do not know how drastic the
> change would be, so maybe this is too hard to do.


Since you raised this earlier I have been looking at it and also think
that this needs to be handled along the lines of
Duplicate/DuplicateEx. I need to find some time to think that through
and implement it; hopefully I'll get to it before the end of the year.


> BTW, is there any way to contribute to the R source? I know R has limited
> resources, so if possible, I will be happy to fix the matrix issue myself
> and make some minor contributions to the R community.


You can find the suggested process for contributing described in the
'Reporting Bugs' link on the R home page https://www.r-project.org/

Best,

luke





--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu



Re: [Rd] [External] Two ALTREP questions

2020-11-21 Thread Jiefei Wang
Thanks to Dirk and Luke for the answers!

> (That's C code. The confusion here is partly our fault. When Romain and I
> extended the inline package with 'cxxfunction' to support the then-young
> but active Rcpp package, we picked C++. Strictly speaking that isn't required;
> you are only in C++ here because ... "it made sense to us 10 years ago" and
> it could be generalized to C++ or C.  All ALTREP is, of course, purely C as it
> is an R API.)


Sometimes I forget to distinguish C/C++ code. Yes, this should be C code
and it is just C++ compatible.
Anyway, `inline` is a great package, `cxxfunction` makes life much easier
for reporting the low-level problem to the devel team.

> - Try with 1:6 replaced by as.character(1:6), and look at the REF
>   values in both cases.
> - In particular, look at what this gives you:
>     x <- as.character(1:6)
>     attr(x, "dim") <- c(2, 3)
> - Things can be a little different with larger vectors; try variants
>   of your examples for more than 64 elements.


I see why we cannot change the attributes of the compact sequence (it is
shared, or at least marked as shared),
and why the wrapper is used for the large vector. Only the matrix function
needs to be patched.

> The generally recommended way is via a bug report at bugs.r-project.org


> You can find the suggested process for contributing described in the
> 'Reporting Bugs' link on the R home page https://www.r-project.org/


Bugzilla sounds like a good place to start. I will send an email to request
an account.

Best,
Jiefei

