Re: [Rd] surprised matrix (1:256, 8, 8) doesn't cause error/warning

2021-02-09 Thread Martin Maechler
> Wolfgang Huber 
> on Sat, 6 Feb 2021 19:49:11 +0100 writes:

> FWIW, I paste below a possible change to the warnings generating part of 
the do_matrix function in R/src/main/array.c that adds the kind of warning that 
Abby is asking for, and that IMHO would more often help users find bugs in 
their code than interfere with intended behaviour.

Thank you, Wolfgang.
Honestly,  I had originally not wanted to get into this.

Functions that have been in use for longer than R exists (namely
in S / S+) without the need for changes  is not something we'd
typically easily consider for a change.
{well, in the very old times,  0-extent matrices did not exist,
 so there were *some* changes during matrix()'s history ..}

But I think Abby and you have a point here so I have been
looking at your patch .. and from code reading had wondered
about another  behavior which *was* not quite consistent,
and you eliminated completely with your patch:

op <- options(warn=1)
for(n in 0:2) { cat(n,":\n") ; print(matrix(seq_len(n), 0, 0)) }
options(op)

shows (in released R)

0 :
<0 x 0 matrix>
1 :
<0 x 0 matrix>
2 :
Warning in matrix(seq_len(n), 0, 0) :
  data length exceeds size of matrix
<0 x 0 matrix>
> 

and it is really seems not logical that in matrix(x, 0,0),
 'x' of  size 1 and  size (>=) 2  are treated differently.
For consistency,  size 1  "should"  also warn (but read on!)

After your patch, theres' no warning in any case .. which is
consistent, within the  matrix(x, 0,0) situation  but not consistent
with your proposal to warn more often when  matrix(x, n,k)  is
"obviously" using inconsistent dimensions.

When trying to let   matrix(1, 0,0)  also warn,
I quickly found that this produces many new warnings in our (R
base packages) examples and tests, notably from construction
such as

 matrix(NA, 0, 3)
or
 matrix(NA_character_, 0, 4)

which would all have to be re-written as

  matrix(logical()  , 0, 3)
ormatrix(character(), 0, 4)

The latter *is* more strictly self-consistent, but do we really
want to impose such strictness ?

My conclusion:  Not at this moment

OTOH, I'd re-add the warning for  length(x) > 1  which has
been there in the current code (but not your patch).
{{No need to send another patch, I've changed too many small
  things already, for me to be useful}}

This still needs adaption of one of the regression tests of R
itself, and needs (at least) one of the tests of the Matrix package
(warning turned into error, from options(warn = 2)).

I'm willing to go that route, but I'm sure this will entail some
work by other package authors, too (and hence CRAN maintainers etc).

Opinions?


>> matrix (1:6, nrow = 2, ncol = 3)

>> matrix (1:12, nrow = 2, ncol = 3)
> [,1] [,2] [,3]
> [1,]135
> [2,]246
> Warning message:
> In matrix(1:12, nrow = 2, ncol = 3) :
> data length incompatible with size of matrix

>> matrix (1:7, nrow = 2, ncol = 3)
> Warning messages:
> 1: In matrix(1:7, nrow = 2, ncol = 3) :
> data length [7] is not a sub-multiple or multiple of the number of rows 
[2]
> 2: In matrix(1:7, nrow = 2, ncol = 3) :
> data length incompatible with size of matrix

>> matrix (1:8, nrow = 2, ncol = 3)
> Warning messages:
> 1: In matrix(1:8, nrow = 2, ncol = 3) :
> data length [8] is not a sub-multiple or multiple of the number of 
columns [3]
> 2: In matrix(1:8, nrow = 2, ncol = 3) :
> data length incompatible with size of matrix

>> matrix (1:6, nrow = 0, ncol = 0)
> <0 x 0 matrix>
>> matrix (numeric(0), nrow = 2, ncol = 3)
> [,1] [,2] [,3]
> [1,]   NA   NA   NA
> [2,]   NA   NA   NA

>> matrix(1:2, ncol = 8)
> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
> [1,]12121212

> It would be nice to combine the new warning with that about “...not a 
sub-multiple or multiple…” into a single warning, if appropriate (as in two of 
the examples above), but that would require bigger surgery way above my 
payscale.

I agree that in those cases it should only show one warning, and
to keep things simple I'd say it should just be the 2nd of those
above  or more precisely (I have to check if that's correct)

  "data length larger than size of matrix"

Martin

> Kind regards
> Wolfgang Huber


> Index: array.c
> ===
> --- array.c   (revision 79951)
> +++ array.c   (working copy)
> @@ -133,18 +133,19 @@
> nc = (int) ceil((double) lendat / (double) nr);
> }

> -if(lendat > 0) {
> +if (lendat > 1) {
> R_xlen_t nrc = (R_xlen_t) nr * nc;
> - if (lendat > 1 && nrc % lendat != 0) {
> + if ((nrc % lendat) != 0) {
> if (((lendat > nr) && (lendat / nr) * nr != lendat) ||
> ((lendat < nr) && (nr / lendat) * lendat != nr))
> warning(_("data length [%d] is not a sub-multiple or multiple 

Re: [Rd] surprised matrix (1:256, 8, 8) doesn't cause error/warning

2021-02-09 Thread Wolfgang Huber
Hi Martin

Thank you! I very much understand your reservations and know it was a bit 
cheeky to poke.

I agree that in those cases where my (naive) patch results in two warnings, 
keeping only the new one would better.
No strong opinion about the case where either ncol or nrow is 0. Maybe a 
compromise would be to live with the inconsistency of only warning if 
length(data)>1, in order to allow legacy code such as  matrix(NA, 0, 3). 
Assuming that the main objective is not consistency but more robust users’ R 
code.

Btw if one were to refactor this properly (which I do not propose!), shouldn’t 
`matrix` just be a wrapper to `array`, whose added value is the inference of 
missing `nrow` and `ncol` values and dealing with `byrow`?

Kind regards, and thanks again  
Wolfgang


> Il giorno 9feb2021, alle ore 11:58, Martin Maechler 
>  ha scritto:
> 
>> Wolfgang Huber 
>>on Sat, 6 Feb 2021 19:49:11 +0100 writes:
> 
>> FWIW, I paste below a possible change to the warnings generating part of the 
>> do_matrix function in R/src/main/array.c that adds the kind of warning that 
>> Abby is asking for, and that IMHO would more often help users find bugs in 
>> their code than interfere with intended behaviour.
> 
> Thank you, Wolfgang.
> Honestly,  I had originally not wanted to get into this.
> 
> Functions that have been in use for longer than R exists (namely
> in S / S+) without the need for changes  is not something we'd
> typically easily consider for a change.
> {well, in the very old times,  0-extent matrices did not exist,
> so there were *some* changes during matrix()'s history ..}
> 
> But I think Abby and you have a point here so I have been
> looking at your patch .. and from code reading had wondered
> about another  behavior which *was* not quite consistent,
> and you eliminated completely with your patch:
> 
> op <- options(warn=1)
> for(n in 0:2) { cat(n,":\n") ; print(matrix(seq_len(n), 0, 0)) }
> options(op)
> 
> shows (in released R)
> 
> 0 :
> <0 x 0 matrix>
> 1 :
> <0 x 0 matrix>
> 2 :
> Warning in matrix(seq_len(n), 0, 0) :
>  data length exceeds size of matrix
> <0 x 0 matrix>
>> 
> 
> and it is really seems not logical that in matrix(x, 0,0),
> 'x' of  size 1 and  size (>=) 2  are treated differently.
> For consistency,  size 1  "should"  also warn (but read on!)
> 
> After your patch, theres' no warning in any case .. which is
> consistent, within the  matrix(x, 0,0) situation  but not consistent
> with your proposal to warn more often when  matrix(x, n,k)  is
> "obviously" using inconsistent dimensions.
> 
> When trying to let   matrix(1, 0,0)  also warn,
> I quickly found that this produces many new warnings in our (R
> base packages) examples and tests, notably from construction
> such as
> 
>matrix(NA, 0, 3)
> or
>matrix(NA_character_, 0, 4)
> 
> which would all have to be re-written as
> 
>  matrix(logical()  , 0, 3)
> ormatrix(character(), 0, 4)
> 
> The latter *is* more strictly self-consistent, but do we really
> want to impose such strictness ?
> 
> My conclusion:  Not at this moment
> 
> OTOH, I'd re-add the warning for  length(x) > 1  which has
> been there in the current code (but not your patch).
> {{No need to send another patch, I've changed too many small
>  things already, for me to be useful}}
> 
> This still needs adaption of one of the regression tests of R
> itself, and needs (at least) one of the tests of the Matrix package
> (warning turned into error, from options(warn = 2)).
> 
> I'm willing to go that route, but I'm sure this will entail some
> work by other package authors, too (and hence CRAN maintainers etc).
> 
> Opinions?
> 
> 
>>> matrix (1:6, nrow = 2, ncol = 3)
> 
>>> matrix (1:12, nrow = 2, ncol = 3)
>> [,1] [,2] [,3]
>> [1,]135
>> [2,]246
>> Warning message:
>> In matrix(1:12, nrow = 2, ncol = 3) :
>> data length incompatible with size of matrix
> 
>>> matrix (1:7, nrow = 2, ncol = 3)
>> Warning messages:
>> 1: In matrix(1:7, nrow = 2, ncol = 3) :
>> data length [7] is not a sub-multiple or multiple of the number of rows [2]
>> 2: In matrix(1:7, nrow = 2, ncol = 3) :
>> data length incompatible with size of matrix
> 
>>> matrix (1:8, nrow = 2, ncol = 3)
>> Warning messages:
>> 1: In matrix(1:8, nrow = 2, ncol = 3) :
>> data length [8] is not a sub-multiple or multiple of the number of columns 
>> [3]
>> 2: In matrix(1:8, nrow = 2, ncol = 3) :
>> data length incompatible with size of matrix
> 
>>> matrix (1:6, nrow = 0, ncol = 0)
>> <0 x 0 matrix>
>>> matrix (numeric(0), nrow = 2, ncol = 3)
>> [,1] [,2] [,3]
>> [1,]   NA   NA   NA
>> [2,]   NA   NA   NA
> 
>>> matrix(1:2, ncol = 8)
>> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
>> [1,]12121212
> 
>> It would be nice to combine the new warning with that about “...not a 
>> sub-multiple or multiple…” into a single warning, if appropriate (as in two 
>> of the examples above), but that would requi

Re: [Rd] From .Fortran to .Call?

2021-02-09 Thread Avraham Adler
I had some time, so I updated a toy package I have for explaining R
and Fortran use to use both the .Call and the .Fortran interfaces [1].
I think the actual Fortran code is as close to identical as I can
reasonably make it. On my computer, the .Call interface (_f) is around
4 times as fast as the .Fortran interface (_f2).

Bill, I don't know if you can, or should, "just" change .Fortran to
.Call. You certainly cannot do the reverse. I think my source code
made as minimal a change as possible; maybe that would help.

---
set.seed(77)
A <- runif(100, 0, 2000)
microbenchmark(LLC_f(A, 500, 500), LLC_f2(A, 500, 500), times =
1L, control = list(order = 'block'), check = 'equal')
Unit: nanoseconds
expr  min   lq  mean median   uq   max neval cld
  LLC_f(A, 500, 500)  700  702  799.5906801  802  6601 1  a
 LLC_f2(A, 500, 500) 3000 3101 3328.8712   3201 3401 19802 1   b


Thanks,

Avi

[1] https://github.com/aadler/SimpFort


On Sat, Dec 26, 2020 at 10:48 PM Avraham Adler  wrote:
>
> I’ve tried recoding some of Delaporte to use the .Fortran interface and I 
> don’t know what I’m doing wrong but it either doesn’t work or crashes my R 
> instance completely.
>
> Avi
>
> On Sat, Dec 26, 2020 at 11:48 AM Koenker, Roger W  
> wrote:
>>
>> I’ve recoded a version of one of my quantile regression fitting functions to 
>> use .C64 from dotCall64 rather than .Fortran.
>> For a moderately large problem with n = 500,000 and p = 5, and solving for  
>> 1:49/50 quantiles the new version shows
>> a 3% speedup, although for smaller problems it is actually slower that the 
>> .Fortran version.  So, I’m (provisionally)
>> unimpressed by the claims that .Fortran has a big “overhead” performance 
>> penalty.  Compared to the(more than) an order of
>> magnitude (base 10) improvement that moving from R to fortran produces,  3% 
>> isn’t really worth the (admittedly) minimal
>> additional coding effort.
>>
>> > On Dec 24, 2020, at 12:39 AM, Balasubramanian Narasimhan 
>> >  wrote:
>> >
>> > Also, just came to know about dotcall64::.C64() (on CRAN) which allows for 
>> > Fortran to be called using .Call().
>> >
>> > -Naras
>> >
>> > On 12/23/20 8:34 AM, Balasubramanian Narasimhan wrote:
>> >> I think it should be pretty easy to fix up SUtools to use the .Call 
>> >> instead of .Fortran following along the lines of
>> >>
>> >> https://urldefense.com/v3/__https://github.com/wrathematics/Romp__;!!DZ3fjg!r3_sswU4ZHCe3huoGUy2boX-Vr7aUS-RaExyeh_Rsv8gvGiABcqzvOOKZinG4kC7RtA$
>> >> I too deal with a lot of f77 and so I will most likely finish it before 
>> >> the new year, if not earlier. (Would welcome testers besides myself.)
>> >>
>> >> Incidentally, any idea of what the performance hit is, quantitatively? I 
>> >> confess I never paid attention to it myself as most Fortran code I use 
>> >> seems pretty fast, i.e. glmnet.
>> >>
>> >> -Naras
>> >>
>> >>
>> >> On 12/23/20 3:57 AM, Koenker, Roger W wrote:
>> >>> Thanks to all and best wishes for a better 2021.
>> >>>
>> >>> Unfortunately I remain somewhat confused:
>> >>>
>> >>> o  Bill reveals an elegant way to get from my rudimentary 
>> >>> registration setup to
>> >>> one that would explicitly type the C interface functions,
>> >>>
>> >>> o Ivan seems to suggest that there would be no performance gain from 
>> >>> doing this.
>> >>>
>> >>> o  Naras’s pcLasso package does use the explicit C typing, but then 
>> >>> uses .Fortran
>> >>> not .Call.
>> >>>
>> >>> o  Avi uses .Call and cites the Romp package 
>> >>> https://urldefense.com/v3/__https://github.com/wrathematics/Romp__;!!DZ3fjg!r3_sswU4ZHCe3huoGUy2boX-Vr7aUS-RaExyeh_Rsv8gvGiABcqzvOOKZinG4kC7RtA$
>> >>>  where it is asserted that "there is a (nearly) deprecated interface 
>> >>> .Fortran() which you
>> >>> should not use due to its large performance overhead.”
>> >>>
>> >>> As the proverbial naive R (ab)user I’m left wondering:
>> >>>
>> >>> o  if I updated my quantreg_init.c file in accordance with Bill’s 
>> >>> suggestion could I
>> >>> then simply change my .Fortran calls to .Call?
>> >>>
>> >>> o  and if so, do I need to include ALL the fortran subroutines in my 
>> >>> src directory
>> >>> or only the ones called from R?
>> >>>
>> >>> o  and in either case could I really expect to see a significant 
>> >>> performance gain?
>> >>>
>> >>> Finally, perhaps I should stipulate that my fortran is strictly f77, so 
>> >>> no modern features
>> >>> are in play, indeed most of the code is originally written in ratfor, 
>> >>> Brian Kernighan’s
>> >>> dialect from ancient times at Bell Labs.
>> >>>
>> >>> Again,  thanks to all for any advice,
>> >>> Roger
>> >>>
>> >>>
>>  On Dec 23, 2020, at 1:11 AM, Avraham Adler  
>>  wrote:
>> 
>>  Hello, Ivan.
>> 
>>  I used .Call instead of .Fortran in the Delaporte package [1]. What
>>  helped me out a lot was Drew Schmidt's Romp examples and descriptions
>>  [2]. If