Re: [Rd] surprised matrix (1:256, 8, 8) doesn't cause error/warning
> Wolfgang Huber > on Sat, 6 Feb 2021 19:49:11 +0100 writes: > FWIW, I paste below a possible change to the warnings generating part of the do_matrix function in R/src/main/array.c that adds the kind of warning that Abby is asking for, and that IMHO would more often help users find bugs in their code than interfere with intended behaviour. Thank you, Wolfgang. Honestly, I had originally not wanted to get into this. Functions that have been in use for longer than R exists (namely in S / S+) without the need for changes is not something we'd typically easily consider for a change. {well, in the very old times, 0-extent matrices did not exist, so there were *some* changes during matrix()'s history ..} But I think Abby and you have a point here so I have been looking at your patch .. and from code reading had wondered about another behavior which *was* not quite consistent, and you eliminated completely with your patch: op <- options(warn=1) for(n in 0:2) { cat(n,":\n") ; print(matrix(seq_len(n), 0, 0)) } options(op) shows (in released R) 0 : <0 x 0 matrix> 1 : <0 x 0 matrix> 2 : Warning in matrix(seq_len(n), 0, 0) : data length exceeds size of matrix <0 x 0 matrix> > and it is really seems not logical that in matrix(x, 0,0), 'x' of size 1 and size (>=) 2 are treated differently. For consistency, size 1 "should" also warn (but read on!) After your patch, theres' no warning in any case .. which is consistent, within the matrix(x, 0,0) situation but not consistent with your proposal to warn more often when matrix(x, n,k) is "obviously" using inconsistent dimensions. When trying to let matrix(1, 0,0) also warn, I quickly found that this produces many new warnings in our (R base packages) examples and tests, notably from construction such as matrix(NA, 0, 3) or matrix(NA_character_, 0, 4) which would all have to be re-written as matrix(logical() , 0, 3) ormatrix(character(), 0, 4) The latter *is* more strictly self-consistent, but do we really want to impose such strictness ? My conclusion: Not at this moment OTOH, I'd re-add the warning for length(x) > 1 which has been there in the current code (but not your patch). {{No need to send another patch, I've changed too many small things already, for me to be useful}} This still needs adaption of one of the regression tests of R itself, and needs (at least) one of the tests of the Matrix package (warning turned into error, from options(warn = 2)). I'm willing to go that route, but I'm sure this will entail some work by other package authors, too (and hence CRAN maintainers etc). Opinions? >> matrix (1:6, nrow = 2, ncol = 3) >> matrix (1:12, nrow = 2, ncol = 3) > [,1] [,2] [,3] > [1,]135 > [2,]246 > Warning message: > In matrix(1:12, nrow = 2, ncol = 3) : > data length incompatible with size of matrix >> matrix (1:7, nrow = 2, ncol = 3) > Warning messages: > 1: In matrix(1:7, nrow = 2, ncol = 3) : > data length [7] is not a sub-multiple or multiple of the number of rows [2] > 2: In matrix(1:7, nrow = 2, ncol = 3) : > data length incompatible with size of matrix >> matrix (1:8, nrow = 2, ncol = 3) > Warning messages: > 1: In matrix(1:8, nrow = 2, ncol = 3) : > data length [8] is not a sub-multiple or multiple of the number of columns [3] > 2: In matrix(1:8, nrow = 2, ncol = 3) : > data length incompatible with size of matrix >> matrix (1:6, nrow = 0, ncol = 0) > <0 x 0 matrix> >> matrix (numeric(0), nrow = 2, ncol = 3) > [,1] [,2] [,3] > [1,] NA NA NA > [2,] NA NA NA >> matrix(1:2, ncol = 8) > [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] > [1,]12121212 > It would be nice to combine the new warning with that about “...not a sub-multiple or multiple…” into a single warning, if appropriate (as in two of the examples above), but that would require bigger surgery way above my payscale. I agree that in those cases it should only show one warning, and to keep things simple I'd say it should just be the 2nd of those above or more precisely (I have to check if that's correct) "data length larger than size of matrix" Martin > Kind regards > Wolfgang Huber > Index: array.c > === > --- array.c (revision 79951) > +++ array.c (working copy) > @@ -133,18 +133,19 @@ > nc = (int) ceil((double) lendat / (double) nr); > } > -if(lendat > 0) { > +if (lendat > 1) { > R_xlen_t nrc = (R_xlen_t) nr * nc; > - if (lendat > 1 && nrc % lendat != 0) { > + if ((nrc % lendat) != 0) { > if (((lendat > nr) && (lendat / nr) * nr != lendat) || > ((lendat < nr) && (nr / lendat) * lendat != nr)) > warning(_("data length [%d] is not a sub-multiple or multiple
Re: [Rd] surprised matrix (1:256, 8, 8) doesn't cause error/warning
Hi Martin Thank you! I very much understand your reservations and know it was a bit cheeky to poke. I agree that in those cases where my (naive) patch results in two warnings, keeping only the new one would better. No strong opinion about the case where either ncol or nrow is 0. Maybe a compromise would be to live with the inconsistency of only warning if length(data)>1, in order to allow legacy code such as matrix(NA, 0, 3). Assuming that the main objective is not consistency but more robust users’ R code. Btw if one were to refactor this properly (which I do not propose!), shouldn’t `matrix` just be a wrapper to `array`, whose added value is the inference of missing `nrow` and `ncol` values and dealing with `byrow`? Kind regards, and thanks again Wolfgang > Il giorno 9feb2021, alle ore 11:58, Martin Maechler > ha scritto: > >> Wolfgang Huber >>on Sat, 6 Feb 2021 19:49:11 +0100 writes: > >> FWIW, I paste below a possible change to the warnings generating part of the >> do_matrix function in R/src/main/array.c that adds the kind of warning that >> Abby is asking for, and that IMHO would more often help users find bugs in >> their code than interfere with intended behaviour. > > Thank you, Wolfgang. > Honestly, I had originally not wanted to get into this. > > Functions that have been in use for longer than R exists (namely > in S / S+) without the need for changes is not something we'd > typically easily consider for a change. > {well, in the very old times, 0-extent matrices did not exist, > so there were *some* changes during matrix()'s history ..} > > But I think Abby and you have a point here so I have been > looking at your patch .. and from code reading had wondered > about another behavior which *was* not quite consistent, > and you eliminated completely with your patch: > > op <- options(warn=1) > for(n in 0:2) { cat(n,":\n") ; print(matrix(seq_len(n), 0, 0)) } > options(op) > > shows (in released R) > > 0 : > <0 x 0 matrix> > 1 : > <0 x 0 matrix> > 2 : > Warning in matrix(seq_len(n), 0, 0) : > data length exceeds size of matrix > <0 x 0 matrix> >> > > and it is really seems not logical that in matrix(x, 0,0), > 'x' of size 1 and size (>=) 2 are treated differently. > For consistency, size 1 "should" also warn (but read on!) > > After your patch, theres' no warning in any case .. which is > consistent, within the matrix(x, 0,0) situation but not consistent > with your proposal to warn more often when matrix(x, n,k) is > "obviously" using inconsistent dimensions. > > When trying to let matrix(1, 0,0) also warn, > I quickly found that this produces many new warnings in our (R > base packages) examples and tests, notably from construction > such as > >matrix(NA, 0, 3) > or >matrix(NA_character_, 0, 4) > > which would all have to be re-written as > > matrix(logical() , 0, 3) > ormatrix(character(), 0, 4) > > The latter *is* more strictly self-consistent, but do we really > want to impose such strictness ? > > My conclusion: Not at this moment > > OTOH, I'd re-add the warning for length(x) > 1 which has > been there in the current code (but not your patch). > {{No need to send another patch, I've changed too many small > things already, for me to be useful}} > > This still needs adaption of one of the regression tests of R > itself, and needs (at least) one of the tests of the Matrix package > (warning turned into error, from options(warn = 2)). > > I'm willing to go that route, but I'm sure this will entail some > work by other package authors, too (and hence CRAN maintainers etc). > > Opinions? > > >>> matrix (1:6, nrow = 2, ncol = 3) > >>> matrix (1:12, nrow = 2, ncol = 3) >> [,1] [,2] [,3] >> [1,]135 >> [2,]246 >> Warning message: >> In matrix(1:12, nrow = 2, ncol = 3) : >> data length incompatible with size of matrix > >>> matrix (1:7, nrow = 2, ncol = 3) >> Warning messages: >> 1: In matrix(1:7, nrow = 2, ncol = 3) : >> data length [7] is not a sub-multiple or multiple of the number of rows [2] >> 2: In matrix(1:7, nrow = 2, ncol = 3) : >> data length incompatible with size of matrix > >>> matrix (1:8, nrow = 2, ncol = 3) >> Warning messages: >> 1: In matrix(1:8, nrow = 2, ncol = 3) : >> data length [8] is not a sub-multiple or multiple of the number of columns >> [3] >> 2: In matrix(1:8, nrow = 2, ncol = 3) : >> data length incompatible with size of matrix > >>> matrix (1:6, nrow = 0, ncol = 0) >> <0 x 0 matrix> >>> matrix (numeric(0), nrow = 2, ncol = 3) >> [,1] [,2] [,3] >> [1,] NA NA NA >> [2,] NA NA NA > >>> matrix(1:2, ncol = 8) >> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] >> [1,]12121212 > >> It would be nice to combine the new warning with that about “...not a >> sub-multiple or multiple…” into a single warning, if appropriate (as in two >> of the examples above), but that would requi
Re: [Rd] From .Fortran to .Call?
I had some time, so I updated a toy package I have for explaining R and Fortran use to use both the .Call and the .Fortran interfaces [1]. I think the actual Fortran code is as close to identical as I can reasonably make it. On my computer, the .Call interface (_f) is around 4 times as fast as the .Fortran interface (_f2). Bill, I don't know if you can, or should, "just" change .Fortran to .Call. You certainly cannot do the reverse. I think my source code made as minimal a change as possible; maybe that would help. --- set.seed(77) A <- runif(100, 0, 2000) microbenchmark(LLC_f(A, 500, 500), LLC_f2(A, 500, 500), times = 1L, control = list(order = 'block'), check = 'equal') Unit: nanoseconds expr min lq mean median uq max neval cld LLC_f(A, 500, 500) 700 702 799.5906801 802 6601 1 a LLC_f2(A, 500, 500) 3000 3101 3328.8712 3201 3401 19802 1 b Thanks, Avi [1] https://github.com/aadler/SimpFort On Sat, Dec 26, 2020 at 10:48 PM Avraham Adler wrote: > > I’ve tried recoding some of Delaporte to use the .Fortran interface and I > don’t know what I’m doing wrong but it either doesn’t work or crashes my R > instance completely. > > Avi > > On Sat, Dec 26, 2020 at 11:48 AM Koenker, Roger W > wrote: >> >> I’ve recoded a version of one of my quantile regression fitting functions to >> use .C64 from dotCall64 rather than .Fortran. >> For a moderately large problem with n = 500,000 and p = 5, and solving for >> 1:49/50 quantiles the new version shows >> a 3% speedup, although for smaller problems it is actually slower that the >> .Fortran version. So, I’m (provisionally) >> unimpressed by the claims that .Fortran has a big “overhead” performance >> penalty. Compared to the(more than) an order of >> magnitude (base 10) improvement that moving from R to fortran produces, 3% >> isn’t really worth the (admittedly) minimal >> additional coding effort. >> >> > On Dec 24, 2020, at 12:39 AM, Balasubramanian Narasimhan >> > wrote: >> > >> > Also, just came to know about dotcall64::.C64() (on CRAN) which allows for >> > Fortran to be called using .Call(). >> > >> > -Naras >> > >> > On 12/23/20 8:34 AM, Balasubramanian Narasimhan wrote: >> >> I think it should be pretty easy to fix up SUtools to use the .Call >> >> instead of .Fortran following along the lines of >> >> >> >> https://urldefense.com/v3/__https://github.com/wrathematics/Romp__;!!DZ3fjg!r3_sswU4ZHCe3huoGUy2boX-Vr7aUS-RaExyeh_Rsv8gvGiABcqzvOOKZinG4kC7RtA$ >> >> I too deal with a lot of f77 and so I will most likely finish it before >> >> the new year, if not earlier. (Would welcome testers besides myself.) >> >> >> >> Incidentally, any idea of what the performance hit is, quantitatively? I >> >> confess I never paid attention to it myself as most Fortran code I use >> >> seems pretty fast, i.e. glmnet. >> >> >> >> -Naras >> >> >> >> >> >> On 12/23/20 3:57 AM, Koenker, Roger W wrote: >> >>> Thanks to all and best wishes for a better 2021. >> >>> >> >>> Unfortunately I remain somewhat confused: >> >>> >> >>> o Bill reveals an elegant way to get from my rudimentary >> >>> registration setup to >> >>> one that would explicitly type the C interface functions, >> >>> >> >>> o Ivan seems to suggest that there would be no performance gain from >> >>> doing this. >> >>> >> >>> o Naras’s pcLasso package does use the explicit C typing, but then >> >>> uses .Fortran >> >>> not .Call. >> >>> >> >>> o Avi uses .Call and cites the Romp package >> >>> https://urldefense.com/v3/__https://github.com/wrathematics/Romp__;!!DZ3fjg!r3_sswU4ZHCe3huoGUy2boX-Vr7aUS-RaExyeh_Rsv8gvGiABcqzvOOKZinG4kC7RtA$ >> >>> where it is asserted that "there is a (nearly) deprecated interface >> >>> .Fortran() which you >> >>> should not use due to its large performance overhead.” >> >>> >> >>> As the proverbial naive R (ab)user I’m left wondering: >> >>> >> >>> o if I updated my quantreg_init.c file in accordance with Bill’s >> >>> suggestion could I >> >>> then simply change my .Fortran calls to .Call? >> >>> >> >>> o and if so, do I need to include ALL the fortran subroutines in my >> >>> src directory >> >>> or only the ones called from R? >> >>> >> >>> o and in either case could I really expect to see a significant >> >>> performance gain? >> >>> >> >>> Finally, perhaps I should stipulate that my fortran is strictly f77, so >> >>> no modern features >> >>> are in play, indeed most of the code is originally written in ratfor, >> >>> Brian Kernighan’s >> >>> dialect from ancient times at Bell Labs. >> >>> >> >>> Again, thanks to all for any advice, >> >>> Roger >> >>> >> >>> >> On Dec 23, 2020, at 1:11 AM, Avraham Adler >> wrote: >> >> Hello, Ivan. >> >> I used .Call instead of .Fortran in the Delaporte package [1]. What >> helped me out a lot was Drew Schmidt's Romp examples and descriptions >> [2]. If