Re: [Rd] RFC: Matrix package: Matrix products (%*%, crossprod, tcrossprod) involving "nsparseMatrix" aka sparse pattern matrices
We don't use the pattern matrices; nevertheless, the proposed changes
sound good to me.

I particularly like the suggestion to treat the matrices as numeric by
default, but provide simple ways to use boolean arithmetic instead. This
means that developers have access to both forms of arithmetic, and it
will be more obvious from the code which arithmetic is being used.

Best wishes,

Heather

On Thu, Mar 19, 2015, at 10:02 PM, Martin Maechler wrote:
> This is a Request For Comment, also BCCed to 390 package maintainers
> of reverse dependencies of the Matrix package.
>
> Most users and package authors working with our 'Matrix' package will
> be using it for numerical computations, and so will be using
> "dMatrix" (d : double precision) matrix objects M, and indirectly,
> e.g., for M >= c, will also use "lMatrix" (l: logical, i.e. TRUE/FALSE/NA).
> The following does **not** affect those numerical / logical
> computations.
>
> A few others will know that we also have "pattern" matrices (purely
> binary: TRUE/FALSE, no NA), notably sparse ones, those "ngCMatrix" etc,
> all starting with "n" (from ``patter[n]``), which do play a prominent
> role in the internal sparse matrix algorithms, notably of the
> (underlying C code) CHOLMOD library in the so-called "symbolic"
> Cholesky decomposition and other such operations. Another reason you
> may use them is that they are equivalent to incidence matrices of
> unweighted (directed or undirected) graphs.
>
> Now, as the subject says, I'm bringing up the topic of what should
> happen when these matrices appear in matrix multiplications.
> Somewhat by design, but also partly by coincidence, the *sparse*
> pattern matrix multiplication in the Matrix package mostly builds on
> the CHOLMOD library `cholmod_ssmult()` function, which implements
> "Boolean arithmetic" for them instead of regular arithmetic:
>   "+" is logical "or"
>   "*" is logical "and".
> Once we map TRUE <-> 1 and FALSE <-> 0, the only difference between
> boolean and regular arithmetic is that "1+1 = 1" in the (mapped)
> boolean arithmetic, because "TRUE | TRUE" is TRUE in original logic.
>
> The drawback of using the boolean arithmetic here is the "clash" with
> the usual numeric arithmetic, and with arithmetic in R, where logical is
> coerced to integer (and that to "double") when certain numerical
> functions/operations are used.
>
> A more severe problem --- which I had not been aware of until
> relatively recently -- is the fact that the CHOLMOD function
>     cholmod_ssmult(A, B)
> treats *both* A and B as "pattern" as soon as one of them is a
> (sparse) pattern matrix.
> And this is - I say - in clear contrast to what R users would expect:
> If you multiply a numeric with a "kind of logical" matrix (a pattern
> one), you will expect that the
> TRUE/FALSE matrix will be treated as a 1/0 matrix because it is
> combined with a numeric matrix.
> So we could say that in this case, the Matrix package behavior is
> clearly buggy, but still it has been the behavior for the last 10
> years or so.
>
> RFC 1: "Change 1":
> I currently propose to change this behavior for the upcoming release
> of Matrix (version 1.2-0), though I have no idea if dependent
> packages would partly fail their checks or otherwise have changed
> behavior subsequently.
> The change seems sensible, since I think if your package relied on
> this behavior, it was inadvertent and accidental.
> Still, you may differ in your opinion about this change no. 1.
>
> RFC 2: "Change 2":
> This change would be more radical, and something I would not plan for
> the upcoming release of Matrix, but possibly for an update say one or
> two months later or so: It concerns the matrix products when *both*
> matrices are pattern. A situation where the boolean arithmetic may
> really make sense and where indeed packages may have depended on the
> current behavior ("T + T |--> T"). ... although that is currently
> only used for *sparse* pattern matrices, not for dense ones.
>
> Further, it may still seem surprising that matrix multiplication does
> not behave numerically for a pair of such matrices, and by the
> principle of "least surprise" we should provide the boolean arithmetic
> matrix products in another way than by the standard %*%,
> crossprod() and tcrossprod() functions.
> So one possibility could be to change the standard functions to behave
> numerically,
> and e.g., use %&% (replace the numeric "*" by a logical "&") and
> crossprod(A,B, boolean=TRUE), tcrossprod(A,B, boolean=TRUE)
> for the three boolean arithmetic versions of matrix multiplication.
>
> What do you think about this? I'm particularly interested to hear
> from authors and users of packages such as 'arules' which IIRC
> explicitly work with sparse pattern matrices.
>
> Thank you for your thoughts and creative ideas,
> Martin Maechler, ETH Zurich

__
R-devel@r-project.org mailing list
https://stat.eth
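The difference between the two arithmetics described in the RFC can be sketched in base R alone. This is only an illustration with ordinary dense 0/1 matrices standing in for pattern matrices; it does not use the Matrix package or CHOLMOD, and the Boolean product is emulated here by thresholding the numeric product.

```r
# Two small 0/1 matrices standing in for pattern matrices
# (TRUE <-> 1, FALSE <-> 0).
A <- matrix(c(1, 1,
              0, 1), nrow = 2, byrow = TRUE)
B <- matrix(c(1, 0,
              1, 0), nrow = 2, byrow = TRUE)

# Regular arithmetic: entry [1, 1] is 1*1 + 1*1 = 2.
numeric_prod <- A %*% B

# Boolean arithmetic ("+" is "or", "*" is "and"): an entry is TRUE as soon
# as at least one term of the sum is non-zero, so "1 + 1 = 1".
boolean_prod <- (A %*% B) > 0

print(numeric_prod)
print(boolean_prod)
```

The two results have the same non-zero pattern; they differ only where more than one term contributes to an entry.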
Re: [Rd] RFC: Matrix package: Matrix products (%*%, crossprod, tcrossprod) involving "nsparseMatrix" aka sparse pattern matrices
> Trevor Hastie
>     on Thu, 19 Mar 2015 16:03:38 -0700 writes:

> Hi Martin
> I got stung by this last week.
> glmnet produces a coefficient matrix of class “dgCMatrix”
> If a predictor matrix was created using sparseMatrix as follows,
> one gets unexpected results, as this simple example shows.
> My fix was easy (I always convert the predictor matrix to class “dgCMatrix” now)
> Trevor

>> y=Matrix(diag(4))

Considerably faster (for larger n):  Diagonal(4)
If you want a sparse matrix directly, there are the .sparseDiagonal() and
.symDiagonal() functions.

>> y
> 4 x 4 diagonal matrix of class "ddiMatrix"
>      [,1] [,2] [,3] [,4]
> [1,]    1    .    .    .
> [2,]    .    1    .    .
> [3,]    .    .    1    .
> [4,]    .    .    .    1

There's no problem with 'y', which is a "diagonalMatrix" and only needs
O(n) storage, unlike diag(n), right?

>> z=sparseMatrix(1:4,1:4)
>> z
> 4 x 4 sparse Matrix of class "ngCMatrix"
> [1,] | . . .
> [2,] . | . .
> [3,] . . | .
> [4,] . . . |
>> beta=as(Matrix(1:4),"dgCMatrix")
>> y%*%beta
> 4 x 1 sparse Matrix of class "dgCMatrix"
> [1,] 1
> [2,] 2
> [3,] 3
> [4,] 4
>> z%*%beta
> 4 x 1 sparse Matrix of class "ngCMatrix"
> [1,] |
> [2,] |
> [3,] |
> [4,] |
>>

Yes, the last one is what I consider bogus.

Thank you, Trevor, for the feedback!
Martin

>> On Mar 19, 2015, at 3:02 PM, Martin Maechler wrote:
>> [...]
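Trevor's workaround (converting the pattern matrix to a numeric class before multiplying) might look roughly like the following. This is a hedged sketch, not code from the thread: it assumes the Matrix package is installed, uses as(z, "dMatrix") as one way to do the promotion, and builds beta with Matrix(1:4, sparse = TRUE) rather than the coercion shown in the original example.

```r
library(Matrix)

# Pattern matrix: sparseMatrix() without an 'x' argument stores only
# the positions of the non-zero entries.
z <- sparseMatrix(i = 1:4, j = 1:4)

# 4 x 1 numeric sparse coefficient vector.
beta <- Matrix(1:4, sparse = TRUE)

# Explicit promotion of the pattern matrix to a numeric 1/0 matrix,
# so the product no longer depends on how pattern operands are treated.
zd  <- as(z, "dMatrix")
res <- zd %*% beta

print(class(zd))
print(res)
```

With the explicit promotion, the product is computed with regular arithmetic regardless of the Matrix version's default for mixed pattern/numeric products.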
Re: [Rd] RFC: Matrix package: Matrix products (%*%, crossprod, tcrossprod) involving "nsparseMatrix" aka sparse pattern matrices
> "MH" == Michael Hahsler
>     on Thu, 19 Mar 2015 20:15:37 -0500 writes:

MH> Hi Martin,
MH> package arules heavily relies on ngCMatrix and uses multiplication and
MH> addition for logical operations. I think it makes sense that in a mixed
MH> operation with one dgCMatrix and one ngCMatrix the ngCMatrix should be
MH> "promoted" to a dgCMatrix.

MH> The current behavior of %*% and friends is indeed confusing:

>> m <- matrix(sample(c(0,1), 5*5, replace=TRUE), nrow=5)
>> x <- as(m, "dgCMatrix")
>> y <- as(m, "ngCMatrix")
>> x %*% y
MH> 5 x 5 sparse Matrix of class "ngCMatrix"
MH> [1,] | | | . |
MH> [2,] | | | . |
MH> [3,] . . | | .
MH> [4,] . . . | .
MH> [5,] | | | | |
>> x %*% x
MH> 5 x 5 sparse Matrix of class "dgCMatrix"
MH> [1,] 1 2 1 . 2
MH> [2,] 1 3 1 . 3
MH> [3,] . . 1 2 .
MH> [4,] . . . 1 .
MH> [5,] 1 2 2 1 2

Indeed, that is not what one should expect.

MH> We even explicitly coerce in our code ngCMatrix to dgCMatrix to avoid
MH> this behavior. I think all these operations probably should result
MH> consistently in a dgCMatrix.

Eventually. As I said, it *is* useful to work with boolean
arithmetic in some cases here, so I do want to provide that,
hopefully entirely consistently as well in the future, but in the
longer term not via '%*%'.

MH> I would love to see | and & for position-wise AND and OR for ngCMatrix.

Well, why don't you look? ;-)
These have worked for a long time already! (I checked a version from 2008.)

Thanks a lot, Michael, for your valuable feedback.
Martin

MH> Thanks,
MH> -Michael

MH> On 03/19/2015 05:02 PM, Martin Maechler wrote:
>> [...]
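The position-wise & and | that Martin says already work on pattern matrices can be tried with a small example. This is a sketch under the assumption that the Matrix package is installed; the two matrices are made up for illustration.

```r
library(Matrix)

# Two 2 x 2 pattern matrices ("ngCMatrix"), given only by the positions
# of their TRUE entries.
a <- sparseMatrix(i = c(1, 2), j = c(1, 2), dims = c(2, 2))  # diagonal
b <- sparseMatrix(i = c(1, 1), j = c(1, 2), dims = c(2, 2))  # first row

both   <- a & b   # element-wise AND: TRUE only where both are TRUE
either <- a | b   # element-wise OR: TRUE where at least one is TRUE

print(as.matrix(both))
print(as.matrix(either))
```

Note that these are element-wise operations, not matrix products, so they are unaffected by the question of which arithmetic %*% should use.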
Re: [Rd] CRAN binary, but no source
On 19/03/2015 19:26, Duncan Murdoch wrote:
On 19/03/2015 2:55 PM, peter dalgaard wrote:
> On 19 Mar 2015, at 19:45 , Gábor Csárdi wrote:
>
> On Thu, Mar 19, 2015 at 2:19 PM, Dan Tenenbaum
> wrote:
> [...]
>
>>
>> In github? ;-)
>>
>
> Well, that's the thing. If github/cran is a read-only mirror, then should I
> delete these versions from there, too? :) On CRAN not just the files are
> missing, but these versions are also missing from the RDS database. So they
> won't be coming back I assume?
> Perhaps you should stop guessing and start asking the CRAN maintainers? Hint: c...@r-project.org

I did: the problem was that the source had a license violation, so CRAN
can't keep it online. (It had some GPL-3 code, but was released under
GPL-2.)

Two places to look for such information on CRAN:

- Where a fair amount of information needs to be given (like exactly
  which versions have been removed), there may be a README in the Archive
  directory. E.g.
  http://cran.r-project.org/src/contrib/Archive/sdcTable/README , and I
  have added one for Rglpk.

- There is a DCF file http://cran.r-project.org/src/contrib/PACKAGES.in
  which acts as the database which R CMD check --as-cran consults. That
  contains concise records of archival and removal. Its coverage is
  pretty good for the last three years.

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
1 South Parks Road, Oxford OX1 3TG, UK
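Since PACKAGES.in is a DCF file, records of this kind can be read with base R's read.dcf(). The sketch below parses a made-up in-memory record instead of downloading the real file; the package name, the comment text and the choice of field names are assumptions for illustration only, not content taken from CRAN.

```r
# Made-up sample record in DCF format; the package name and the comment
# are hypothetical and only illustrate the "Field: value" layout.
dcf_text <- c(
  "Package: examplePkg",
  "X-CRAN-Comment: Archived on 2015-03-19 (hypothetical entry)."
)

# read.dcf() accepts a connection and returns a character matrix with
# one row per record and one column per field.
con <- textConnection(dcf_text)
db  <- read.dcf(con)
close(con)

print(colnames(db))
print(db[1, "Package"])
```

The same call with a file path or URL connection in place of the text connection would be the natural way to inspect the real file.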
Re: [Rd] CRAN binary, but no source
On Fri, Mar 20, 2015 at 11:03 AM, Prof Brian Ripley wrote:
[...]
>
> Two places to look for such information on CRAN:
>
> - Where a fair amount of information needs to be given (like exactly which
> versions have been removed), there may be a README in the Archive
> directory. E.g.
> http://cran.r-project.org/src/contrib/Archive/sdcTable/README ,
> and I have added one for Rglpk.
>
> - There is a DCF file http://cran.r-project.org/src/contrib/PACKAGES.in
> which acts as the database which R CMD check --as-cran consults. That
> contains concise records of archival and removal. Its coverage is pretty
> good for the last three years.

Ah, yes, I forgot about PACKAGES.in. (Which I actually parse in another
project.)

I guess I was mostly puzzled by the non-removal of the binary package.
(Which is just temporary, and it will be removed as soon as 0.6-0 is
available for that platform.)

Thanks to everyone who replied, including CRAN maintainers in private.

Gabor
[Rd] quieting the "apparent S3 methods" warning
Dear R-devel,

Recent versions of R CMD check have been flagging apparent S3 methods
that are not registered in the NAMESPACE as such. In most situations
this is very helpful. However, I have a few cases in existing packages
where we have unfortunately named functions using a "." in them that
makes them appear as S3 methods when they are not.

As there is no existing class corresponding to the last suffix of the
function, I could quiet the warning by registering the "fake" S3
function, but this seems contrary to the intent and not very
future-proof.

Is there a way to register methods as "non-S3 methods" so as to block
any potential S3 dispatch? Or is there any other way to quiet the
warning? In the long term, I imagine we can deprecate the function and
replace it with a better name?

best,
-skye
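Why a dotted name is treated as an "apparent S3 method" can be seen in a few lines of base R. The function and class names below are hypothetical; the sketch only shows that S3 dispatch goes by name, so a helper called summary.stats would be picked up for any object that happens to carry class "stats".

```r
# Hypothetical helper whose name merely contains a dot; it was never
# meant to be the "stats" method of the generic summary().
summary.stats <- function(x, ...) "helper called"

# An object that happens to have class "stats".
obj <- structure(list(), class = "stats")

# S3 dispatch looks for a function named <generic>.<class>, so the
# helper is found and called even though it was not written as a method.
res <- summary(obj)
print(res)
```

This accidental dispatch is what the check is warning about, and it is the reason a rename is the cleanest long-term fix.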
Re: [Rd] RFC: Matrix package: Matrix products (%*%, crossprod, tcrossprod) involving "nsparseMatrix" aka sparse pattern matrices
Hi Martin,

many thanks to you and Doug for providing the Matrix package in the
first place, and, second, for involving us in this decision.

I have only some minor comments to make:

+ wherever there is a usual function call involved, using an argument
  "boolean" as you proposed seems perfect to me

+ default behaviour and default values in function arguments should,
  even if buggy, stick to the old behaviour for backward compatibility
  right now, but you might still want to change this after a long enough
  announcement period

+ when it comes to arithmetic symbols, something like %&% certainly is
  nice to have, but the inadvertent user (like me, probably) would not
  know of this, unless this is documented at a prominent place

+ although this is against the functional paradigm of R, I would
  --exceptionally-- opt for a global option to change the behaviour
  (a) in function argument defaults and
  (b), more importantly, in binary arithmetic operators like %*%, *, +
  --- this way everybody can have the Matrix flavour he likes

just my 2c,

best regards, Peter

On 19.03.2015 at 23:02, Martin Maechler wrote:
> [...]
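Peter's idea of a global option that sets the default of a "boolean" argument could be sketched as follows. Everything here is hypothetical: mat_prod() and the option name demo.boolean.products are made up for illustration and are not part of the Matrix package, and plain dense matrices are used in place of pattern matrices.

```r
# Hypothetical wrapper: the default of 'boolean' is taken from a made-up
# global option, so users can switch arithmetic without touching calls.
mat_prod <- function(A, B,
                     boolean = getOption("demo.boolean.products", FALSE)) {
  P <- A %*% B
  if (boolean) P > 0 else P   # Boolean product emulated by thresholding
}

A <- matrix(c(1, 1, 0, 1), nrow = 2)   # columns (1, 1) and (0, 1)

default_res <- mat_prod(A, A)          # option unset: numeric product

old <- options(demo.boolean.products = TRUE)
option_res <- mat_prod(A, A)           # option set: Boolean product
options(old)                           # restore the previous setting

explicit_res <- mat_prod(A, A, boolean = TRUE)  # argument overrides option
```

The trade-off is the one Peter notes: a global option makes the result of the same call depend on session state, which is why an explicit argument or a separate operator is easier to reason about in package code.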
Re: [Rd] R with Array Hashes
On Fri, Mar 6, 2015 at 10:03 AM, Jeffrey Horner wrote:
> On Fri, Mar 6, 2015 at 9:36 AM, Dirk Eddelbuettel wrote:
[...]
>> When you asked about benchmark code on Twitter, I shared the somewhat
>> well-known (but no R ...) http://benchmarksgame.alioth.debian.org/
>> Did you write new benchmarks? Did you try the ones once assembled by Simon?

I added some runs of that benchmark which you can view here:

https://github.com/jeffreyhorner/R-Array-Hash/tree/master/benchmarks/runs

Scope out the ones that start with R-benchmark-25.*

R-Array-Hash and R-devel are very similar using ATLAS.

[...]
>> Dirk
>>
>> --
>> http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org