Re: [Rd] Windows gcc toolchain for R 3.2.0

2015-03-19 Thread Duncan Murdoch
I have updated and moved the notes on the new toolchain.  Their URL is

https://rawgit.com/kevinushey/RToolsToolchainUpdate/master/mingwnotes.html

Thanks to Kevin for setting this up.  Anyone who can solve the problems
on that page, or who finds a new problem, please get in contact with us
by email or on Github.

Duncan Murdoch

On 18/03/2015 9:27 AM, Duncan Murdoch wrote:
> To anyone following the Windows toolchain saga:
> 
> The gcc 4.9.2 toolchain that is currently in Rtools33 has too many 
> incompatibilities with existing code, so we won't be using it in the R 
> 3.2.0 build.  I will soon be uploading to CRAN a new version of Rtools33 
> that is very similar to Rtools32, containing gcc 4.6.3.
> 
> We are continuing to work on the new toolchain, and hope to have it 
> ready before R 3.2.1 is released.
> 
> The known problems are as follows:
> 
>   - C++ code should not call Rf_error(), as it uses longjmp, and the 
> behaviour of longjmp is undefined in C++ when destructors need to be 
> called.  However, a number of packages do call Rf_error, and in gcc 
> 4.6.3, they get away with it.  In our candidate 4.9.2 build, they 
> crashed.  If we can't work around this, I'll suggest that we test for 
> the presence of Rf_error in C++ code, and start issuing warnings or 
> errors when it is seen.  But before we do that, we need a solid replacement.
> 
>   - There are some other crashes that appear to be unrelated, also with 
> C++ code.
> 
>   - There are some subtle differences in arithmetic that result in tests 
> failing.  These may be due to bugs in MinGW-w64 code,
> or may be unavoidable.
> 
> Duncan Murdoch
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] nls

2015-03-19 Thread Prof J C Nash (U30A)
nls() is using
1) only a Gauss-Newton code which is prone to some glitches
2) approximate derivatives

Package nlmrt uses symbolic derivatives for expressions (you have to
provide Jacobian code for R functions) and an aggressive Marquardt
method to try to reduce the sum of squares. It does return more
information about the problem (singular values of the final Jacobian
and gradient at the proposed solution) but does NOT return the nls
structured object. And it will usually take more time and computing
effort because it tries hard to reduce the SS.
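
To illustrate the two points above with base R only: deriv() can supply
analytic derivatives to nls(), and nls.control() can relax the step-size
limit behind the 'minFactor' error. This is a minimal sketch on simulated
data. The data, the true parameter values, the starting values, and the
helper name occ_model are all assumptions, since the original data were
not posted.

```r
# Model from the thread:
#   Occupancy ~ 1 - theta * beta^(2 * Resolution^(1/2)) * delta^Resolution
# deriv() builds a function that returns the fitted values together with an
# analytic "gradient" attribute, which nls() uses in place of numerical
# differences.
occ_model <- deriv(
  ~ 1 - theta * beta^(2 * Resolution^(1/2)) * delta^Resolution,
  namevec      = c("theta", "beta", "delta"),
  function.arg = c("theta", "beta", "delta", "Resolution")
)

# Simulated data (an assumption for illustration, not the poster's data).
set.seed(42)
Resolution <- seq(0.5, 10, by = 0.5)
Occupancy  <- as.vector(occ_model(0.8, 0.8, 0.9, Resolution)) +
  rnorm(length(Resolution), sd = 0.005)

# A smaller minFactor lets the step be halved more often before giving up;
# warnOnly = TRUE returns the last iterate with a warning instead of an error.
fit <- nls(
  Occupancy ~ occ_model(theta, beta, delta, Resolution),
  start   = list(theta = 0.7, beta = 0.7, delta = 0.95),
  control = nls.control(maxiter = 200, minFactor = 1e-10, warnOnly = TRUE)
)
print(coef(fit))
```

With warnOnly = TRUE the returned object can still be inspected when the
fit stalls, which is often more informative than the error alone.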

A reproducible example would get you a more informed response.

John Nash


On 15-03-19 07:00 AM, r-devel-requ...@r-project.org wrote:
> Date: Wed, 18 Mar 2015 14:14:12 +0200
> From: Evans Otieno Ochiaga 
> To: r-devel@r-project.org
> Subject: [Rd] Help
> Message-ID:
>   
> Content-Type: text/plain; charset="UTF-8"
> 
> Hi to All,
> 
> I am fitting some models to data using nonlinear least squares. Whenever
> I run the command, the parameter values appear to converge, but I get
> the error shown below. How can I fix this problem?
> 
> 
> Convergence of parameter values
> 
> 0.2390121 :  0.1952981 0.975 1.000
> 0.03716107 :  0.1553976 0.910 1.000
> 0.009478433 :  0.2011017 0.798 1.000
> 0.004108196 :  0.2640111 0.693 1.000
> 0.003705189 :  0.2938360 0.652 1.000
> 0.003702546 :  0.2965745 0.650 1.000
> 0.003702546 :  0.2965898 0.650 1.000
> 0.003702546 :  0.2965898 0.650 1.000
> 0.003702546 :  0.2965898 0.650 1.000
> 
> Error in nls(Occupancy ~ 1 - (theta * beta^(2 * Resolution^(1/2)) *
> delta^Resolution),  :
>   step factor 0.000488281 reduced below 'minFactor' of 0.000976562
> 
> Regards,
> 
> 
> 
> 
> *Evans Ochiaga*



[Rd] CRAN binary, but no source

2015-03-19 Thread Gábor Csárdi
Hi All,

this is a CRAN question, so I am sorry if this is not the appropriate forum.

I noticed that there is at least one CRAN package that has a binary (OSX
Mavericks) for a version, that does not have any source package on CRAN. Or
at least I am unable to locate it. The package is Rglpk:
http://cran.r-project.org/web/packages/Rglpk/index.html

It offers a binary for 0.5-2, but there is no 0.5-2 source package
anywhere. Is this simply a mistake, or is it OK to have binary-only
packages on CRAN?

Thanks much, Best,
Gabor

[[alternative HTML version deleted]]



Re: [Rd] CRAN binary, but no source

2015-03-19 Thread John McKown
On Thu, Mar 19, 2015 at 10:46 AM, Gábor Csárdi  wrote:
> Hi All,
>
> this is a CRAN question, so I am sorry if this is not the appropriate forum.
>
> I noticed that there is at least one CRAN package that has a binary (OSX
> Mavericks) for a version, that does not have any source package on CRAN. Or
> at least I am unable to locate it. The package is Rglpk:
> http://cran.r-project.org/web/packages/Rglpk/index.html

I went there and saw a source package:



Downloads:

Reference manual:            Rglpk.pdf
Package source:              Rglpk_0.6-0.tar.gz
Windows binaries:            r-devel: Rglpk_0.6-0.zip, r-release: Rglpk_0.6-0.zip,
                             r-oldrel: Rglpk_0.6-0.zip
OS X Snow Leopard binaries:  r-release: Rglpk_0.6-0.tgz, r-oldrel: Rglpk_0.6-0.tgz
OS X Mavericks binaries:     r-release: Rglpk_0.5-2.tgz
Old sources:                 Rglpk archive


Or: http://cran.r-project.org/src/contrib/Rglpk_0.6-0.tar.gz

>
> It offers a binary for 0.5-2, but there is no 0.5-2 source package
> anywhere. Is this simply a mistake, or is it OK to have binary-only
> packages on CRAN?
>
> Thanks much, Best,
> Gabor
>
> [[alternative HTML version deleted]]

Please, no HTML, per forum rules.


-- 
If you sent twitter messages while exploring, are you on a textpedition?

He's about as useful as a wax frying pan.

10 to the 12th power microphones = 1 Megaphone

Maranatha! <><
John McKown



Re: [Rd] CRAN binary, but no source

2015-03-19 Thread John McKown
On Thu, Mar 19, 2015 at 10:54 AM, John McKown
 wrote:
> On Thu, Mar 19, 2015 at 10:46 AM, Gábor Csárdi  wrote:
>> Hi All,
>>
>> this is a CRAN question, so I am sorry if this is not the appropriate forum.
>>
>> I noticed that there is at least one CRAN package that has a binary (OSX
>> Mavericks) for a version, that does not have any source package on CRAN. Or
>> at least I am unable to locate it. The package is Rglpk:
>> http://cran.r-project.org/web/packages/Rglpk/index.html
>
> I went there and saw a source package:
>
> [...]
>
>>
>> It offers a binary for 0.5-2, but there is no 0.5-2 source package
>> anywhere. Is this simply a mistake, or is it OK to have binary-only
>> packages on CRAN?

OOPS, I saw the 0.6 package source, not 0.5. My mistake. Why not
recompile? Do you require 0.5 for some reason? I would guess that CRAN
requires only the _current_ source, not _every_ source. And
http://cran.r-project.org/src/contrib/Archive/Rglpk/ only has the
source archived for 0.4-1

>>
>> Thanks much, Best,
>> Gabor



-- 
If you sent twitter messages while exploring, are you on a textpedition?

He's about as useful as a wax frying pan.

10 to the 12th power microphones = 1 Megaphone

Maranatha! <><
John McKown



Re: [Rd] CRAN binary, but no source

2015-03-19 Thread Gábor Csárdi
On Thu, Mar 19, 2015 at 11:54 AM, John McKown 
wrote:

> On Thu, Mar 19, 2015 at 10:46 AM, Gábor Csárdi 
> wrote:
>
[...]

> > http://cran.r-project.org/web/packages/Rglpk/index.html
>
> I went there and saw a source package:
>
> [...]


Yes, sorry, what I meant is that there is no source package for
version 0.5-2 here:
http://cran.r-project.org/src/contrib/Archive/Rglpk/

I guess it was accidentally deleted, because the CRAN@github mirror has it:
https://github.com/cran/Rglpk/commits/master

Gabor



Re: [Rd] CRAN binary, but no source

2015-03-19 Thread Gábor Csárdi
On Thu, Mar 19, 2015 at 11:59 AM, John McKown 
wrote:
[...]
>
> OOPS, I saw the 0.6 package source, not 0.5. My mistake. Why not
> recompile? Do you require 0.5 for some reason? I would guess that CRAN
> requires only the _current_ source, not _every_ source.


Well, it seems to me that for the OSX Mavericks platform 0.5-2 is the
current version.

Gabor



Re: [Rd] CRAN binary, but no source

2015-03-19 Thread Dan Tenenbaum


----- Original Message -----
> From: "Gábor Csárdi" 
> To: "John McKown" 
> Cc: r-devel@r-project.org
> Sent: Thursday, March 19, 2015 9:03:37 AM
> Subject: Re: [Rd] CRAN binary, but no source
> 
> On Thu, Mar 19, 2015 at 11:59 AM, John McKown
> 
> wrote:
> [...]
> >
> > OOPS, I saw the 0.6 package source, not 0.5. My mistake. Why not
> > recompile? Do you require 0.5 for some reason? I would guess that
> > CRAN
> > requires only the _current_ source, not _every_ source.
> 
> 
> Well, it seems to me that for the OSX Mavericks platform 0.5-2 is the
> current version.


Because the latest version failed to build on Mavericks:

http://www.r-project.org/nosvn/R.check/r-release-osx-x86_64-mavericks/Rglpk-00install.html

Possibly because a system requirement is not installed.

Dan




Re: [Rd] CRAN binary, but no source

2015-03-19 Thread Gábor Csárdi
On Thu, Mar 19, 2015 at 2:04 PM, Dan Tenenbaum 
wrote:
[...]

>
> Because the latest version failed to build on Mavericks:
>
>
> http://www.r-project.org/nosvn/R.check/r-release-osx-x86_64-mavericks/Rglpk-00install.html
>
> Possibly because a system requirement is not installed.
>

Thanks, indeed.

My question is not "why do we have 0.5-2 for OSX Mavericks?", but rather
"where is the source code of Rglpk-0.5-2?"

Sorry for not making it clear.

Gabor




Re: [Rd] CRAN binary, but no source

2015-03-19 Thread Dan Tenenbaum


----- Original Message -----
> From: "Gábor Csárdi" 
> To: "Dan Tenenbaum" 
> Cc: r-devel@r-project.org, "John McKown" 
> Sent: Thursday, March 19, 2015 11:15:47 AM
> Subject: Re: [Rd] CRAN binary, but no source
> 
> 
> 
> 
> On Thu, Mar 19, 2015 at 2:04 PM, Dan Tenenbaum <
> dtene...@fredhutch.org > wrote:
> [...]
> 
> 
> 
> 
> 
> Because the latest version failed to build on Mavericks:
> 
> http://www.r-project.org/nosvn/R.check/r-release-osx-x86_64-mavericks/Rglpk-00install.html
> 
> Possibly because a system requirement is not installed.
> 
> 
> 
> Thanks, indeed.
> 
> 
> My question is not "why do we have 0.5-2 for OSX Mavericks?", but
> rather "where is the source code of Rglpk-0.5-2?"
> 

In github? ;-)

Dan




Re: [Rd] CRAN binary, but no source

2015-03-19 Thread Gábor Csárdi
On Thu, Mar 19, 2015 at 2:19 PM, Dan Tenenbaum 
wrote:
[...]

>
> In github? ;-)
>

Well, that's the thing. If github/cran is a read-only mirror, then should I
delete these versions from there, too? :) On CRAN not just the files are
missing, but these versions are also missing from the RDS database. So they
won't be coming back I assume?

Gabor



Re: [Rd] CRAN binary, but no source

2015-03-19 Thread peter dalgaard

> On 19 Mar 2015, at 19:45 , Gábor Csárdi  wrote:
> 
> On Thu, Mar 19, 2015 at 2:19 PM, Dan Tenenbaum 
> wrote:
> [...]
> 
>> 
>> In github? ;-)
>> 
> 
> Well, that's the thing. If github/cran is a read-only mirror, then should I
> delete these versions from there, too? :) On CRAN not just the files are
> missing, but these versions are also missing from the RDS database. So they
> won't be coming back I assume?
> 

Perhaps you should stop guessing and start asking the CRAN maintainers? Hint: 
c...@r-project.org

-pd

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] CRAN binary, but no source

2015-03-19 Thread Duncan Murdoch

On 19/03/2015 2:55 PM, peter dalgaard wrote:

>> On 19 Mar 2015, at 19:45 , Gábor Csárdi  wrote:
>>
>> On Thu, Mar 19, 2015 at 2:19 PM, Dan Tenenbaum
>> wrote:
>> [...]
>>
>>> In github? ;-)
>>
>> Well, that's the thing. If github/cran is a read-only mirror, then should I
>> delete these versions from there, too? :) On CRAN not just the files are
>> missing, but these versions are also missing from the RDS database. So they
>> won't be coming back I assume?
>
> Perhaps you should stop guessing and start asking the CRAN maintainers? Hint:
> c...@r-project.org



I did: the problem was that the source had a license violation, so CRAN 
can't keep it online.  (It had some GPL-3 code, but was released under 
GPL-2.)


Duncan Murdoch



[Rd] RFC: Matrix package: Matrix products (%*%, crossprod, tcrossprod) involving "nsparseMatrix" aka sparse pattern matrices

2015-03-19 Thread Martin Maechler
This is a Request For Comment, also BCCed to 390 package maintainers
of reverse dependencies of the Matrix package.

Most users and package authors working with our 'Matrix' package will
be using it for numerical computations, and so will be using
"dMatrix" (d : double precision) matrix objects  M,   and indirectly, e.g., for
M >= c  will also use "lMatrix" (l: logical i.e.  TRUE/FALSE/NA).
All the following is  **not** affecting those numerical / logical
computations.

A few others will know that we also have "pattern" matrices (purely
binary: TRUE/FALSE, no NA), notably sparse ones, those "ngCMatrix" etc.,
all starting with "n" (from ``patter[n]``), which play a prominent
role in the internal sparse matrix algorithms, notably of the
(underlying C code) CHOLMOD library in the so-called "symbolic"
Cholesky decomposition and other such operations. Another reason you
may use them is that they are equivalent to incidence matrices of
unweighted (directed or undirected) graphs.

Now, as the subject says, I'm bringing up the topic of what should
happen when these matrices appear in matrix multiplications.
Somewhat by design, but also partly by coincidence,  the *sparse*
pattern matrices multiplication in the Matrix package mostly builds on
the CHOLMOD library `cholmod_ssmult()` function which implements
"Boolean arithmetic" for them, instead of regular arithmetic:
 "+" is logical "or"
 "*" is  logical "and".
Once we map  TRUE <-> 1  and  FALSE <-> 0, the only difference between
boolean and regular arithmetic is that "1+1 = 1" in the (mapped)
boolean arithmetic, because  "TRUE | TRUE" is TRUE in original logic.
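
The difference can be seen with ordinary base R matrices. The sketch
below uses a small made-up 0/1 matrix and computes the boolean product
entry by entry with OR and AND. It does not call CHOLMOD or the Matrix
package.

```r
# A small 0/1 matrix (made up for illustration).
A <- matrix(c(1, 1, 0,
              0, 1, 1,
              1, 0, 1), nrow = 3, byrow = TRUE)

# Regular arithmetic: entry (i, j) counts the k with A[i, k] = A[k, j] = 1.
numeric_prod <- A %*% A

# Boolean arithmetic: entry (i, j) is TRUE if at least one such k exists,
# i.e. "+" acts as OR and "*" acts as AND, so 1 + 1 collapses to 1.
L <- A != 0
boolean_prod <- matrix(FALSE, nrow(L), ncol(L))
for (i in seq_len(nrow(L))) {
  for (j in seq_len(ncol(L))) {
    boolean_prod[i, j] <- any(L[i, ] & L[, j])
  }
}

# Both agree on which entries are nonzero; they differ only where the
# numeric count exceeds 1.
print(numeric_prod)
print(boolean_prod)
```

Thresholding the numeric product, (A %*% A) > 0, gives the same pattern
for nonnegative entries, which is why the two arithmetics are easy to
confuse.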

The drawback of using the boolean arithmetic here is the "clash" with
the usual numeric arithmetic, and arithmetic in R where logical is
coerced to integer (and that to "double") when certain numerical
functions/operations are used.

A more severe problem, which I had not been aware of until
relatively recently, is the fact that the CHOLMOD function
cholmod_ssmult(A, B)
treats *both* A and B as "pattern" as soon as one of them is a
(sparse) pattern matrix.
And this is, I say, in clear contrast to what R users would expect:
If you multiply a numeric with a "kind of logical" matrix (a pattern
one), you will expect that the
TRUE/FALSE matrix will be treated as a 1/0 matrix because it is
combined with a numeric matrix.
So we could say that in this case, the Matrix package behavior is
clearly buggy, but still it has been the behavior for the last 10
years or so.

RFC 1: "Change 1":
I currently propose to change this behavior for the upcoming release
of Matrix (version 1.2-0),  though I have no idea if dependent
packages would partly fail their checks or otherwise have changed
behavior subsequently.
The change seems sensible, since I think if your package relied on
this behavior, it was inadvertent and accidental.
Still, you may differ in your opinion about change no. 1.

RFC 2: "Change 2":
This change would be more radical, and something I would not plan for
the upcoming release of Matrix, but possibly for an update say one or
two months later or so:  It concerns the matrix products when *both*
matrices are pattern.  A situation where the boolean arithmetic may
really make sense and where indeed packages may have depended on the
current behavior  ("T + T  |--> T"). ... although that is currently
only used for *sparse* pattern matrices, not for dense ones.

Further, it may still seem surprising that matrix multiplication does
not behave numerically for a pair of such matrices, and by the
principle of "least surprise" we should provide the boolean arithmetic
matrix products in another way than  by the   standard  %*%,
crossprod()  and  tcrossprod() functions.
So one possibility could be to change the standard functions to behave
numerically,
and e.g., use   %&%  (replace the numeric "*" by a logical "&")  and
crossprod(A,B, boolean=TRUE),  tcrossprod(A,B, boolean=TRUE)
for the three  boolean arithmetic  versions of matrix multiplication.
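
One way to picture the proposed split, using plain base R matrices rather
than Matrix classes: the standard functions stay numeric and a separate
operator carries the boolean product. The names %bool% and bool_crossprod
below are hypothetical stand-ins for the proposed %&% and
crossprod(A,B, boolean=TRUE); this is a sketch, not the Matrix
implementation.

```r
# Hypothetical boolean matrix product: "+" is OR, "*" is AND.
# Inputs are reduced to their nonzero pattern before multiplying.
`%bool%` <- function(x, y) {
  ((x != 0) * 1) %*% ((y != 0) * 1) > 0
}

# Hypothetical boolean analogue of crossprod(x, y), i.e. t(x) %bool% y.
bool_crossprod <- function(x, y = x) t(x) %bool% y

# A 3 x 2 pattern matrix, filled column by column.
P <- matrix(c(TRUE, TRUE, FALSE,
              FALSE, TRUE, TRUE), nrow = 3)
N <- P * 1                      # the same pattern as a 1/0 numeric matrix

numeric_cp <- crossprod(N)      # numeric: counts shared rows
boolean_cp <- bool_crossprod(P) # boolean: TRUE where any row is shared

print(numeric_cp)
print(boolean_cp)
```

Keeping the boolean product behind its own name means code that calls
%*% or crossprod() always gets counts, and code that wants pattern
semantics has to ask for them.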

What do you think about this?   I'm particularly interested to hear
from authors and users of  packages such as 'arules'  which IIRC
explicitly work with sparse pattern matrices.

Thank you for your thoughts and creative ideas,
Martin Maechler, ETH Zurich



Re: [Rd] RFC: Matrix package: Matrix products (%*%, crossprod, tcrossprod) involving "nsparseMatrix" aka sparse pattern matrices

2015-03-19 Thread Trevor Hastie
Hi Martin

I got stung by this last week.
glmnet produces a coefficient matrix of class “dgCMatrix”.
If a predictor matrix was created using sparseMatrix as follows,
one gets unexpected results, as this simple example shows.
My fix was easy (I always convert the predictor matrix to class “dgCMatrix” now).

Trevor

> y=Matrix(diag(4))
> y
4 x 4 diagonal matrix of class "ddiMatrix"
     [,1] [,2] [,3] [,4]
[1,]    1    .    .    .
[2,]    .    1    .    .
[3,]    .    .    1    .
[4,]    .    .    .    1
> z=sparseMatrix(1:4,1:4)
> z
4 x 4 sparse Matrix of class "ngCMatrix"

[1,] | . . .
[2,] . | . .
[3,] . . | .
[4,] . . . |
> beta=as(Matrix(1:4),"dgCMatrix")
> y%*%beta
4 x 1 sparse Matrix of class "dgCMatrix"
  
[1,] 1
[2,] 2
[3,] 3
[4,] 4
> z%*%beta
4 x 1 sparse Matrix of class "ngCMatrix"
  
[1,] |
[2,] |
[3,] |
[4,] |
> 
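
The fix mentioned above (coercing the pattern matrix to a numeric class
before multiplying) can be sketched as follows. It assumes the Matrix
package is installed. as(z, "dMatrix") is used here as one way to coerce,
and the 4 x 1 beta is built directly with sparseMatrix() instead of the
coercion in the transcript.

```r
library(Matrix)

# Pattern matrix: indices only, no x values, as in the transcript above.
z <- sparseMatrix(i = 1:4, j = 1:4)

# Numeric 4 x 1 sparse coefficient vector.
beta <- sparseMatrix(i = 1:4, j = rep(1L, 4), x = as.numeric(1:4))

# Coerce the pattern matrix to a double-precision Matrix first, so the
# product is computed with regular arithmetic whatever the default for
# pattern matrices is.
zd  <- as(z, "dMatrix")
res <- zd %*% beta
print(res)
```

Coercing explicitly keeps the result independent of how a given Matrix
version treats mixed pattern and numeric products.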

> On Mar 19, 2015, at 3:02 PM, Martin Maechler  
> wrote:
> [...]

Re: [Rd] RFC: Matrix package: Matrix products (%*%, crossprod, tcrossprod) involving "nsparseMatrix" aka sparse pattern matrices

2015-03-19 Thread Michael Hahsler

Hi Martin,

package arules heavily relies on ngCMatrix and uses multiplication and 
addition for logical operations. I think it makes sense that in a mixed 
operation with one dgCMatrix and one ngCMatrix the ngCMatrix should be 
"promoted" to a dgCMatrix.


The current behavior of %*% and friends is indeed confusing:

> m <- matrix(sample(c(0,1), 5*5, replace=TRUE), nrow=5)
> x <- as(m, "dgCMatrix")
> y <- as(m, "ngCMatrix")
> x %*% y
5 x 5 sparse Matrix of class "ngCMatrix"

[1,] | | | . |
[2,] | | | . |
[3,] . . | | .
[4,] . . . | .
[5,] | | | | |

> x %*% x
5 x 5 sparse Matrix of class "dgCMatrix"

[1,] 1 2 1 . 2
[2,] 1 3 1 . 3
[3,] . . 1 2 .
[4,] . . . 1 .
[5,] 1 2 2 1 2

We even explicitly coerce in our code ngCMatrix to dgCMatrix to avoid 
this behavior. I think all these operations probably should result 
consistently in a dgCMatrix.


I would love to see | and & for position-wise OR and AND for ngCMatrix.

Thanks,
-Michael

On 03/19/2015 05:02 PM, Martin Maechler wrote:

[...]