Re: [R] Can glmnet handle models with numeric and categorical data?

Paul Smith Fri, 05 Aug 2011 04:01:38 -0700

On Fri, Aug 5, 2011 at 8:45 AM, Martin Maechler
<maech...@stat.math.ethz.ch> wrote:
> Note the following: As soon as you use "categorical predictors",
> i.e., factors, and particularly when these have many levels (instead of just
> being binary), the resulting model matrix is often sparse,
> i.e. contains many zeros.
> When the matrix is "really sparse", say,
>     #{zeros} / #{non-zeros} >= 10
> it can pay off considerably to use the sparse matrices that the
> 'Matrix' package provides (you have 'Matrix' as part of your R
> installation).
>
> For exactly this reason,  'glmnet'
> has supported the use of sparse matrices for a long time,
> and we have provided the convenience function
>    sparse.model.matrix()  {package 'Matrix'}
> for easy construction of such matrices.
>
> There's also a very small extension package  'MatrixModels'
> which goes one step further, with its function
>      model.Matrix(..... sparse = TRUE/FALSE)
> but you would not need that for using the sparseMatrix in
> 'glmnet'.
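
For readers of the archive, here is a minimal sketch of the workflow Martin
describes: build the design matrix with sparse.model.matrix(), compute the
zeros-to-non-zeros ratio, and pass the result to glmnet. The toy data frame
`dat`, the response `y`, and all variable names are made up for illustration
and are not from my real problem. The glmnet call is guarded because glmnet
may not be installed, whereas 'Matrix' ships with R.

```r
library(Matrix)

set.seed(1)
n <- 1000

# Hypothetical toy data: one numeric predictor and two 5-level factors.
dat <- data.frame(
  x  = rnorm(n),
  f1 = factor(sample(letters[1:5], n, replace = TRUE)),
  f2 = factor(sample(LETTERS[1:5], n, replace = TRUE))
)
y <- rnorm(n)

# Sparse design matrix; the intercept column is dropped because
# glmnet fits its own intercept by default.
X <- sparse.model.matrix(~ ., data = dat)[, -1]

# The rule of thumb from the quoted text: #{zeros} / #{non-zeros}.
n_nonzero <- nnzero(X)
n_zero <- prod(dim(X)) - n_nonzero
sparsity_ratio <- n_zero / n_nonzero

# glmnet accepts the sparse matrix directly as its x argument.
if (requireNamespace("glmnet", quietly = TRUE)) {
  fit <- glmnet::glmnet(X, y)
}
```

The ratio grows with the number of levels per factor, so data with only a
few small factors may well stay below the threshold of 10 mentioned above.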


Thanks, Martin. In my case, the number of potential predictors is high
and many of them are factors with 5 categories. With
sparse.model.matrix(), I am getting the following error:

«Error: C stack usage is too close to the limit.»

I realize that my sparse matrix is huge, and that the error given by
sparse.model.matrix() is perfectly justified, but I wonder whether
this problem could be overcome by having sparse.model.matrix() use
dynamic memory instead of static memory.
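
One possible workaround, sketched below, is to encode the predictors one
column at a time and bind the blocks together, so that no single call has to
process one very large formula. This assumes the error is triggered by the
size of the formula rather than by the data itself, which I have not
verified. The helper encode_column() and the toy data are hypothetical, and
the sketch covers main effects only, not interactions.

```r
library(Matrix)

set.seed(1)
n <- 200

# Hypothetical toy data standing in for the real predictors.
dat <- data.frame(
  x  = rnorm(n),
  f1 = factor(sample(letters[1:5], n, replace = TRUE)),
  f2 = factor(sample(LETTERS[1:5], n, replace = TRUE))
)

# Hypothetical helper: encode a single column, so that each formula
# mentions only one predictor.
encode_column <- function(name, data) {
  block <- sparse.model.matrix(~ ., data = data[, name, drop = FALSE])
  block[, -1, drop = FALSE]  # drop the per-block intercept column
}

# Encode every column separately and bind the sparse blocks column-wise.
blocks <- lapply(names(dat), encode_column, data = dat)
X_blocks <- do.call(cbind, blocks)

# On small data the one-shot call still works, so the two can be compared.
X_full <- sparse.model.matrix(~ ., data = dat)[, -1, drop = FALSE]
```

On this small example the blockwise matrix and the one-shot matrix can be
checked against each other before trying the same approach on the full data.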

Paul

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
