[Rd] Portability and Memory Issues for R-package

2005-12-25 Thread KNygren
I have an upcoming JASA paper with an iid sampling algorithm for Bayesian
generalized linear models (e.g., logit, Poisson regression, and conditional
logit models with multivariate normal priors). I have implemented the
algorithms in C and hope to make the functions and corresponding source code
available through an R package. I can already build and install a package
containing most of the functions on my local machine (using R CMD check,
R CMD build, and R CMD INSTALL). Because my code makes extensive use of the
GSL matrix library, however, I have some questions about the portability of
the package. I am also running into memory issues when making repeated calls
to my functions, which I would like to fix before formally distributing the
package. Specifically, the issues are the following:

I. Portability-

Since I make extensive use of the GSL library in my C code, I have the library
installed on my local machine (within the MinGW directory, so it is included
in the path). Within the package, I then include a Makevars file with the
following line in order to link to the GSL library:

PKG_LIBS=-lgsl -lgslcblas

I also know that there is an R package (gsl) making use of some gsl functions 
which contains a Makevars.win file with the following code:

PKG_LIBS=-LF:/MinGW/usr/local/lib -lgsl -lgslcblas
# CPPFLAGS=-I$(R_HOME)/include -IF:/MinGW/usr/local/include
PKG_CPPFLAGS=-IF:/MinGW/usr/local/include

For my package to install properly on other machines, however, I take it they
would need to have the GSL library files already installed in the proper
location (or am I mistaken here?). To make the package fully portable, it thus
seems that I would need either to include instructions for installing the GSL
library before installing the package (which would have to be
platform-specific), or to somehow have the GSL library files installed during
the R package installation. Is the latter even possible? If so, how could it
be done (the key files are likely the two library files)? I believe the gsl
package requires the user to have the GSL library preinstalled.

I guess that, long-term, an option is for me to rework my C code to eliminate
the dependence on the GSL library. This could, however, be a time-consuming
effort. In the meantime, would it be possible to contribute the package with
the existing dependence (as I think is the case for the gsl package)?
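
As a rough illustration of what dropping the GSL dependence could look like
for a single operation: R hands a numeric matrix to C as a plain array of
doubles in column-major order, so simple operations can be written directly on
that array. The sketch below is only an assumption about the kind of routine
involved; the name matvec_colmajor is hypothetical and is not taken from the
package under discussion.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Compute y = A x for an nrow-by-ncol matrix A stored in column-major
 * order (the layout R uses for numeric matrices), with no GSL types.
 * Hypothetical helper, for illustration only.
 */
void matvec_colmajor(const double *A, size_t nrow, size_t ncol,
                     const double *x, double *y)
{
    assert(A != NULL && x != NULL && y != NULL);

    for (size_t i = 0; i < nrow; i++)
        y[i] = 0.0;

    /* Walk A one column at a time so memory is read contiguously. */
    for (size_t j = 0; j < ncol; j++) {
        const double *col = A + j * nrow;
        for (size_t i = 0; i < nrow; i++)
            y[i] += col[i] * x[j];
    }
}
```

Hand-written loops like this are only reasonable for small problems; for
anything larger, the BLAS routines that R makes available to packages would
probably be a better replacement for the GSL calls.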

II.  Memory Issue-

The functions in my package are generally fast and seem to work well if I make
a limited number of calls to them from my R code. If I use them as part of an
R MCMC implementation (say, updating each Gibbs block 10,000 times in an R
loop), I run into memory issues. Although my underlying C code frees the
memory for all pointers, Windows does not seem to recognize that the memory
has been freed. This is apparent because the Mem Usage for RGUI.exe in the
Windows Task Manager keeps growing throughout the loop, and the code slows
down and eventually makes virtually no progress. I have noticed similar issues
in the past when calling WinBUGS repeatedly using Gelman's functions, so it is
likely not an issue that comes only from my code.

I suspect that the memory issues have something to do with the fact that my C
code makes repeated use of the gsl_matrix_alloc and gsl_matrix_free functions
rather than the R_alloc function (I suspect that the memory is not garbage
collected). I searched the web and found the following suggestion from Brian
Gough in response to a similar question posted on the GSL discussion forum:

"If you want to return an R object containing a gsl_matrix which can be garbage
collected then you could use a C++ wrapper, as the C++ interface in R allows
the use of separate constructors and destructors."

Would this be a possible solution? If so, where can I find information on how
to write such wrapper functions for gsl matrices? I must admit that I am not
familiar with how the use of separate constructors and destructors would work.
If that is not the solution, would anyone have other ideas as to how I can
solve the memory issues?
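
One way to test the suspicion above is to count allocations and frees
directly, since the Task Manager figure alone does not show whether a free was
missed. The sketch below is plain C with a hypothetical matrix type standing
in for gsl_matrix, and every function name in it is an assumption made for
illustration; it is not the package's code. In real code the two wrappers
would call gsl_matrix_alloc and gsl_matrix_free instead of malloc and free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for gsl_matrix: a dense block of doubles. */
typedef struct {
    size_t rows, cols;
    double *data;
} matrix;

/* Matrices allocated through the wrappers below and not yet freed. */
static long live_matrices = 0;

matrix *matrix_alloc_counted(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);
    matrix *m = malloc(sizeof *m);
    if (m == NULL)
        return NULL;
    m->data = calloc(rows * cols, sizeof *m->data);
    if (m->data == NULL) {
        free(m);
        return NULL;
    }
    m->rows = rows;
    m->cols = cols;
    live_matrices++;
    return m;
}

void matrix_free_counted(matrix *m)
{
    if (m == NULL)
        return;
    free(m->data);
    free(m);
    live_matrices--;
}

long matrix_live_count(void)
{
    return live_matrices;
}

/* One simulated update: allocate scratch space, use it, release it. */
double simulated_update(size_t p)
{
    matrix *scratch = matrix_alloc_counted(p, p);
    if (scratch == NULL)
        return 0.0;
    double trace = 0.0;
    for (size_t i = 0; i < p; i++) {
        scratch->data[i * p + i] = 1.0;   /* fill the diagonal */
        trace += scratch->data[i * p + i];
    }
    matrix_free_counted(scratch);
    return trace;
}
```

If matrix_live_count() is back at zero after a long loop of updates, every
matrix was released, and the growth is more likely to come from elsewhere, for
example from R objects created inside the loop. For a matrix that has to
outlive a single call, R's C API provides external pointers with a registered
finalizer (R_MakeExternalPtr and R_RegisterCFinalizerEx), which the garbage
collector runs once the R object is no longer reachable.
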
Kjell Nygren

Kjell Nygren,  Ph.D.
Director Pricing and Advanced Analytics
Statistical Services
IMS Health®

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Portability and Memory Issues for R-package

2005-12-26 Thread KNygren
I was able to resolve the memory issues, so there is no need to respond on
that point. On the portability issues, I would still like to understand how
best to deal with the dependence on the GSL library.

Re: [Rd] Portability and Memory Issues for R-package

2005-12-27 Thread KNygren
My guess is that the key step for a user to be able to use my package would
still be to install the GSL library first so that it can be accessed during
the build. I am not sure whether Robin has a set of instructions for
platform-specific installation of his package (which would likely include
pre-installing the GSL library). I may follow up with him about this and to
see whether it makes sense to link to his library. I will also look into
adding a configure script (as per Jan's suggestion). I know that relying on
the GSL library is not ideal, and I may eventually try to replace the
GSL-dependent code, perhaps by making use of the R matrix package (though I
do not know whether it has all the features I am currently using).


Kjell Nygren 
 
> I. Portability-
> 
> Since I make extensive use of the gsl library in my C code, I have the gsl 
> library installed (within the MinGw directory so it is included in the path) 
> on my local machine. Within the package, I am then including a Makevars file 
> with the following code in order to link to the gsl library:
> 
> PKG_LIBS=-lgsl -lgslcblas
> 
> I also know that there is an R package (gsl) making use of some gsl functions 
> which contains a Makevars.win file with the following code:

This package requires manual handling to build for Windows, and probably 
for some other platforms if they don't come with gsl by default.

My recommendation would be to work with its author (Robin Hankin, see 
the DESCRIPTION file for contact information) to add whatever functions 
are not already there, and then just make your package depend on the R 
package, rather than on the GSL library directly.

This will mean that all the manual work that has been done to get gsl to 
build will not need to be repeated by anyone who wants to install your 
package.

Duncan Murdoch



Re: [Rd] Portability and Memory Issues for R-package

2005-12-27 Thread KNygren
Not getting users was one of my main concerns, so let me make sure I
understand the suggestions correctly.

A. I should check whether the GSL routines I use are part of Brian's binary
build. If not, I should look into having the required routines added to that
build (going through Robin, or perhaps Brian?).

B. If the required routines are included in the binary build of the gsl
package, I can then link my package to the gsl package, and it should work
on Windows for any user who has installed the gsl package. I take it the
binary build also eliminates the need for each user to do the manual handling
required to build on Windows?

Kjell Nygren  

-Original Message-
From: Duncan Murdoch [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 27, 2005 4:58 PM
To: Nygren, Kjell (Union Meeting)
Cc: r-devel@r-project.org
Subject: Re: [Rd] Portability and Memory Issues for R-package


On 12/27/2005 3:44 PM, [EMAIL PROTECTED] wrote:
> My guess is that the key step for a user to be able to use my package still 
> would be to install the gsl library first so it can be accessed during the 
> build. I am not sure if Robin has a set of instructions for platform specific 
> installation of his package (which would likely include the pre-installation 
> of the gsl library).

This is not necessary on Windows, where most users install binary builds 
of packages, because Brian Ripley has done the work to put together a 
binary build that includes the necessary GSL routines.  I would expect 
that if you require users to install GSL and compile your package 
themselves, you'll get almost no Windows users.  I don't know what is 
involved in installing the package on other platforms.

Duncan Murdoch
