Re: [Rd] Garbage collection of seemingly PROTECTed pairlist

2020-09-12 Thread peter dalgaard
Hm, I'm getting rusty on my C skills, but you repeatedly use a pattern like

>  while (row_num <= nr) {
>Rprintf("row_num: %i\n", row_num);
>SEXP row = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, row_num));

I wonder if there is a subtle difference between assignments and initialized 
declarations.

I.e. would it help to write it as

>  while (row_num <= nr) {
SEXP row;
Rprintf("row_num: %i\n", row_num);
row = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, row_num));

?

-pd

> On 12 Sep 2020, at 00:35 , Rory Nolan  wrote:
> 
> I want to write an R function using R's C interface that takes a 2-column
> matrix of increasing, non-overlapping integer intervals and returns a list
> with those intervals plus some added intervals, such that there are no
> gaps. For example, it should take the matrix rbind(c(5L, 6L), c(7L, 10L),
> c(20L, 30L)) and return list(c(5L, 6L), c(7L, 10L), c(11L, 19L), c(20L,
> 30L)). Because the output is of variable length, I use a pairlist (because
> it is growable) and then I call Rf_PairToVectorList() at the end to make it
> into a regular list.
> 
> I'm getting a strange garbage collection error. My PROTECTed pairlist prlst
> gets garbage collected away and causes a memory leak error when I try to
> access it.
> 
> Here's my code.
> 
> #include 
> 
> 
> SEXP C_int_mat_nth_row_nrnc(int *int_mat_int, int nr, int nc, int n) {
>  SEXP out = PROTECT(Rf_allocVector(INTSXP, nc));
>  int *out_int = INTEGER(out);
>  if (n <= 0 | n > nr) {
>for (int i = 0; i != nc; ++i) {
>  out_int[i] = NA_INTEGER;
>}
>  } else {
>for (int i = 0; i != nr; ++i) {
>  out_int[i] = int_mat_int[n - 1 + i * nr];
>}
>  }
>  UNPROTECT(1);
>  return out;}
> 
> SEXP C_make_len2_int_vec(int first, int second) {
>  SEXP out = PROTECT(Rf_allocVector(INTSXP, 2));
>  int *out_int = INTEGER(out);
>  out_int[0] = first;
>  out_int[1] = second;
>  UNPROTECT(1);
>  return out;}
> 
> SEXP C_fullocate(SEXP int_mat) {
>  int nr = Rf_nrows(int_mat), *int_mat_int = INTEGER(int_mat);
>  int last, row_num;  // row_num will be 1-indexed
>  SEXP prlst0cdr = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, 1));
>  SEXP prlst = PROTECT(Rf_list1(prlst0cdr));
>  SEXP prlst_tail = prlst;
>  last = INTEGER(prlst0cdr)[1];
>  row_num = 2;
>  while (row_num <= nr) {
>Rprintf("row_num: %i\n", row_num);
>SEXP row = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, row_num));
>Rf_PrintValue(prlst);  // This is where the error occurs
>int *row_int = INTEGER(row);
>if (row_int[0] == last + 1) {
>  Rprintf("here1");
>  SEXP next = PROTECT(Rf_list1(row));
>  prlst_tail = SETCDR(prlst_tail, next);
>  last = row_int[1];
>  UNPROTECT(1);
>  ++row_num;
>} else {
>  Rprintf("here2");
>  SEXP next_car = PROTECT(C_make_len2_int_vec(last + 1, row_int[0] - 1));
>  SEXP next = PROTECT(Rf_list1(next_car));
>  prlst_tail = SETCDR(prlst_tail, next);
>  last = row_int[0] - 1;
>  UNPROTECT(2);
>}
>UNPROTECT(1);
>  }
>  SEXP out = PROTECT(Rf_PairToVectorList(prlst));
>  UNPROTECT(3);
>  return out;}
> 
> As you can see, I have some diagnostic print statements in there. The
> offending line is line 40, which I have marked with a comment of // This is
> where the error occurs. I have a minimal reproducible package at
> https://github.com/rorynolan/testpkg and I have run R CMD CHECK with
> valgrind using GitHub actions, the results of which are at
> https://github.com/rorynolan/testpkg/runs/1076595757?check_suite_focus=true.
> That's where I found out which line is causing the error. This function
> works as expected sometimes, and then sometimes this issue appears. This
> lends weight to the suspicion that it's a garbage collection issue.
> 
> I really want to know what my mistake is. I'm not that interested in
> alternative implementations; I want to understand the mistake that I'm
> making so that I can avoid making it in future.
> 
> I have asked the question on stackoverflow to little avail, but the
> discussion there may prove helpful.
> https://stackoverflow.com/questions/63759604/garbage-collection-of-seemingly-protected-pairlist.
> 
> 
> 
> Thanks,
> 
> Rory
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Including full text of open source licenses in a package

2020-09-12 Thread Abby Spurdle
> > Including a copy of the license with the work is vital

Hmmm...
Agree.

Just for context:
CRAN has a history of being exceptionally useful and efficient.
In general, I don't support suggestions to change their submission policies.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Including full text of open source licenses in a package

2020-09-12 Thread Hugh Parsonage
Perhaps I have misread that excerpt from WRE, but my read is that
package authors should not duplicate GNU COPYING, since it is present
in all R distributions already when using GPL-2 and friends.  It
doesn't apply to packages distributed with other licenses.

It should be noted that in GPL FAQ just below the part you quoted it says
> A clear statement in the program's README file is legally sufficient as long 
> as that accompanies the code, but it is easy for them to get separated.




On Sat, 12 Sep 2020 at 09:06, Hadley Wickham  wrote:
>
> Hi all,
>
> R-exts currently requests that package authors don't include copies of
> standard licenses:
>
> > Whereas you should feel free to include a license file in your source 
> > distribution, please do
> > not arrange to install yet another copy of the GNU COPYING or COPYING.LIB 
> > files but
> > refer to the copies on https://www.R-project.org/Licenses/ and included in 
> > the R distribution
> > (in directory share/licenses). Since files named LICENSE or LICENCE will be 
> > installed,
> > do not use these names for standard license files.
>
> I'd like to request that this condition be removed because it makes it
> overly difficult to ensure that every version of your package (source,
> tar.gz, binary, and installed) includes the full text of the license.
> This is important because most open source licenses explicitly require
> that you include the full text of the license. For example, the GPL
> faq (http://www.gnu.org/licenses/gpl-faq.html#WhyMustIInclude) states:
>
> > Why does the GPL require including a copy of the GPL with every copy of the 
> > program?
> > (#WhyMustIInclude)
> >
> > Including a copy of the license with the work is vital so that everyone who 
> > gets a copy of
> > the program can know what their rights are.
> >
> > It might be tempting to include a URL that refers to the license, instead 
> > of the license
> > itself. But you cannot be sure that the URL will still be valid, five years 
> > or ten years from
> > now. Twenty years from now, URLs as we know them today may no longer exist.
> >
> > The only way to make sure that people who have copies of the program will 
> > continue
> > to be able to see the license, despite all the changes that will happen in 
> > the network,
> > is to include a copy of the license in the program.
>
> This analysis by an open source lawyer,
> https://writing.kemitchell.com/2016/09/21/MIT-License-Line-by-Line.html#notice-condition,
> reinforces the same message for the MIT license.
>
> Currently we've been working around this limitation by putting a
> markdown version of the license in LICENSE.md and then adding that to
> .Rbuildignore (this ensures that the source version on GitHub includes
> the license even if the CRAN version does not). Ideally, as well as
> allowing us to include full text of licenses in LICENSE or
> LICENSE.txt, a LICENSE.md at the top-level of the package would also
> be explicitly permitted.
>
> Hadley
>
> --
> http://hadley.nz
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Garbage collection of seemingly PROTECTed pairlist

2020-09-12 Thread Iñaki Ucar
Hi,

In line 5, you are allocating a vector of length nc. Then, in line 12, you
are using nr as a limit, so if nr goes beyond nc, which is happening in
line 39, you are in trouble.

Iñaki

On Sat, 12 Sep 2020 at 03:30, Rory Nolan  wrote:

> I want to write an R function using R's C interface that takes a 2-column
> matrix of increasing, non-overlapping integer intervals and returns a list
> with those intervals plus some added intervals, such that there are no
> gaps. For example, it should take the matrix rbind(c(5L, 6L), c(7L, 10L),
> c(20L, 30L)) and return list(c(5L, 6L), c(7L, 10L), c(11L, 19L), c(20L,
> 30L)). Because the output is of variable length, I use a pairlist (because
> it is growable) and then I call Rf_PairToVectorList() at the end to make it
> into a regular list.
>
> I'm getting a strange garbage collection error. My PROTECTed pairlist prlst
> gets garbage collected away and causes a memory leak error when I try to
> access it.
>
> Here's my code.
>
> #include 
>
>
> SEXP C_int_mat_nth_row_nrnc(int *int_mat_int, int nr, int nc, int n) {
>   SEXP out = PROTECT(Rf_allocVector(INTSXP, nc));
>   int *out_int = INTEGER(out);
>   if (n <= 0 | n > nr) {
> for (int i = 0; i != nc; ++i) {
>   out_int[i] = NA_INTEGER;
> }
>   } else {
> for (int i = 0; i != nr; ++i) {
>   out_int[i] = int_mat_int[n - 1 + i * nr];
> }
>   }
>   UNPROTECT(1);
>   return out;}
>
> SEXP C_make_len2_int_vec(int first, int second) {
>   SEXP out = PROTECT(Rf_allocVector(INTSXP, 2));
>   int *out_int = INTEGER(out);
>   out_int[0] = first;
>   out_int[1] = second;
>   UNPROTECT(1);
>   return out;}
>
> SEXP C_fullocate(SEXP int_mat) {
>   int nr = Rf_nrows(int_mat), *int_mat_int = INTEGER(int_mat);
>   int last, row_num;  // row_num will be 1-indexed
>   SEXP prlst0cdr = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, 1));
>   SEXP prlst = PROTECT(Rf_list1(prlst0cdr));
>   SEXP prlst_tail = prlst;
>   last = INTEGER(prlst0cdr)[1];
>   row_num = 2;
>   while (row_num <= nr) {
> Rprintf("row_num: %i\n", row_num);
> SEXP row = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2,
> row_num));
> Rf_PrintValue(prlst);  // This is where the error occurs
> int *row_int = INTEGER(row);
> if (row_int[0] == last + 1) {
>   Rprintf("here1");
>   SEXP next = PROTECT(Rf_list1(row));
>   prlst_tail = SETCDR(prlst_tail, next);
>   last = row_int[1];
>   UNPROTECT(1);
>   ++row_num;
> } else {
>   Rprintf("here2");
>   SEXP next_car = PROTECT(C_make_len2_int_vec(last + 1, row_int[0] -
> 1));
>   SEXP next = PROTECT(Rf_list1(next_car));
>   prlst_tail = SETCDR(prlst_tail, next);
>   last = row_int[0] - 1;
>   UNPROTECT(2);
> }
> UNPROTECT(1);
>   }
>   SEXP out = PROTECT(Rf_PairToVectorList(prlst));
>   UNPROTECT(3);
>   return out;}
>
> As you can see, I have some diagnostic print statements in there. The
> offending line is line 40, which I have marked with a comment of // This is
> where the error occurs. I have a minimal reproducible package at
> https://github.com/rorynolan/testpkg and I have run R CMD CHECK with
> valgrind using GitHub actions, the results of which are at
> https://github.com/rorynolan/testpkg/runs/1076595757?check_suite_focus=true
> .
> That's where I found out which line is causing the error. This function
> works as expected sometimes, and then sometimes this issue appears. This
> lends weight to the suspicion that it's a garbage collection issue.
>
> I really want to know what my mistake is. I'm not that interested in
> alternative implementations; I want to understand the mistake that I'm
> making so that I can avoid making it in future.
>
> I have asked the question on stackoverflow to little avail, but the
> discussion there may prove helpful.
>
> https://stackoverflow.com/questions/63759604/garbage-collection-of-seemingly-protected-pairlist
> .
>
>
>
> Thanks,
>
> Rory
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


-- 
Iñaki Úcar

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Including full text of open source licenses in a package

2020-09-12 Thread Hadley Wickham
On Saturday, September 12, 2020, Hugh Parsonage 
wrote:

> Perhaps I have misread that excerpt from WRE, but my read is that
> package authors should not duplicate GNU COPYING, since it is present
> in all R distributions already when using GPL-2 and friends.  It
> doesn't apply to packages distributed with other licenses.
>
>
The directory to which it refers, https://www.r-project.org/Licenses/,
includes many open source licenses, not just those used for R. I’m also
pretty sure I’ve had a package fail CRAN submission for this problem in the
past.


> It should be noted that in GPL FAQ just below the part you quoted it says
> > A clear statement in the program's README file is legally sufficient as
> long as that accompanies the code, but it is easy for them to get separated.
>

That question (https://www.gnu.org/licenses/gpl-faq.en.html#LicenseCopyOnly) is
about whether a copy of the license in a file is sufficient, or whether you
must also include a statement at the top of every source file.

Hadley


-- 
http://hadley.nz

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Garbage collection of seemingly PROTECTed pairlist

2020-09-12 Thread Rory Nolan
Iñaki is right. This solves it. 
Kudos to Martin Morgan too for paring down my reprex; I wasn’t aware of the 
method with R CMD SHLIB that he kindly described. 
Thanks also to Peter Dalgaard for having a crack at it. Very grateful to you 
all.

> On 12 Sep 2020, at 04:16, Iñaki Ucar  wrote:
> 
> Hi,
> 
> In line 5, you are allocating a vector of length nc. Then, in line 12, you 
> are using nr as a limit, so if nr goes beyond nc, which is happening in line 
> 39, you are in trouble.
> 
> Iñaki
> 
> On Sat, 12 Sep 2020 at 03:30, Rory Nolan  > wrote:
> I want to write an R function using R's C interface that takes a 2-column
> matrix of increasing, non-overlapping integer intervals and returns a list
> with those intervals plus some added intervals, such that there are no
> gaps. For example, it should take the matrix rbind(c(5L, 6L), c(7L, 10L),
> c(20L, 30L)) and return list(c(5L, 6L), c(7L, 10L), c(11L, 19L), c(20L,
> 30L)). Because the output is of variable length, I use a pairlist (because
> it is growable) and then I call Rf_PairToVectorList() at the end to make it
> into a regular list.
> 
> I'm getting a strange garbage collection error. My PROTECTed pairlist prlst
> gets garbage collected away and causes a memory leak error when I try to
> access it.
> 
> Here's my code.
> 
> #include 
> 
> 
> SEXP C_int_mat_nth_row_nrnc(int *int_mat_int, int nr, int nc, int n) {
>   SEXP out = PROTECT(Rf_allocVector(INTSXP, nc));
>   int *out_int = INTEGER(out);
>   if (n <= 0 | n > nr) {
> for (int i = 0; i != nc; ++i) {
>   out_int[i] = NA_INTEGER;
> }
>   } else {
> for (int i = 0; i != nr; ++i) {
>   out_int[i] = int_mat_int[n - 1 + i * nr];
> }
>   }
>   UNPROTECT(1);
>   return out;}
> 
> SEXP C_make_len2_int_vec(int first, int second) {
>   SEXP out = PROTECT(Rf_allocVector(INTSXP, 2));
>   int *out_int = INTEGER(out);
>   out_int[0] = first;
>   out_int[1] = second;
>   UNPROTECT(1);
>   return out;}
> 
> SEXP C_fullocate(SEXP int_mat) {
>   int nr = Rf_nrows(int_mat), *int_mat_int = INTEGER(int_mat);
>   int last, row_num;  // row_num will be 1-indexed
>   SEXP prlst0cdr = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, 1));
>   SEXP prlst = PROTECT(Rf_list1(prlst0cdr));
>   SEXP prlst_tail = prlst;
>   last = INTEGER(prlst0cdr)[1];
>   row_num = 2;
>   while (row_num <= nr) {
> Rprintf("row_num: %i\n", row_num);
> SEXP row = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, row_num));
> Rf_PrintValue(prlst);  // This is where the error occurs
> int *row_int = INTEGER(row);
> if (row_int[0] == last + 1) {
>   Rprintf("here1");
>   SEXP next = PROTECT(Rf_list1(row));
>   prlst_tail = SETCDR(prlst_tail, next);
>   last = row_int[1];
>   UNPROTECT(1);
>   ++row_num;
> } else {
>   Rprintf("here2");
>   SEXP next_car = PROTECT(C_make_len2_int_vec(last + 1, row_int[0] - 1));
>   SEXP next = PROTECT(Rf_list1(next_car));
>   prlst_tail = SETCDR(prlst_tail, next);
>   last = row_int[0] - 1;
>   UNPROTECT(2);
> }
> UNPROTECT(1);
>   }
>   SEXP out = PROTECT(Rf_PairToVectorList(prlst));
>   UNPROTECT(3);
>   return out;}
> 
> As you can see, I have some diagnostic print statements in there. The
> offending line is line 40, which I have marked with a comment of // This is
> where the error occurs. I have a minimal reproducible package at
> https://github.com/rorynolan/testpkg  
> and I have run R CMD CHECK with
> valgrind using GitHub actions, the results of which are at
> https://github.com/rorynolan/testpkg/runs/1076595757?check_suite_focus=true 
> .
> That's where I found out which line is causing the error. This function
> works as expected sometimes, and then sometimes this issue appears. This
> lends weight to the suspicion that it's a garbage collection issue.
> 
> I really want to know what my mistake is. I'm not that interested in
> alternative implementations; I want to understand the mistake that I'm
> making so that I can avoid making it in future.
> 
> I have asked the question on stackoverflow to little avail, but the
> discussion there may prove helpful.
> https://stackoverflow.com/questions/63759604/garbage-collection-of-seemingly-protected-pairlist
>  
> .
> 
> 
> 
> Thanks,
> 
> Rory
> 
> [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org  mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel 
> 
> 
> 
> -- 
> Iñaki Úcar


[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel