Re: [Rd] Request: Increasing MAX_NUM_DLLS in Rdynload.c
On Tue, Dec 20, 2016 at 7:04 AM, Henrik Bengtsson wrote:
> One reason for hitting the MAX_NUM_DLLS (= 100) limit is that some
> packages don't unload their DLLs when they are themselves unloaded.

I am surprised by this. Why does R not do this automatically? What is the case for keeping the DLL loaded after the package has been unloaded? What happens if you reload another version of the same package from a different library after unloading?

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Very small numbers in hexadecimal notation parsed as zero
Hi all,

I have noticed incorrect parsing of very small hexadecimal numbers like "0x1.dp-987". Such a hexadecimal representation can be produced by sprintf() using the %a format. The return value is incorrectly reported as 0 when coercing these numbers to double using as.double()/as.numeric(), as illustrated in the three examples below:

as.double("0x1.dp-987")             # should be 7.645296e-298
as.double("0x1.0p-1022")            # should be 2.225074e-308
as.double("0x1.f89fc1a6f6613p-974") # should be 1.23456e-293

The culprit seems to be the R_strtod function in src/main/util.c, and in some cases, removing the zeroes directly before the 'p' leads to correct parsing:

as.double("0x1.dp-987") # 7.645296e-298, as expected
as.double("0x1.p-1022") # 2.225074e-308, as expected

I wrote a small program (in a file called "strtod.c") to compare the R_strtod implementation to the C library's strtod. The C implementation never reported 0 for the examples given above:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const char *strings[] = {
        "0x1.dp-987",
        "0x1.0p-1022",
        "0x1.p-1022",
        "0x1.f89fc1a6f6613p-974"
    };
    char *stopstring;
    for (int i = 0; i < 4; i++) {
        double x = strtod(strings[i], &stopstring);
        printf("string = \"%s\"\n", strings[i]);
        printf("strtod = %.17g\n\n", x);
    }
    return 0;
}

$ gcc -std=c99 -o strtod.exe strtod.c
$ ./strtod.exe
string = "0x1.dp-987"
strtod = 7.6452955642246671e-298

string = "0x1.0p-1022"
strtod = 2.2250738585072014e-308

string = "0x1.p-1022"
strtod = 2.2250738585072014e-308

string = "0x1.f89fc1a6f6613p-974"
strtod = 1.23456e-293

My sessionInfo() returns:

R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=German_Switzerland.1252  LC_CTYPE=German_Switzerland.1252
    LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
    LC_TIME=German_Switzerland.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

Regards,
Florent

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Unexpected I(NULL) output
Hi all,

I believe there is an issue with passing NULL to the function I().

class(NULL)   # "NULL" (as expected)
print(NULL)   # NULL (as expected)
is.null(NULL) # TRUE (as expected)

According to the documentation, I() should return a copy of its input with the class "AsIs" prepended:

class(I(NULL))   # "AsIs" (as expected)
print(I(NULL))   # list() (not expected! should be NULL)
is.null(I(NULL)) # FALSE (not expected! should be TRUE)

So, I() does not behave according to its documentation. In R, it is not possible to give NULL attributes, but I(NULL) attempts to do that nonetheless, using the structure() function. Probably:

1/ structure() should not accept NULL as input, since the goal of structure() is to set some attributes, something that cannot be done on NULL.

2/ I() could accept NULL but, as an exception, not set an "AsIs" class attribute on it. This would be in line with the philosophy of the I() function of returning an object that is functionally equivalent to the input object.

My sessionInfo() returns:

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=German_Switzerland.1252  LC_CTYPE=German_Switzerland.1252
    LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

Best regards,
Florent

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Request: Increasing MAX_NUM_DLLS in Rdynload.c
It's not always clear when it's safe to remove the DLL. The main problem that I'm aware of is that native objects with finalizers might still exist (created by R_RegisterCFinalizer etc.). Even if there are no live references to such objects (which would be hard to verify), it still wouldn't be safe to unload the DLL until a full garbage collection has been done. If the DLL is unloaded, the function pointer that was registered becomes a pointer into the memory where the DLL used to be, leading to an almost certain crash when such objects get garbage collected.

A better approach would be to just remove the limit on the number of DLLs, dynamically expanding the array if/when needed.

On Tue, Dec 20, 2016 at 3:40 AM, Jeroen Ooms wrote:
> On Tue, Dec 20, 2016 at 7:04 AM, Henrik Bengtsson wrote:
>> One reason for hitting the MAX_NUM_DLLS (= 100) limit is that some
>> packages don't unload their DLLs when they are themselves unloaded.
>
> I am surprised by this. Why does R not do this automatically? What is
> the case for keeping the DLL loaded after the package has been
> unloaded? What happens if you reload another version of the same
> package from a different library after unloading?

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Request: Increasing MAX_NUM_DLLS in Rdynload.c
> Steve Bronder on Tue, 20 Dec 2016 01:34:31 -0500 writes:

> Thanks Henrik, this is very helpful! I will try this out on our tests and
> see if gcDLLs() has a positive effect.

> mlr currently has tests broken down by learner type such as classification,
> regression, forecasting, clustering, etc. There are 83 classifiers alone,
> so even when loading and unloading across learner types we can still hit
> the MAX_NUM_DLLS error, meaning we'll have to break them down further (or
> maybe we can be clever with gcDLLs()?). I'm CC'ing Lars Kotthoff and Bernd
> Bischl to make sure I am representing the issue well.

This came up *here* in May 2015 and then May 2016 ... did you not find it when googling?

Hint: use

    site:stat.ethz.ch MAX_NUM_DLLS

as the search string in Google, so it will basically only search the R mailing list archives.

Here's the start of that thread:

https://stat.ethz.ch/pipermail/r-devel/2016-May/072637.html

There was not a clear conclusion back then, notably as Prof Brian Ripley noted that 100 had already been an increase and that a large number of loaded DLLs decreases lookup speed.

OTOH (I think others have noted this) a large number of DLLs only penalizes those who *do* load many, and we should probably increase it.

Your use case of "hyper packages" which load many others simultaneously is somewhat convincing to me ... insofar as the general feeling is that memory should be cheap and limits should not be low.

(In spite of Brian Ripley's good reasons against it, I'd still aim for a *dynamic*, i.e. automatically increased, list here.)

Martin Maechler

> Regards,
> Steve Bronder
> Website: stevebronder.com
> Phone: 412-719-1282
> Email: sbron...@stevebronder.com

> On Tue, Dec 20, 2016 at 1:04 AM, Henrik Bengtsson <henrik.bengts...@gmail.com> wrote:

>> One reason for hitting the MAX_NUM_DLLS (= 100) limit is that some
>> packages don't unload their DLLs when they are themselves unloaded.
>> In other words, there may be left-over DLLs just sitting there doing
>> nothing but occupying space. You can remove these using:
>>
>> R.utils::gcDLLs()
>>
>> Maybe that will help you get through your tests (as long as you're
>> unloading packages). gcDLLs() will look at base::getLoadedDLLs() and
>> its content, compare to loadedNamespaces(), and unregister any
>> "stray" DLLs that remain after the corresponding packages have been
>> unloaded.
>>
>> I think it would be useful if R CMD check would also check that DLLs
>> are unregistered when a package is unloaded
>> (https://github.com/HenrikBengtsson/Wishlist-for-R/issues/29), but of
>> course, someone needs to write the code / a patch for this to happen.
>>
>> /Henrik
>>
>> On Mon, Dec 19, 2016 at 6:01 PM, Steve Bronder wrote:
>> > This is a request to increase MAX_NUM_DLLS in Rdynload.c from 100 to 500.
>> >
>> > On line 131 of Rdynload.c, changing
>> >
>> > #define MAX_NUM_DLLS 100
>> >
>> > to
>> >
>> > #define MAX_NUM_DLLS 500
>> >
>> > In development of the mlr package, there have been several episodes in the
>> > past where we have had to break up unit tests because of the "maximum
>> > number of DLLs reached" error. This error has been an inconvenience that is
>> > going to keep happening as the package continues to grow. Is there more
>> > than meets the eye with this error, or would everything be okay if the above
>> > line changes? Would that have a larger effect in other parts of R?
>> >
>> > As R grows, we are likely to see more 'meta-packages' such as the
>> > Hadley-verse, caret, mlr, etc. needing an increasing number of DLLs loaded at
>> > any point in time to conduct effective unit tests. If MAX_NUM_DLLS is set
>> > to 100 for a very particular reason then I apologize, but if it is possible
>> > to increase MAX_NUM_DLLS it would at least make the testing at mlr much
>> > easier.
>> >
>> > I understand you are all very busy and thank you for your time.
>> >
>> > Regards,
>> >
>> > Steve Bronder
>> > Website: stevebronder.com
>> > Phone: 412-719-1282
>> > Email: sbron...@stevebronder.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Request: Increasing MAX_NUM_DLLS in Rdynload.c
On 20 December 2016 at 17:40, Martin Maechler wrote:
| There was not a clear conclusion back then, notably as
| Prof Brian Ripley noted that 100 had already been an increase
| and that a large number of loaded DLLs decreases lookup speed.
|
| OTOH (I think others have noted this) a large number of DLLs
| only penalizes those who *do* load many, and we should probably
| increase it.
|
| Your use case of "hyper packages" which load many others
| simultaneously is somewhat convincing to me ... insofar as the
| general feeling is that memory should be cheap and limits should
| not be low.
|
| (In spite of Brian Ripley's good reasons against it, I'd still
| aim for a *dynamic*, i.e. automatically increased, list here.)

Yes. Start with 10 or 20, add 10 as needed. Still fast in the 'small N' case and no longer a road block for the 'big N' case required by mlr et al.

As a C++ programmer, I am now going to hug my std::vector and quietly retreat.

Dirk

| Martin Maechler

[...]
Re: [Rd] Request: Increasing MAX_NUM_DLLS in Rdynload.c
Hi, Dirk:

On 12/20/2016 10:56 AM, Dirk Eddelbuettel wrote:
> Yes. Start with 10 or 20, add 10 as needed. Still fast in the 'small N'
> case and no longer a road block for the 'big N' case required by mlr et al.
>
> As a C++ programmer, I am now going to hug my std::vector and quietly retreat.

May I humbly request a translation of "std::vector" for people like me who are not familiar with C++? I got the following:

> install.packages('std')
Warning in install.packages :
  package ‘std’ is not available (for R version 3.3.2)

Thanks,
Spencer Graves

[...]
Re: [Rd] Request: Increasing MAX_NUM_DLLS in Rdynload.c
See inline.

On Tue, Dec 20, 2016 at 12:14 PM, Spencer Graves <spencer.gra...@prodsyse.com> wrote:

>> | Hint: Use
>> |    site:stat.ethz.ch MAX_NUM_DLLS
>> | as search string in Google, so it will basically only search the
>> | R mailing list archives

I did not know this and apologize. I starred this email so I can use it next time I have a question or request. I did find (and left a comment on) the Stack Overflow question in which you left an answer to this question:

http://stackoverflow.com/a/37021455/2269255

>> | There was not a clear conclusion back then, notably as
>> | Prof Brian Ripley noted that 100 had already been an increase
>> | and that a large number of loaded DLLs decreases lookup speed.
>> |
>> | OTOH (I think others have noted this) a large number of DLLs
>> | only penalizes those who *do* load many, and we should probably
>> | increase it.

Am I correct in understanding that the decrease in lookup speed only happens when a large number of DLLs are loaded? If so, this is an expected cost of having many DLLs, and one that I, and I would guess other developers, would be willing to pay to have more DLLs available. If increasing MAX_NUM_DLLS would increase R's fixed memory footprint by a significant amount, then I think that's a reasonable argument against the increase in MAX_NUM_DLLS.

>> | Your use case of "hyper packages" which load many others
>> | simultaneously is somewhat convincing to me ... insofar as the
>> | general feeling is that memory should be cheap and limits should
>> | not be low.

It should also be pointed out that even in the case of "hyper packages" like mlr, this is only an issue during unit testing. I wonder if there is some middle ground here? Would it be difficult to have a compile flag that would change MAX_NUM_DLLS when compiling R from source? I believe this would allow us to increase MAX_NUM_DLLS when testing in Travis and Jenkins while keeping the same footprint for regular users.

>> | (In spite of Brian Ripley's good reasons against it, I'd still
>> | aim for a *dynamic*, i.e. automatically increased, list here.)
>>
>> Yes. Start with 10 or 20, add 10 as needed. Still fast in the 'small N'
>> case and no longer a road block for the 'big N' case required by mlr et al.

This would be nice! Though my concern is the R-core team's time. This is the best answer, but I don't feel comfortable requesting it because I can't help with this and do not want to take up R-core's time without a very significant reason. Unit testing for a meta-package is a particular case, though I think an important one which will impact R over the long term. The answers from least to most complex are something like:

1. Do nothing
2. Increase MAX_NUM_DLLS
3. Compiler flag for MAX_NUM_DLLS (I actually have no reference for how difficult this would be)
4. Change to dynamic loading

I'm requesting (2) because I think it's a simple short-term answer until someone has time to sit down and work out (4).

> May I humbly request a translation of "std::vector" for people like me who
> are not familiar with C++?
>
> I got the following:
>
> > install.packages('std')
> Warning in install.packages :
>   package ‘std’ is not available (for R version 3.3.2)
>
> Thanks,
> Spencer Graves

[...]