Luke,
sure, adjustment at run-time works just fine; the issue currently is that it is baked in at compile time, so there is no way to adjust it (re-building R is not an option in the production environments where this usually happens). That said, I'm still not sure that a connection limit is a good way to guard against the fd limit, since there are so many other ways to use up descriptors (DLLs, sockets, pipes, etc. - packages and 3rd-party libraries). Apparently we are actually already fiddling with the soft limit - we have R_EnsureFDLimit() and R_GetFDLimit(), which are used at startup to raise it to 1024 by default regardless of the ulimit -n setting (the comments say this is for DLLs). Based on that we know at least what to expect, so we could trivially warn if the new setting is larger than the user limit.

Cheers,
Simon


> On Aug 25, 2021, at 1:45 PM, luke-tier...@uiowa.edu wrote:
>
> We do need to be careful about using too many file descriptors. The
> standard soft limit on Linux is fairly low (1024; the hard limit is
> usually quite a bit higher). Hitting that limit, e.g. with runaway
> code allocating lots of connections, can cause other things, like
> loading packages, to fail with hard-to-diagnose error messages. A
> static connection limit is a crude way to guard against that. Doing
> anything substantially better is probably a lot of work. A simple
> option that may be worth pursuing is to allow the limit to be adjusted
> at runtime. Users who want to go higher would do so at their own risk
> and may need to know how to adjust the soft limit on the process.
>
> Best,
>
> luke
>
> On Wed, 25 Aug 2021, Simon Urbanek wrote:
>
>>
>> Martin,
>>
>> I don't think a static connection limit is sensible. Recall that
>> connections can be anything, not necessarily just sockets or file
>> descriptors, so they are not linked to the system fd limit. For
>> example, if you use a codec then you will need twice as many
>> connections as fds.
>> To be honest, the connection limit is one of the main reasons why in
>> our big-data applications we have always avoided R connections and
>> used C-level sockets instead (another was the lack of control over
>> the socket flags, but that has been addressed in the last release).
>> So I'd vote for at the very least increasing the limit significantly
>> (at least 1k if not more) and, ideally, making it dynamic if memory
>> footprint is an issue.
>>
>> Cheers,
>> Simon
>>
>>
>>> On Aug 25, 2021, at 8:53 AM, Martin Maechler <maech...@stat.math.ethz.ch> wrote:
>>>
>>>>>>>> GILLIBERT, Andre
>>>>>>>>     on Tue, 24 Aug 2021 09:49:52 +0000 writes:
>>>
>>>> RConnection is a pointer to a Rconn structure. The Rconn
>>>> structure must be allocated independently (e.g. by
>>>> malloc() in R_new_custom_connection). Therefore,
>>>> increasing NCONNECTION to 1024 should only use 8
>>>> kilobytes on 64-bit platforms and 4 kilobytes on 32-bit
>>>> platforms.
>>>
>>> You are right indeed, and I was wrong.
>>>
>>>> Ideally, it should be dynamically allocated: either as
>>>> a linked list or as a dynamic array
>>>> (malloc/realloc). However, a simple change of
>>>> NCONNECTION to 1024 should be enough for most uses.
>>>
>>> There is one other important problem I've been made aware of
>>> (similar to the issue with the number of open DLLs, 1-2
>>> years ago):
>>>
>>> The OS itself has limits on the number of open files
>>> (yes, I know that there are other connections than files), and
>>> these limits may differ quite a bit from platform to platform.
>>>
>>> On my Linux laptop, in a shell, I see
>>>
>>> $ ulimit -n
>>> 1024
>>>
>>> which is barely conformant with your proposed 1024 NCONNECTION.
>>>
>>> Now if NCONNECTION is larger than the maximum allowed number of
>>> open files and R opens more files than the OS allows, the
>>> user may get quite unpleasant behavior, e.g. R being terminated
>>> brutally (or behaving crazily) without good R-level warning /
>>> error messages.
>>>
>>> It's also not at all sufficient to check for the open-files limit
>>> at compile time; it needs to be checked at R process startup time.
>>>
>>> So this may need considerably more work than you / we have
>>> hoped, and it's probably hard to find a safe number that is
>>> considerably larger than 128 and less than the smallest of all
>>> non-crazy platforms' {number of open files limit}.
>>>
>>>> Sincerely
>>>> André GILLIBERT
>>>
>>> [............]
>>>
>>> ______________________________________________
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone: 319-335-3386
> Department of Statistics and        Fax:   319-335-3017
>    Actuarial Science
> 241 Schaeffer Hall                  email: luke-tier...@uiowa.edu
> Iowa City, IA 52242                 WWW:   http://www.stat.uiowa.edu