Re: [Rd] Conventions: Use of globals and main functions

Cyclic Group Z_1 via R-devel Wed, 28 Aug 2019 09:07:10 -0700

I appreciate the well-thought-out comments.

To your first point, I am not sure what "glattering" means precisely (a Google 
search revealed nothing useful), but I assume it means something to the effect 
of overfilling the main namespace with too many names. Per Norm Matloff's 
counterpoint in The Art of R Programming regarding this issue, this is mostly 
avoided by well-defined, (sufficiently) long names. Also, when a program is 
properly modularized, one generally wouldn't have this many objects at the same 
time unless the complexity of a program demands it. You can, for example, use 
named functions defined outside main(), or anonymous functions, to limit a 
variable's scope to the operations that need it. Using main(), with any named 
functions closely tied to the script defined outside it, actually addresses 
this "glattering namespace" issue: if we instead treat the global scope as the 
main function, every function defined in global scope will have all global 
variables within its search path. 
Alternatively, one can put all named functions in a package; in some cases, 
however, it will make more sense to keep a function defined within the script. 
Unless you never modularize your code into functions and flatten everything out 
into a common namespace, using main() would be helpful to avoid 
namespace-glattering. Maybe I'm missing something, but I'm not sure how 
namespace-glattering favors not using a main() idiom, since avoiding globals 
doesn't mean not structuring your code properly; it actually seems to favor 
using main(). Given any properly structured program (organizing functions as 
needed), the implementation that puts all variables into the global workspace 
(same as the top-level functions) will be less safe, since all functions will 
have all globals within their search path (unless, of course, every single 
function is put into a package).
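
A minimal sketch of the search-path point above. The names rescale, 
scale_factor, and main are hypothetical and chosen only for illustration. The 
same helper is called once under a main() idiom and once with the global scope 
acting as "main"; it assumes no object called scale_factor exists before the 
code runs.

```r
# Helper defined at top level. `scale_factor` is a free variable, so R
# resolves it through the helper's enclosing environment (the top level
# and the rest of the search path), not through the caller's frame.
rescale <- function(x) x * scale_factor

# Variant 1: main() idiom. scale_factor is local to main(), so the helper
# cannot see it and the hidden dependency surfaces as an error.
main <- function() {
  scale_factor <- 2
  tryCatch(rescale(1:3), error = function(e) conditionMessage(e))
}
main_result <- main()   # an error message, not a rescaled vector

# Variant 2: the global scope is used as "main". The same helper now works
# silently, because every top-level variable is within its search path.
scale_factor <- 2
global_result <- rescale(1:3)
```

Under the main() idiom the accidental dependency is caught the first time the 
script runs; with everything at top level it goes unnoticed.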


To your second point, I agree that many of the issues associated with global 
state/environment are generally less problematic when using pure (or as pure as 
possible) functions. On a related note, lexically scoped functional languages 
(especially pure functional ones) generally encourage modularizing everything 
into functions, rather than having a lot of objects exposed to the top level 
(not to say that globals are not used, only that they are not the default 
choice). So the typical R practice of keeping many objects at the top level 
tends to disagree with how things are normally done in functional programming. 
Chopping our code into 
well-abstracted functions (and therefore namespaces) is the functional way to 
do things and helps to minimize the state to which any particular function has 
access. Organizing the functions we want to be pure so that they are not 
defined in the same environment in which they are called actually helps to 
ensure function purity in the input direction, since those functions will not 
have lexical-scope access to the caller's variables. (That is, you may have 
written an impure function without realizing it; defining functions in a 
different environment from the one in which they are called helps to ensure 
purity.)
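
To illustrate purity "in the input direction", here is a small sketch with 
hypothetical names (shift_inner, shift_outer, offset). Both functions have the 
same body; the only difference is where each is defined relative to where it 
is called. It assumes no top-level object called offset exists.

```r
main <- function() {
  offset <- 100

  # Defined in the same environment in which it is called: the free
  # variable `offset` resolves to main()'s local, so the function
  # quietly depends on state that is not among its arguments.
  shift_inner <- function(x) x + offset

  inner <- shift_inner(1)

  # shift_outer has no lexical access to main()'s locals, so the same
  # body fails here instead of silently reading `offset`.
  outer <- tryCatch(shift_outer(1), error = function(e) NA_real_)

  list(inner = inner, outer = outer)
}

# Defined at top level, outside the environment in which it is called.
shift_outer <- function(x) x + offset

res <- main()
```

Here res$inner is computed from main()'s local offset, while the call to 
shift_outer errors (recorded as NA), which is the signal that the function 
needs offset passed in as an explicit argument.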

Perhaps I am mistaken, but in either case, your points actually favor a main() 
idiom, unless you take using main() to mean using main() with extra bits (e.g., 
flattening your code structure).

Admittedly, putting every single function into a package and not having any 
named functions in your script generally addresses all of these issues. 

Best,
CG

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
