[Rd] Conventions: Use of globals and main functions

Peter Meissner retep@me|@@ner @end|ng |rom gm@||@com
Tue Aug 27 15:41:55 CEST 2019


Hey,

I have always found it a strength of R, compared to many other languages, that
simple things (running a script, doing something interactive, writing a
function, using lambdas, installing packages, getting help, ...) are very,
very simple.

R is a command-line statistics program that happens to be a very elegant,
simple and consistent programming language too.

That being said, I think the main task of scripts is to get things done by
running them end to end in a fresh session. Now, it may very well happen
that a lot of stuff has to be done. Then splitting up scripts into
subscripts and sourcing them from a meta script is a straightforward
solution. It might also be that some functionality is put into functions to
be reused in other places. This can be done by putting those function
definitions into separate files. Then one can use source() wherever those
functions are needed. Now, putting code that runs things and code that
defines/provides functions into the same script is a bad idea. Using the
main() idioms described might prevent the problems stemming from
mixing function definitions and function execution. But it would also
encourage this mixing, which is, I think, a bad idea anyway.
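
A minimal sketch of the separation described above, assuming two
hypothetical files: functions.R (definitions only) and run_analysis.R
(application code). The file names and the summarise_values() helper are
made up for illustration, and the files are written to a temporary
directory only so that the example is self-contained.

```r
# Hypothetical layout, created in a temp dir so the sketch runs anywhere
script_dir <- tempfile("scripts_")
dir.create(script_dir)

# functions.R: function definitions only, nothing runs when it is sourced
writeLines(
  c(
    "summarise_values <- function(x) {",
    "  c(mean = mean(x), sd = sd(x))",
    "}"
  ),
  file.path(script_dir, "functions.R")
)

# run_analysis.R: application code that sources the definitions it needs
writeLines(
  c(
    "source('functions.R')",
    "result <- summarise_values(c(1, 2, 3, 4))"
  ),
  file.path(script_dir, "run_analysis.R")
)

# The "meta script" step: run the application script end to end.
# chdir = TRUE lets run_analysis.R find functions.R by relative path.
source(file.path(script_dir, "run_analysis.R"), chdir = TRUE)
print(result)
```

With this layout, any other script can reuse summarise_values() by
sourcing functions.R without triggering the analysis itself.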

Therefore, I am against fostering a main()-idiom - it adds complexity and
encourages bad code structuring (putting application code and function
definition code into one file).
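
For context, the kind of main() idiom under discussion could look roughly
like the sketch below. The body of main() is invented for illustration,
and the sys.nframe() guard is just one convention sometimes seen in the R
community, assumed here rather than taken from this thread.

```r
# Hypothetical script that keeps its working variables inside main()
main <- function() {
  x <- c(1, 2, 3)    # local to main(), not a global variable
  total <- sum(x)
  total
}

# Assumed guard: sys.nframe() is 0 at top level (e.g. under Rscript) and
# greater than 0 when the file is pulled in via source(), so main() only
# runs when the file is executed directly.
if (sys.nframe() == 0L) {
  main()
}
```

Note that the function definition and its execution live in the same
file here, which is exactly the mixing criticised above.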

If one needs code to behave differently in interactive sessions than in
non-interactive sessions, if (interactive()) { } is one way to solve this.
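
A small sketch of such a switch, assuming a hypothetical report() helper
that prints plain messages interactively and timestamped log lines
otherwise:

```r
# Hypothetical helper: same call, different behaviour per session type
report <- function(msg) {
  if (interactive()) {
    # Interactive session: a plain message in the console is enough
    message(msg)
  } else {
    # Non-interactive run (e.g. Rscript): write a timestamped line to stderr
    cat(format(Sys.time()), msg, "\n", file = stderr())
  }
  invisible(msg)
}

report("loading data")
```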

If more solid software development is needed, packages are the way to go.


Best, Peter


On Sun, 25 Aug 2019 at 06:11, Cyclic Group Z_1 via R-devel <
r-devel using r-project.org> wrote:

> In R scripts (as opposed to packages), even in reproducible scripts, it
> seems fairly conventional to use the global workspace as a sort of main
> function, and thus R scripts often populate the global environment with
> many variables, which may be mutated. Although this makes sense given R has
> historically been used interactively and this practice is common for
> scripting languages, this appears to disagree with the software-engineering
> principle of avoiding a mutating global state. Although this is just a rule
> of thumb, in R scripts, the frequent use of global variables is much more
> pronounced than in other languages.
>
> On the other hand, in Python, it is common to use a main function (through
> the `def main():` and `if __name__ == "__main__":` idioms). This is
> mentioned both in the documentation and in the writing of Python's
> main creator. Although this is more beneficial in Python than in R because
> Python code is structured into modules, which serve as both scripts and
> packages, whereas R separates these conceptually, a similar practice of
> creating a main function would help avoid the issues from mutating global
> state common to other languages and facilitate maintainability, especially
> for longer scripts.
>
> Although many great R texts (Advanced R, Art of R Programming, etc.)
> caution against assignment in a parent enclosure (e.g., using `<<-`, or
> `assign`), I have not seen many promote the use of a main function and
> avoiding mutating global variables from top level.
>
> Would it be a good idea to promote use of main functions and limiting
> global-state mutation for longer scripts and dedicated applications (not
> one-off scripts)? Should these practices be mentioned in the standard
> documentation?
>
> This question was motivated largely by this discussion on Reddit:
> https://www.reddit.com/r/rstats/comments/cp3kva/is_mutating_global_state_acceptable_in_r/ .
> Apologies beforehand if any of these (partially subjective) assessments are
> in error.
>
> Best,
> CG