Hey all, I've started working on this idea, and before getting too far I'd like to get general feedback on the feature before going too far. I'd specifically like to propose a new built-in called `defer` which acts like `eval` however is not parsed/expanded/run until it's scope is leaving. Hopefully "scope" is the correct word, I'm imagining it running at the same time a local would go out of "scope" and be no longer available (just before the locals are cleared, so locals can still be used in expansion). The main purpose of defer is to help with resource management, and more specifically cleanup.
Today cleaning up resources in scripts whether they be files, virtual-machines/containers, or even global state can be challenging for a variety of reasons. It can be very easy to leave extra state/processes running that you may not mean to. Let's take a look first at handling cleanup while "error mode", aka `set -e` is on (we'll cover error mode being off later below, but we'll start with error mode on. Not only because defer works better here, but also because I think many scripts I write want error mode on as manually checking every command for failure can be tedious.). Today there exists four main ways of handling errors with error mode being on: 1. Introduce another function that "wraps" the previous one, and is capable of cleaning up resources. Then hoping no one calls the internal one, maybe even by giving it a scary name like: `__do_not_use_this_unless_you_want_to_do_cleanup_manually_which_you_better_internal_fn_name()`. 2. Push responsibility onto the caller of the function, by having users manually needing to call a cleanup function afterwards. Meaning just calling: `my_function` is incorrect, and callers need to write: `my_function || { cleanup_function; return 1; }`. 3. Don't add complexity to the caller/wrap in a function, but push complexity onto the author of the function itself by manually adding `|| { cleanup; return 1; }` after every command in the function. 4. Don't attempt to clean up the resource at all. If #4 isn't a viable option, or it is but you'd just prefer not to do it, you're left with three options that each either add significant cognitive complexity, or the chance for misuse (or both!). This is where defer comes in, solving the issue of "cleanup" without actually introducing the chance for missing a cleanup through misuse. A very over-simplified, contrived example is below: ``` #!/usr/bin/env bash set -eo pipefail my_function() { local -r tmp_dir=$(mktemp -d) defer rm -r "${tmp_dir}" value=$(command-that-could-fail --save-state "${tmp_dir}/state") if [ "$value" = "success" ]; then could-fail-two --input "$(< ${tmp_dir}/state)" could-fail-three | pipe echo "commands succeeded" else echo "critical failure exiting entire process" exit 1 fi return 0 } ``` In this case no matter how this function exits where there's a problem with a pipe, a command failing, exiting the entire process, or a simple return out successfully; The resource is guaranteed to be cleaned up, assuming rm doesn't fail -- if it did it would clobber the return code to 1 in this case, even on a return of 0. If your script is running with error mode off on purpose, the benefits drastically fall down to just potential easier readability. Rather then needing to create a cleanup function where validation that cleanup is correct, you can co-locate cleanup with the creation of each item. This could make it very easy to validate multi-step cleanups. No longer do you have to open the cleanup function, and the regular function side by side to validate correctness. Take for example the error mode case I mentioned earlier in this paragraph: ``` #!/usr/bin/env bash scoped_error_mode() { if ! echo -n "$SHELLOPTS" | grep 'errexit' >/dev/null 2>&1; then echo "error mode off, enabling for this function" set -e defer set +e fi if ! echo -n "$SHELLOPTS" | grep 'pipefail' >/dev/null 2>&1; then echo "pipefail off, enabling for this function" set -o pipefail defer set +o pipefail fi my_commands my_other_commands | piped-to } ``` Here not only can we scope normally global states to a single function (allowing us to user error mode just where it might be useful, and not everywhere), but as you can see the defer's are directly next to where they are created which means we don't have to save to variables whether or not we need to "turn things back off" again. This at least for most people I think makes it significantly easier to read. The help for the built-in I've been working on looks like: ``` defer: defer [-l] or defer [-d offset] or defer [arg ...] Execute arguments as a shell command when the current scope exists. Queue up a statement to be eval'd when a scope is left. Runs directly before locals in the same scope get cleared. Deferred statements are run in a last in first out order. Options: -d offset delete the defer entry at position OFFSET. Negative offsets count back from the end of the defer list -l list all the active commands being deferred Exit Status: Returns success unless an invalid option is given or an error occurs. ``` Defer can be simulated roughly with functrace similar to how local can be: https://gist.githubusercontent.com/Mythra/de8cdbfdb2b80496b9047b14dffefeb5/raw/91d6599ade2f575ea2d12c2b29e9a6cb829de744/defer-only.sh . It doesn't have any of the listing/delete'ing functionality (but there's no reason it couldn't). This was meant as a very quick rough PoC so the thought could be played with by others. Since I imagine adding a new feature akin to local will be quite the discussion. Curious to hear your thoughts on this feature, and if there's a place for it in Bash. Thanks, Cynthia