micahlerner.com

Unix Shell Programming: The Next 50 Years (The Future of the Shell, Part I)

Published July 14, 2021

Found something wrong? Submit a pull request!

Unix Shell Programming: The Next 50 Years

This week’s paper won the distinguished presentation award at HotOS 2021, and discusses the potential for future innovation in the tool that many use every day - the shell!A previous submission of this paper on Hacker News elicited a number of strong reactions. One reaction was the assertion that there are in fact modern shells - Elvish features most prominently. While I think several of the comments on the original posting are in the right direction, my takeaway from this paper was that modern shells could take advantage of exciting advances in other areas of systems research (in particular, data flow and transparent parallelization of computation).

It not only proposes a way forward to address (what the authors view) as the shell’s sharp edges, but also references a number of other interesting papers that I will publish paper reviews of over the next few weeks - the gist of several of the papers are mentioned further on in this article:

The good, the bad, and the ugly

In Unix Shell Programming: The Next 50 Years, the authors argue that while the shell is a powerful tool, it can be improved for modern users and workflows. To make this argument, the paper first considers “the good, the bad, and the ugly” of shells in order to outline what should (or should not) change in shells going forward.

The paper identifies four good components of modern shells:

Universal composition: The shell already prioritizes chaining small programs working in concert (which can be written in many different languages), according to the Unix philosophy.
Stream processing: The shell is well structured to perform computation that flows from one command to another through pipes (for example, using xargs). The paradigm of stream processing is an active area of research outside of the shell and shows up in modern distributed systems like Apache Flink or Spark Streaming.
Unix-native: “The features and abstractions of the shell are well suited to the Unix file system and file-based abstractions. Unix can be viewed as a naming service, mapping strings to longer strings, be it data files or programs”
Interactive: A REPL-like environment for interacting with your system translates into user efficiency.

Next - four bad features are detailed, with the note that, “It’s hard to imagine ‘addressing’ these characteristics without turning the shell into something it isn’t; it’s hard to get the good of the shell without these bad qualities”As an example, the paper links to previous research that word expansion (“the conversion of user input into…a command and its arguments”) make up a significant portion of user commands. :

Too arbitrary: Almost any command can be executed as part of a shell pipelineShell tetris! . While this flexibility is useful for interacting with many different components (each of which may be in a different language), the arbitrariness of a shell makes formalizing a shell’s behavior significantly more difficult.
Too dynamic: Shell behavior can depend on runtime execution state, making analysis of shell scripts more difficult (analysis techniques could be helpful for determining undesirable outcomes of shell scripts before running them).
Too obscure: There is a 300 page specification for the POSIX shell, in addition to test suites. Unfortunately, the authors found multiple issues with common shells, and even with the test suites themselves! The undefined nature of what a shell is actually supposed to do in specific situations means that it is hard to make guarantees about correctnessOne of the author’s papers goes more in-depth on the question of ‘What is the POSIX shell?’ .

Lastly, four ugly components are detailed:

Error proneness: There aren’t checks to prevent a user from making mistakes (which could have drastic conditions). Unix/Linux Horror Stories has some good ones (or bad, if you were the person making the mistake!).
Performance doesn’t scale: the shell isn’t set up to parallelize trivially parallelize problems across many cores or machines (which would be very helpful in a modern environment)If this is interesting to you, predominantly all of the papers in the series deal with this problem. .
Redundant recomputation: If a developer makes a change to a shell script, they will have to rerun it in its entirety (unless they are a shell wizard and have gone out of their way to ensure that their script does not do so, while potentially making operations idempotent).
No support for contemporary deployments: Similar to the 2nd point - most shell scripts aren’t designed to take advantage of multiple machines, nor of cloud deployments.

Enabling the shell to move forward

The paper next argues that two sets of recent academic research are enabling the shell to move forward: formalizing the shell and annotation languages.

Recent work on formalizing the shell is detailed in Executable Formal Semantics for the POSIX Shell, which has two major components: Smoosh and libdash - the artifacts for both are open source.

Smoosh is an executable shell specification written in LemWhich can then be translated to different formats, including proof languages like Coq . A shell specification written in code (versus the extensive written specification) meant that the aforementioned paper was able to test various shells for undefined behavior, in the process finding several bugs in implementation (not to mention, bugs in the test suite for the POSIX shell specification!)Another interesting feature of Smoosh is that it provides two interfaces to interact with the OS - one actually invokes syscalls, whereas the other mode simulates syscalls (and is used for symbolic execution). This vaguely reminds me of the testing system used in FoundationDB, covered in a previous paper review. . libdash transforms shell scripts from (or to) abstract syntax trees, and is used by Smoosh.

Annotation languages can allow users to specify how a command runs, in addition to possible inputs and outputs. Strictly specifying a command allows for it to be included as a step (with inputs and outputs) in a data flow graph, enabling more advanced functionality - for example, deciding to divide the inputs of a step across many machines, perform computation in parallel, then coalescing the output. If this type of advanced functionality sounds interesting to you, stay tuned! I’ll be reading about the two papers that fall into this category (PaSH & POSH) over the next few weeks.

After discussing these two research areas, the paper discusses a new project from the authors, called Jash (Just Another SHell). It can act as a shim between the user and the actual execution of a shell command. Eventually, Jash seems like it could implement functionality similar to an execution engine or query planner, evaluating commands at runtime and deciding how to perform the requested work (providing feedback to the user if the script will produce unintended side effects).

The future

The paper outlines five functionalities for the future of the shell:

Distribution: in the context of a shell, this means building a system capable of scaling beyond a single machine (for example, inserting compute resources at different stages of a shell command’s execution to parallelize) - all three of the papers in this series dive deep on this idea.
Incremental support: if a shell script is changed slightly, but can reuse previous computation, a shell could strive to do so.The paper cites Differential Dataflow, which is related to another paper I have had on the backlog for a while - Naiad: A Timely Dataflow System.
Heuristic support: While transforming a shell script into a data flow graph can be facilitated by annotation languages, it would be costly to annotate every shell command. Ideally, the annotation of commands could be performed automatically (or with the support of automation).
User support: A shell should take advantage of modern features like language servers. A formal specification for interacting with the shell can theoretically simplify interactions with the shell.
Formal support: The paper cites how formalization has helped C “tool authors and standards writers”, in particular with respect to undefined behavior. Diving deep on this, I found a few helpful papers that discuss undefined C behavior - in particular this one from Pascal Cuoq and John Regehr).

Conclusion

The shell is an integral part of systems, and this paper makes a case for revisiting the shell’s sharp edges, while revamping its functionality for modern use cases. I’m excited to keep diving deep on this topic - this is the first post in a series I’m doing! If you enjoyed it (or otherwise have suggestions), find me on Twitter. Until next time.

Found something wrong? Submit a pull request!