Two years ago, Edward Lee at Berkeley wrote an excellent paper called "The Problem With Threads." In his paper he emphasizes that determinism and composability should be the two major objectives when programming concurrent systems. He describes patterns like MapReduce as "coordination languages" and believes that this model is a fruitful route for parallel programming. He arguest that systems that attempt to simplify multithreading "merely chip away at the unnecessarily enormous nondeterminism of the threading model" and that the threading model is "intrinsically intractable." The final paragraph of his essay serves as a manifesto for those of us designing non-threaded frameworks for parallel computing:
If we expect concurrent programming to be mainstream, and if we demand reliability and predictability from programs, then we must discard threads as a programming model. Concurrent programming models can be constructed that are much more predictable and understandable than threads. They are based on a very simple principle: deterministic ends should be accomplished with deterministic means. Nondeterminism should be judiciously and carefully introduced where needed, and should be explicit in programs. This principle seems obvious, yet it is not accomplished by threads. Threads must be relegated to the engine room of computing, to be suffered only by expert technology providers.Threads aren't just harmful because of non-determinism and interlocking issues. Threads are harmful for the same reason Edsger Dijkstra argued Goto is harmful. It's the same reason John Backus apologized for Fortran in his Turing award lecture. The serial instruction-stream control-flow style of traditional computer programming is the wrong way to approach most electronic data processing systems!
The main point of this blog is to argue that any-kind-of-threading is a folly (even single threading) and to put forward the spreadsheet as a better hardware primitive. While hardware realities are forcing the deprecation of the single-threaded instruction stream primitive model, we should deprecate the rest of the von Nuemann model while we're at it. We should avoid the perils of multithreading and transition to a dataflow-centric primitive computing model premised around arrays of reconfigurable dataflow units. Locality constrained dataflow is how electrical engineers design physically realizable systems and programmers will have to follow suit to design concurrent code. As we scale to ever more cores, multi-CPUs will start to resemble FPGAs and addressing the inter-process communication issues will require a more physical dataflow view of the system. (Electrical Engineers don't believe in magic. Computer Scientists don't believe in physics.)
Indeed, part of the folly of DIY dataflow graph (DFG) engines is writing your own multithreaded code to run DFGs on multicore and single threaded executers -- not to discount the NP-Hard problems related to heterogeneous partitioning and FPGA compilation. Let your dataflow framework provider worry about using multithreading platforms to run your DFGs so you can just keep using spreadsheets.