Note: you can try this tutorial in Binder.

What is CSP

We have said that aiochan is a CSP-style concurrency library, and we explained what concurrency means. So what is CSP? CSP stands for Communicating Sequential Processes, and understanding what the three words in turn mean will get us a long way towards understanding how to use aiochan.

Let’s begin with the last word: processes. Immediately, we have encountered an opportunity for great confusion: there are simply too many things in computing that are called “processes” at different times, by different people, and meaning subtly different things. Without dwelling on dictionary definitions, let’s agree that in this section, a process just means a group of computer code that executes fairly independently from other codes and from the outside world, or, a group of code that you can mentally think of as a whole entity. The quintessential example is that of a function: a function as a process goes from taking in arguments from the caller, and ends when returning to the caller. A better word might be “task” here, but let’s just stick with the original wording.

So for us, a process is something that is logically like a function, for now. What is a sequential process, then? If you read the word literally, it means that statements or expressions in your function are executed or evaluated in strict order, from top to bottom. Now this is also problematic: we have so-called control statements like while, for, etc., which are useful because they disrupt the sequential flow of your function. However, when your program is running, and all variables and arguments have concrete values, it is true that your statements in a function are executed one by one, in some deterministic order as specified by the control statements in the function. Deterministic is the keyword here: it is possible to predict what happens next by knowing the current state. In this section, we will use the word sequential in this sense: deterministically equivalent to sequential execution.

For example, let’s look at the following snippet:

In [1]:
x = 10
x += 10
x *= 2
x -= 7
print(x)
33

The above calculates ((10 + 10) * 2) - 7 = 33 and is sequential. If your programming language instead calculates ((10 * 2) + 10) - 7 = 3 then you have some serious issues. So sequential programs are good, it is what we as humans expect.

However, it is actually very easy to have non-sequential, or non-deterministic programs. Let’s first refactor the above program:

In [2]:
x = 10

def f():
    global x
    x += 10
    x *= 2
    x -= 7
    return x

print(f())
33

So far so good. But suppose you have two instances of the f process executing concurrently. In the following, we illustrate the interleaving of statements due to concurrency by putting two copies of the function side by side, and the execution order by the order that the statements appear:

x = 10

def f1():                  | def f2():
    global x # x == 10     |
                           |     global x # x == 10
    x += 10 # x == 20      |
                           |     x += 10 # x == 30
    x *= 2  # x == 40      |
                           |     x *= 2  # x == 80
    x -= 7  # x == 73      |
    return x               |
                           |     x -= 7  # x == 66
                           |     return x

print('f1 ->', f1())
print('f2 ->', f2())

We will get the results:

f1 -> 73
f2 -> 66

In this example, if you are only in control of f1 you will be very much baffled. As you can try for yourself, by tweaking the order further you can get other results. This is despite the fact that within each function itself the sequence of statements is the same as before. So in our lingo, within a single process, the execution is now non-deterministic. We also call such processes non-sequential in order to align with the name CSP, although this terminology is obviously prone to misunderstanding.

In this case, the fix is actually not that hard: don’t modify global variables. Any modifications you do must be local to your process. In functional languages, it is sometimes enforced that you cannot make any modifications at all — any computation you do just returns new values without stepping on the old values. However, we are writing python, and in python, such restriction is both unnecessary and unnatural. We only need to disallow operations that can interfere with other processes.

Now, you ask, what disturbed minds would write something like our f? Well, be assured that that people who wrote f habour no ill intensions. The reason that they reach for global variables is most often the need for input/output, or IO (note that the concept of IO is much broader than file or network accesses). We need to get stuff into our process to compute, and we need to notify other processes who are also computing what our results are.

Indeed, IO is the whole point of computation: we, at our keyboards (or touch screens, or whatever your newest VR/AR interaction devices), input something for the computer to compute, the the computer returns the results to us. Programs without IO are pretty useless. Using global variables for IO is also rather convenient: we just take something (input) from predetermined boxes (or memory addresses), and when we are done, just put the result into some other predetermined boxes. Other processes, by inspecting the boxes, will know what we have done. At the lowest level, this is actually what our current computer architecture dictates. A “pure” function that “returns” something without reference to an address is an illusion. But unfortunately, as we have seen, this crude arrangement results in processes stepping on each other and chaos if there are no rules.

The old and mainstream solution is that we put stickers on the boxes, or locks on memory addresses, when we want to operate on them: “in use — don’t touch until I’m done!” This solves the problem, but using locks and similar concurrency primitives turn out to be rather delicate and error-prone. Suppose you and your friend both want to operate on two boxes A and B. You go forward and put your sticker on A, meanwhile your friend has already put his sticker on B. Now both of you are stuck: unless one of you back off, no one can go forward. Preventing such deadlocks means having a whole lot of disciplines and guidelines to follow — more training to become professional jugglers!

Is there a way out of this? Is there a way to avoid arduous juggler training while still doing concurrency? Yes, and this is what the communicating part of CSP says.

The basic idea is this: when doing computations that must involve IO, instead of boxes, we use meeting points, or rendezvous. For example, you and your friend both want a book. Instead of putting the book in a box so that both of you can do whatever you want with it whenever you want (and risking the book to be stolen), you just take the book away and do your processing with it. After you are satisfied, you and your friend meet together and you hand off the book to your friend. Once your friend has the book, she can do anything she wants with it, while you can no longer do anything with it at all. There is no longer any stepping over. If you really want your book again, you must arange with your friend for a hand-off again.

Such an arrangment is psychologically familiar (it is how private properties work). Those meeting points give us back sequential process: things look as if we are in a non-concurrent program, and the only surprises are where we expect them to be: at the meeting points, just like we expect to find something we don’t already know when we read a file in. No stickers. No locks.

So does CSP eliminate all deadlocks? No, but almost yes. What we mean is that it is of course possible to have a deadlock in CSP, by for example waiting on a rendezvous that no one else would complete. However, such situations are rather obvious, and if they ever occur in your programs, they are very easy to spot. You probably won’t encounter any deadlocks other than the most obvious and most trivial ones. This is unlike working with locks where you need to be careful on every step. What this means is that CSP is on a much higher level of abstraction than locks and synchronizations. It is no coincidence that working with locks very often feels like doing pointer arithmetics.

Not only does communicating over rendezvous solve the majority of problems that traditionally require the use of lock, it also solves these problems while respecting the privacy, or abstraction barriers, of the participating processes. Consider the box-book example again. If we want to use stickers to solve it, you and your friend both have to agree on a strategy, for example, always start with box A. Now you are both opening yourselves up to each other, letting the other know things about how you operate, which you may be reluctant to with someone you just met. By contrast, when using rendezvous for communication, the existence of rendezvous and whether you intend to use it for reading or writing is often sufficient for correct execution. The abstraction barrier is respected!

The rest of this tutorial will go into much more details in how to go about setting up and honouring rendezvous, which in the context of aiochan, is called a channel, or Chan. But first, we need to deal with some environment setups in the next section.

To recap, in the context of CSP (Communicating Sequantial Processes):

  • Processes are group of codes that can be considered as an independent entity.
  • Sequential processes are processes that operate in deterministic order producing deterministic results, without danger of stepping over each other.
  • Communicating sequantial processes are sequantial processes that do their IO by rendezvous only.
  • CSP style concurrency enables natural program logic resembling non-concurrent codes, respects abstraction barriers, while at the same time eliminating most of the dangers of deadlocks.