Golang concurrency 101 - basic channel usage

Disclaimer: Much of what I describe here is very broad, and I only scratch the surface. I deliberately keep many topics shallow to give a general idea. If you are not sure what certain terms mean, look them up to get the full picture. Having context speeds up learning and makes things easier to remember. A grounding in Computer Science fundamentals lets you absorb new material without having to hammer it into your head.

Consider the code below:

package main

import (
    "fmt"
)

var i = 0

func increment() {
    i++
}

func main() {
    go increment()

    fmt.Println("i: ", i)
}

What does it do? It creates a global integer variable and defines an `increment` function that increments it on every invocation. In the `main` function the program invokes `increment` in a separate goroutine and immediately prints the contents of the `i` variable. This code is an example of how NOT to write concurrent code: there is global state (the `i` variable) which is mutated by a function (`increment`) running in a separate goroutine (Go's lightweight thread). But this post is not about the principles of writing concurrent code. It is about showing how many things have to be considered when writing a concurrent program, which techniques help deal with them, and which tools are built into Golang. You can treat this post as a shallow introduction to concurrency in programming languages, using Golang as the example. Many of the concepts shown here apply to other programming languages as well.

Let's get back to our code. Do you know what its output will be? Let's run it:

i:  0

Are you surprised? You shouldn't be, this is how concurrency works.

Concurrent is not serial

What does that mean? Beginner programmers, especially those without a CS degree, are often self-taught (just like me) and tend to assume that code always runs line by line, in the order it is written. Sidenote: when I think about my beginnings in programming, it was all about HTML, CSS, PHP and some JavaScript. I had no idea how PHP works. I did not know what a thread is, that PHP is single-threaded, or that the only reason my applications worked for multiple users in parallel was that the Apache web server assigned a worker process to each incoming request. This is knowledge I acquired later, through countless hours spent solving beginner's problems.

So what happened in our program above?

Simply put, the program finished its work in the `main` function and terminated before the goroutine we started had a chance to run and increment the value of `i`.
So how do we make it work, so that the printed value is actually incremented? We have to synchronize the program's execution.

Concurrency is about synchronization

What do we synchronize? In a concurrent program we need to synchronize access to variables. Very briefly and generally, this is because a logical CPU core works through a single stream of instructions at a time. Even though we live in the era of multi-CPU, multi-core architectures that can run multiple programs in parallel, each core still executes its own instructions one after another. What's more, when a program uses several cores at once, only one core can safely modify a particular piece of data at any given moment. When more than one goroutine modifies the same data at the same time, or one modifies it while another reads it, without any coordination, we get a data race, which results in unexpected behaviour. To overcome that issue we need to synchronize the threads in our program, so that their access to data is coordinated and their asynchronous nature is kept under control.
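
To make "coordinated access" a bit more concrete, here is a minimal sketch that jumps slightly ahead and uses two tools from the standard `sync` package, `sync.Mutex` and `sync.WaitGroup`. The `countWithMutex` helper is a hypothetical name made up for this illustration. It starts `n` goroutines that each increment a shared counter once, and the mutex makes sure only one of them touches the counter at a time.

```go
package main

import (
	"fmt"
	"sync"
)

// countWithMutex (hypothetical helper) starts n goroutines that each
// increment a shared counter once. The mutex guarantees that only one
// goroutine reads and writes the counter at any moment.
func countWithMutex(n int) int {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		counter int
	)
	for k := 0; k < n; k++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			counter++
			mu.Unlock()
		}()
	}
	wg.Wait() // block until every goroutine has called Done
	return counter
}

func main() {
	fmt.Println("counter:", countWithMutex(1000)) // prints "counter: 1000"
}
```

Without the `mu.Lock()` / `mu.Unlock()` pair the increments could overlap and the result would no longer be guaranteed. Running a program with `go run -race` is one way to have the Go toolchain report such unsynchronized access.
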
Remember how in the first example the program finished before the increment took place? Let's try to synchronize it in the simplest possible way.

package main

import (
    "fmt"
    "time"
)

var i = 0

func increment() {
    i++
}

func main() {
    go increment()

    time.Sleep(100 * time.Millisecond)

    fmt.Println("i: ", i)
}

What's the outcome? Correct, it's `i:  1`. As you can see, we halt the execution of the `main` function for 100 milliseconds with the `time.Sleep` function. Is this code correct?

No. It's not correct in many ways (the read and the write of `i` are still unsynchronized, for one), but the most important one is that we don't know for sure whether 100 milliseconds is enough to cover 100% of cases. The thing with concurrent programs is that nothing about a thread's execution is guaranteed: not whether it has run yet, not the order, not the schedule. At the top level there is the operating system's scheduler, which decides which programs get how much CPU time. In the program above, 100 milliseconds may in some cases not be enough to increment the variable. It's a rare case, but still possible. A program should be written in a way that lets us determine how it executes. Let's move on and try another example.

package main

import (
    "fmt"
)

var i = 0

func increment() {
    i++
}

func main() {
    go increment()

    iterations := 0
    for {
        iterations++
        if i > 0 {
            break
        }
    }

    fmt.Printf("i: %d, iterations: %d", i, iterations)
}

This is slightly better. After starting our goroutine we set up a "checking" loop, which checks in every iteration whether the variable is greater than zero. It beats the "sleep" example but is still far from perfect: it reads `i` without any synchronization, and what if there is another function that increments our number in a separate goroutine? One thing to note here: I added a simple counter that shows how many iterations the loop has to do before the variable is incremented. The result varies and depends on the runtime. On my machine it's:

i: 1, iterations: 376041

Yes, it takes over 300 000 iterations of the loop for the increment to complete. That's insane. The program spends that much effort waiting for just a single operation. As you can see, using loops is not a good idea either, and it would not even be possible all of the time. Imagine the function doesn't mutate any global variable: then we have nothing to check to find out whether it's done with its work.

Concurrency is about communication

Golang takes a different approach to many things than other languages do, and concurrency is one of them. Idiomatic Golang encourages goroutines to access data by communicating rather than by sharing it. This means that, alongside the traditional concurrency tools (locks, mutexes, semaphores), Golang offers channels as a way for goroutines to communicate and synchronize their work. Let's check an example.

package main

import (
    "fmt"
)

var i = 0
var ch = make(chan bool)

func increment() {
    i++
    ch <- true
}

func main() {
    go increment()

    <-ch

    fmt.Printf("i: %d", i)
}

This is channel usage at its simplest. We still have a global variable and a function that mutates global state, but we also have a global channel that synchronizes our goroutines, is readable, and makes no assumptions about timing. It will always work the same way. I define a global variable holding an unbuffered channel. The `increment` function writes to it every time it runs, right after incrementing the variable. In the `main` function I read a value from the channel. In the most basic case, a goroutine that reads from a channel blocks until a value is available, and a write to an unbuffered channel blocks until another goroutine is ready to read. This is exactly what happens here. We know the write to the channel always happens after the variable is incremented, and `main` reads from the channel before printing the value. The guarantee is that printing always occurs after the increment takes place, without exception.
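
To take "communicating rather than sharing" one step further, here is a minimal sketch of a variation in which the global variable disappears and the channel carries the result itself. The changed signature of `increment` (a `start` argument and a send-only `result` channel) is my own assumption for this illustration, not part of the example above.

```go
package main

import "fmt"

// increment takes the starting value as an argument and sends the
// incremented value on the channel instead of writing to shared state.
func increment(start int, result chan<- int) {
	result <- start + 1
}

func main() {
	result := make(chan int) // unbuffered, like the channel in the example above

	go increment(0, result)

	i := <-result // blocks until increment sends the value

	fmt.Printf("i: %d\n", i) // prints "i: 1"
}
```

Here the only way `main` can learn the new value is by receiving it, so there is no shared variable left to protect.
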

Conclusion

Concurrency is a very broad and deep topic. I advise all programmers to learn its basic concepts, both in programming languages and in CS itself, and then read about how concurrency works in Golang. This post is an introduction, and I hope to write more in this series in the future, where we will take a deeper look at goroutines, channels and ways to synchronize work in concurrent Golang programs.


Author: Peter

I've been a backend programmer for over 10 years and have hands-on experience with Golang and Node.js as well as other technologies, DevOps and Architecture. I share my thoughts and knowledge on this blog.