1. Overview#
Google originally created Golang to address the high-concurrency needs of its internal services, and high concurrency remains one of the language's defining features. This article introduces the principles, concepts, and technical points behind concurrency in Golang.
I will first introduce some concepts, such as parallelism and concurrency, processes, threads, and goroutines, along with their differences. Then I will introduce goroutines and channels in Golang, which are key to implementing high concurrency in Golang. I will also discuss select, timers, runtime, and synchronization locks, and finally introduce the concurrency advantages of Go, its concurrency model, and Go's scheduler.
2. Parallelism and Concurrency#
If you have studied operating systems, you should be familiar with parallelism and concurrency.
- Parallelism: at the same moment, multiple instructions execute simultaneously on multiple processors.
- Concurrency: at any given moment only one instruction executes, but instructions from multiple processes are rotated through rapidly (different scheduling algorithms apply in different situations).
Differences between parallelism and concurrency:
- Parallelism exists in multi-processor systems, while concurrency can exist in both single-processor and multi-processor systems.
- Parallelism requires the program to actually execute multiple operations at the same time, while concurrency only requires the program to appear to execute multiple operations simultaneously (executing one operation per time slice, then rotating through multiple operations).
3. Processes, Threads, Goroutines#
A process is an execution environment that contains computer instructions, user data, and system data, as well as other types of resources it is allowed to acquire.
Threads are smaller and lighter entities compared to processes. A thread is created by a process and contains its own control flow and stack. The difference between processes and threads is that a process is an executing binary file, while a thread is a subset of a process.
Goroutines are the smallest unit of concurrent execution in Go programs. Unlike Unix processes, goroutines are not autonomous entities: a goroutine exists inside a process, and that process must have at least one thread. Their main advantage is that they are very lightweight, so thousands can run easily. A goroutine is a user-space lightweight thread whose scheduling is handled entirely in user space rather than by the kernel; switching between goroutines only requires saving the task's context, with no kernel overhead. A thread's stack is typically 2 MB, while a goroutine's stack starts at only 2 KB.
4. Goroutine#
Having introduced the concept of goroutines, let's look at the actual syntax.
In Go, you start a new goroutine with the go keyword followed by a named function call or a complete anonymous function. When you call a function with the go keyword, the call returns immediately: the function runs in the background as a goroutine while the rest of the program continues to execute.
Creating a goroutine:
package main

import (
	"fmt"
	"time"
)

func main() {
	go function() // start a named function as a goroutine

	go func() { // start an anonymous function as a goroutine
		for i := 10; i < 20; i++ {
			fmt.Print(i, " ")
		}
	}()

	time.Sleep(1 * time.Second) // give the goroutines time to run before main exits
}

func function() {
	for i := 0; i < 10; i++ {
		fmt.Print(i)
	}
	fmt.Println()
}
You may notice that the output above is not deterministic, and that main may even exit before the goroutines finish. We can use the sync package to solve this problem.
package main

import (
	"flag"
	"fmt"
	"sync"
)

func main() {
	n := flag.Int("n", 20, "Number of goroutines")
	flag.Parse()
	count := *n
	fmt.Printf("Going to create %d goroutines.\n", count)

	var waitGroup sync.WaitGroup // define a variable of type sync.WaitGroup
	fmt.Printf("%#v\n", waitGroup)

	for i := 0; i < count; i++ { // use a for loop to create the required number of goroutines
		waitGroup.Add(1) // increment the counter before starting each goroutine
		go func(x int) {
			defer waitGroup.Done() // decrement the counter when the goroutine finishes
			fmt.Printf("%d ", x)
		}(i) // pass i as an argument so each goroutine gets its own copy
	}

	fmt.Printf("%#v\n", waitGroup)
	waitGroup.Wait() // Wait blocks until the counter inside the sync.WaitGroup drops to 0, ensuring all goroutines complete
	fmt.Println("\nExiting...")
}
5. Channel#
Channels are a communication mechanism in Go that allows data transmission between goroutines.
Some explicit rules:
- Each channel only allows the exchange of data of a specified type, which is the element type of the channel.
- For a channel to work properly, there must be a receiver ready to take data out of the channel.
You can declare a channel using the chan keyword, and you can close a channel using the close() function.
When using a channel as a function parameter, you can specify it as a one-way channel.
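To make the receiving rule and one-way channel parameters concrete, here is a minimal sketch (the sum function and the channel names are purely illustrative): a receive-only parameter (<-chan int) can only be read from, and a send-only parameter (chan<- int) can only be written to, and the compiler enforces both directions.

package main

import "fmt"

// sum may only receive from in and only send on out.
func sum(in <-chan int, out chan<- int) {
	total := 0
	for v := range in { // range stops once the channel is closed and drained
		total += v
	}
	out <- total
}

func main() {
	in := make(chan int)
	out := make(chan int)
	go sum(in, out)
	for i := 1; i <= 5; i++ {
		in <- i
	}
	close(in)            // tell sum that no more values are coming
	fmt.Println(<-out)   // prints 15
}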
5.1 Writing to a Channel#
package main

import (
	"fmt"
	"time"
)

func main() {
	c := make(chan int)
	go writeToChannel(c, 10)
	time.Sleep(1 * time.Second)
}

func writeToChannel(c chan int, x int) {
	fmt.Println(x)
	c <- x // blocks here: c is unbuffered and nothing ever receives from it
	close(c)
	fmt.Println(x) // never reached, so only one value is printed
}
5.2 Receiving Data from a Channel#
package main

import (
	"fmt"
	"time"
)

func main() {
	c := make(chan int)
	go writeToChannel(c, 10)
	time.Sleep(1 * time.Second)
	fmt.Println("Read:", <-c) // receiving the value unblocks the sender
	time.Sleep(1 * time.Second)

	_, ok := <-c // receiving from a closed channel returns the zero value and ok == false
	if ok {
		fmt.Println("Channel is open!")
	} else {
		fmt.Println("Channel is closed!")
	}
}

func writeToChannel(c chan int, x int) {
	fmt.Println("1", x)
	c <- x
	close(c)
	fmt.Println("2", x)
}
5.3 Passing Channels as Function Parameters#
package main

import (
	"fmt"
)

func main() {
	c := make(chan bool, 1)
	for i := 0; i < 10; i++ {
		go Go(c, i)
	}
	<-c // main only waits for whichever goroutine finishes first
}

func Go(c chan bool, index int) {
	sum := 0
	for i := 0; i < 1000000; i++ {
		sum += i
	}
	fmt.Println(sum)
	c <- true // signal completion over the channel parameter
}
6. Select#
The select statement in Go looks like a switch statement for channels. In fact, select allows a goroutine to wait for multiple communication operations, so the main benefit of using select is that it can handle multiple channels and perform non-blocking operations.
Note: The biggest issue with using channels and select is deadlock. To resolve deadlock issues, synchronization locks will be introduced later.
package main

import (
	"fmt"
	"math/rand"
	"os"
	"strconv"
	"time"
)

func main() {
	rand.Seed(time.Now().Unix())
	createNumber := make(chan int)
	end := make(chan bool)

	if len(os.Args) != 2 {
		fmt.Println("Please give me an integer!")
		return
	}
	n, _ := strconv.Atoi(os.Args[1])
	fmt.Printf("Going to create %d random numbers.\n", n)

	go gen(0, 2*n, createNumber, end)
	for i := 0; i < n; i++ {
		fmt.Printf("%d ", <-createNumber)
	}

	time.Sleep(5 * time.Second) // give enough time for the time.After() case in gen() to fire
	fmt.Println("Exiting...")
	end <- true // activate the <-end case in gen()'s select statement so it cleans up and returns
}

func gen(min, max int, createNumber chan int, end chan bool) {
	for {
		select {
		case createNumber <- rand.Intn(max-min) + min:
		case <-end:
			close(end)
			return
		case <-time.After(4 * time.Second): // time.After delivers a value after the given duration, unblocking the select when the other channels are not ready
			fmt.Println("\ntime.After()!") // this case can be treated as a delayed default branch
		}
	}
}
Note: The select statement does not require a default branch.
The select statement does not evaluate in order, as all channels are checked simultaneously.
If none of the channels in the select statement are ready, the select statement will block until a channel is ready, at which point the Go runtime will make a random selection among the ready channels to ensure fairness.
The greatest advantage of select is that it can connect, orchestrate, and manage multiple channels.
When channels connect goroutines, select connects those channels that connect the goroutines.
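As a small sketch of the non-blocking use of select mentioned above, a default branch lets a goroutine poll a channel without waiting (the channel name messages is only illustrative):

package main

import "fmt"

func main() {
	messages := make(chan string, 1)

	// Non-blocking receive: nothing has been sent yet, so the default branch runs.
	select {
	case msg := <-messages:
		fmt.Println("received", msg)
	default:
		fmt.Println("no message ready")
	}

	messages <- "ping"

	// This time the receive case is ready and is chosen.
	select {
	case msg := <-messages:
		fmt.Println("received", msg)
	default:
		fmt.Println("no message ready")
	}
}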
7. Timers#
The select example above already used a timer (time.After), so what exactly is a timer?
A timer is a mechanism for scheduling a task to be executed at a point in the future.
There are two types of timers:
- One-time delay mode
- Interval mode that executes repeatedly after a certain period
Go's timers are quite comprehensive, with all APIs in the time package.
7.1 Delay Mode#
There are two types of delayed execution: time.After and time.Sleep.
7.1.1 time.After#
package main

import (
	"fmt"
	"time"
)

func main() {
	fmt.Println("1")
	timeAfterTrigger := time.After(1 * time.Second) // returns a channel that receives a value after 1 second
	<-timeAfterTrigger                              // block until that value arrives
	fmt.Println("2")
}
The time package provides several pre-defined constants of type Duration.
const (
	Nanosecond  Duration = 1
	Microsecond          = 1000 * Nanosecond
	Millisecond          = 1000 * Microsecond
	Second               = 1000 * Millisecond
	Minute               = 60 * Second
	Hour                 = 60 * Minute
)
7.1.2 time.Sleep#
package main

import (
	"fmt"
	"time"
)

func main() {
	fmt.Println("1")
	time.Sleep(1 * time.Second) // blocks the current goroutine for 1 second
	fmt.Println("2")
}
The difference between the two is that time.Sleep blocks the current goroutine, while time.After is based on channels and can be passed between different goroutines.
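Because time.After returns a channel, it composes naturally with other channels in a select, for example to add a timeout. A minimal sketch (the slowResult helper is purely illustrative):

package main

import (
	"fmt"
	"time"
)

// slowResult delivers a value after 2 seconds.
func slowResult() <-chan string {
	c := make(chan string, 1)
	go func() {
		time.Sleep(2 * time.Second)
		c <- "done"
	}()
	return c
}

func main() {
	select {
	case r := <-slowResult():
		fmt.Println(r)
	case <-time.After(1 * time.Second): // fires first, so the program prints "timed out"
		fmt.Println("timed out")
	}
}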
7.2 Interval Mode#
The interval mode can be divided into two types: one that ends after executing N times and another that continuously executes without stopping.
7.2.1 time.NewTicker#
package main

import (
	"fmt"
	"time"
)

func main() {
	fmt.Println("1")
	count := 0
	timeTicker := time.NewTicker(1 * time.Second)
	for {
		<-timeTicker.C
		fmt.Println("Output 2 every 1 second")
		count++
		if count >= 5 {
			timeTicker.Stop() // stop the ticker so no more ticks are sent
			break             // without this break, the next receive would block forever
		}
	}
}
7.2.2 time.Tick#
package main

import (
	"fmt"
	"time"
)

func main() {
	// time.Tick returns the ticker's channel directly; the underlying ticker
	// cannot be stopped, so use it only when ticks are needed for the whole program.
	t := time.Tick(1 * time.Second)
	for {
		<-t
		fmt.Println("Output once every 1 second")
	}
}
7.3 Controlling Timers#
Timers provide the Stop and Reset methods.
- The Stop method stops the timer.
- The Reset method changes the timer's interval.
7.3.1 Timer.Stop#
package main

import (
	"fmt"
	"time"
)

func main() {
	timer := time.NewTimer(time.Second * 6)
	go func() {
		<-timer.C
		fmt.Println("Time's up")
	}()
	timer.Stop() // stops the timer before it fires, so "Time's up" is never printed
}
7.3.2 Ticker.Reset#
package main

import (
	"fmt"
	"time"
)

func main() {
	fmt.Println("1")
	count := 0
	timeTicker := time.NewTicker(1 * time.Second)
	for {
		<-timeTicker.C
		fmt.Println("2")
		count++
		if count >= 3 {
			timeTicker.Reset(2 * time.Second) // after three ticks, switch to a 2-second interval
		}
	}
}
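Ticker.Reset changes a ticker's interval; a one-shot Timer can be rescheduled in a similar way. A minimal sketch, following the Stop-and-drain pattern described in the time package documentation:

package main

import (
	"fmt"
	"time"
)

func main() {
	// A one-shot timer originally set to fire after 5 seconds...
	timer := time.NewTimer(5 * time.Second)

	// ...rescheduled to fire after 1 second instead.
	if !timer.Stop() {
		<-timer.C // drain the channel if the timer had already fired
	}
	timer.Reset(1 * time.Second)

	<-timer.C
	fmt.Println("timer fired after the reset duration")
}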
8. Runtime#
The runtime is the infrastructure required to execute Go programs: it controls goroutines, provides debugging support (pprof, trace, and the race detector), handles memory allocation, performs system and CPU-related operations (signal handling, system calls, register operations, atomic operations, and so on), and implements built-in types such as map, channel, and string as well as reflection.
Unlike the runtimes of Java and Python, which run programs inside a virtual machine, Go's runtime is compiled together with user code into a single executable file.
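As a rough illustration of how close user code sits to the runtime, the runtime package exposes a number of introspection functions; a minimal sketch:

package main

import (
	"fmt"
	"runtime"
)

func main() {
	// A few of the runtime's introspection entry points.
	fmt.Println("Go version: ", runtime.Version())
	fmt.Println("OS/Arch:    ", runtime.GOOS+"/"+runtime.GOARCH)
	fmt.Println("CPU cores:  ", runtime.NumCPU())
	fmt.Println("GOMAXPROCS: ", runtime.GOMAXPROCS(0)) // argument 0 queries without changing the value
	fmt.Println("Goroutines: ", runtime.NumGoroutine())
}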
9. Synchronization Locks#
The biggest issue with channels and select mentioned earlier is deadlock. This section introduces a way to deal with such issues: synchronization locks.
Go provides two kinds of synchronization locks: atomic operations and mutex locks.
9.1 Atomic Locks#
A shared flag that is read and written with atomic operations can act as a signal to all goroutines.
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

var (
	shutdown int64 // this flag notifies multiple goroutines of the status
	wg       sync.WaitGroup
)

func main() {
	wg.Add(2)
	go doWork("A")
	go doWork("B")
	time.Sleep(1 * time.Second)
	atomic.StoreInt64(&shutdown, 1) // write the flag atomically
	wg.Wait()
}

func doWork(s string) {
	defer wg.Done()
	for {
		fmt.Printf("Doing homework %s\n", s)
		time.Sleep(2 * time.Second)
		if atomic.LoadInt64(&shutdown) == 1 { // read the flag atomically
			fmt.Printf("Shutting down homework %s\n", s)
			break
		}
	}
}
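For a simple shared counter, atomic operations can also replace a mutex entirely; a minimal sketch (this counter is illustrative, not part of the example above):

package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

func main() {
	var counter int64
	var wg sync.WaitGroup

	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&counter, 1) // safe concurrent increment, no mutex needed
		}()
	}
	wg.Wait()
	fmt.Println("Final Counter:", atomic.LoadInt64(&counter)) // always 100
}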
9.2 Mutex Locks#
Using a mutex, you can wrap a critical section so that only a single goroutine executes it at a time.
package main

import (
	"fmt"
	"runtime"
	"sync"
)

var (
	counter int
	wg      sync.WaitGroup
	mutex   sync.Mutex // protects the critical section below
)

func main() {
	wg.Add(2)
	go incCount(1)
	go incCount(2)
	wg.Wait()
	fmt.Printf("Final Counter: %d\n", counter)
}

func incCount(i int) {
	defer wg.Done()
	for count := 0; count < 2; count++ {
		mutex.Lock()
		{
			value := counter
			runtime.Gosched() // yield the processor to show that the lock still protects the update
			value++
			counter = value
		}
		mutex.Unlock()
	}
}
10. Advantages of Go Concurrency#
Go's built-in higher-level API for concurrent programming is based on the CSP (communicating sequential processes) model. This means explicit locks can often be avoided, because Go synchronizes by transmitting data over channels, which greatly simplifies writing concurrent programs.
In general, a typical desktop computer is already under noticeable load when running a dozen or two OS threads, yet the same machine can easily let hundreds, thousands, or even tens of thousands of goroutines compete for resources.
11. Go Concurrency Model#
Go implements two forms of concurrency:
- Multi-threaded shared memory (communicating through shared memory)
- CSP (communicating sequential processes) concurrency model (sharing memory through communication)
Do not communicate by sharing memory; instead, share memory by communicating.
Threads in Java, C++, and Python communicate through shared memory.
Go's CSP concurrency model is implemented through goroutines and channels.
Example of using goroutines and channels together:
package main

import (
	"fmt"
)

// writeData sends 50 integers into intChan and then closes it.
func writeData(intChan chan int) {
	for i := 1; i <= 50; i++ {
		intChan <- i
		fmt.Println("writeData ", i)
	}
	close(intChan) // closing tells the reader that no more data will arrive
}

// readData drains intChan until it is closed, then signals completion on exitChan.
func readData(intChan chan int, exitChan chan bool) {
	for {
		v, ok := <-intChan
		if !ok { // channel closed and empty
			break
		}
		fmt.Printf("readData read data=%v\n", v)
	}
	exitChan <- true
	close(exitChan)
}

func main() {
	// Create two channels: one for data, one to signal completion.
	intChan := make(chan int, 10)
	exitChan := make(chan bool, 1)

	go writeData(intChan)
	go readData(intChan, exitChan)

	for {
		_, ok := <-exitChan
		if !ok { // exitChan has been closed, so both goroutines are done
			break
		}
	}
}
12. Go Scheduler#
The Go scheduler uses three structures:
- G represents a goroutine; each goroutine corresponds to one G structure. A G stores the goroutine's runtime stack, state, and task function, and G structures can be reused.
- M represents a kernel thread, the actual resource that performs the computation. After binding to a valid P, it enters the scheduling loop, which roughly consists of fetching runnable Gs from the global queue, from P's local queue, and from the wait queue.
- P represents a logical processor, the context needed for scheduling. It can be seen as a local scheduler that lets Go code run on a thread, and it is the key to turning Go's scheduler from an N:1 model into an M:N model.
From G's point of view, P behaves like a CPU core: a G can only be scheduled once it is bound to a P.
From M's point of view, P provides the execution environment (context), such as the memory allocation state (mcache) and the task queue of Gs.
The number of Ps determines the maximum number of Gs that can run in parallel in the system (provided the number of physical CPU cores is at least the number of Ps).
The number of Ps is determined by the user-set GOMAXPROCS; in older versions of Go the number of Ps was capped at 256 regardless of how high GOMAXPROCS was set.
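A minimal sketch of inspecting and adjusting the number of Ps via runtime.GOMAXPROCS (the values printed will vary by machine):

package main

import (
	"fmt"
	"runtime"
)

func main() {
	fmt.Println("CPU cores:          ", runtime.NumCPU())
	fmt.Println("current GOMAXPROCS: ", runtime.GOMAXPROCS(0)) // argument 0 only queries the value

	// Restrict the scheduler to 2 Ps: at most 2 goroutines execute Go code in parallel.
	prev := runtime.GOMAXPROCS(2)
	fmt.Println("previous GOMAXPROCS:", prev)
}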
The classic whack-a-mole analogy is often used to illustrate the relationship among the three:
The mole's task is this: there are piles of bricks on the construction site, and the mole uses a cart to carry the bricks to the fire.
13. Conclusion#
This article introduced some knowledge related to Golang concurrency, starting from basic concepts such as parallelism, concurrency, processes, threads, and goroutines, to practical uses of Golang concurrency, including goroutines, channels, select, timers, and synchronization locks. It also briefly introduced runtime and concluded with Go's scheduler model.