Backend Development 2026.03.06

Go Senior Engineer's Lecture (MOOC) 008_GMP Scheduler and Go Design Philosophy

Corresponding videos: 9-2 Go Language Scheduler, 18-1 Understanding Go Language Design, 18-2 Course Summary

1. Evolution of Go Scheduler

1.0 Era: Single-threaded Scheduler (Go 0.x)

  • Only one thread runs goroutines
  • All goroutines wait in a queue
  • Cannot utilize multiple cores

1.1 Era: Multi-threaded Scheduler (Go 1.0)

  • Introduced multiple OS threads running goroutines
  • All threads shared one global run queue behind a single lock, so lock contention became a severe performance bottleneck

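1.2+ Era: GMP Model (Go 1.1 to Present)

Go 1.1 introduced the well-known GMP scheduling model, which gives each P its own local run queue and adds work stealing to relieve the global lock.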

2. Detailed Explanation of GMP Model

    G (Goroutine)      - coroutine; a lightweight user-level thread
    M (Machine/Thread) - operating system thread
    P (Processor)      - logical processor; the scheduling context

2.1 Relationship between the three

                     ┌─────────────┐
                     │ Go program  │
                     └──────┬──────┘
                            │ creates goroutines
            ┌───────────────┼───────────────┐
            ▼               ▼               ▼
         ┌─────┐         ┌─────┐         ┌─────┐
         │  G  │         │  G  │         │  G  │    ... (many thousands)
         └──┬──┘         └──┬──┘         └──┬──┘
            │               │               │
   ┌────────┴────────┐      │      ┌────────┴────────┐
   │ P's local queue │      │      │ P's local queue │
   │    [G][G][G]    │      │      │     [G][G]      │
   └────────┬────────┘      │      └────────┬────────┘
            │               │               │
         ┌──┴──┐         ┌──┴──┐         ┌──┴──┐
         │  P  │         │  P  │         │  P  │    (GOMAXPROCS of them)
         └──┬──┘         └──┬──┘         └──┬──┘
            │               │               │
         ┌──┴──┐         ┌──┴──┐         ┌──┴──┐
         │  M  │         │  M  │         │  M  │    (created on demand)
         └──┬──┘         └──┬──┘         └──┬──┘
            │               │               │
      ══════╧═══════════════╧═══════════════╧══════
                    OS kernel threads

2.2 Detailed explanation of each component

G (Goroutine)

  • Initial stack size is only 2KB (OS threads usually reserve 1-8MB)
  • The stack can grow and shrink dynamically
  • Contains the executing function, stack pointer, program counter, etc.
  • States: runnable, running, waiting, completed

M (Machine)

  • Corresponds to an operating system thread
  • Created by the Go runtime as needed, up to 10000 by default
  • An M must hold a P to run a G
  • When an M is blocked in a system call, its P is taken over by another M

P (Processor)

  • The number is determined by GOMAXPROCS (defaults to the number of CPU cores)
  • Each P has a local run queue (up to 256 Gs)
  • P is the core of scheduling, deciding which G runs on which M
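
The gap between the number of Gs and the number of Ps can be observed directly. The following is a minimal sketch rather than a benchmark: it starts a batch of goroutines that all block on a channel and counts them with runtime.NumGoroutine. The helper name spawnParked and the batch size of 10000 are assumptions made for this example.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// spawnParked starts n goroutines that block on a channel and reports how
// many goroutines exist in the process while they are still alive.
// (Hypothetical helper, written only for this illustration.)
func spawnParked(n int) int {
	release := make(chan struct{})
	var started, finished sync.WaitGroup
	started.Add(n)
	finished.Add(n)

	for i := 0; i < n; i++ {
		go func() {
			defer finished.Done()
			started.Done()
			<-release // a blocked G waits without occupying an M
		}()
	}

	started.Wait() // every goroutine has started running
	count := runtime.NumGoroutine()

	close(release) // wake all goroutines so they can exit
	finished.Wait()
	return count
}

func main() {
	fmt.Println("goroutines alive:", spawnParked(10000))
	fmt.Println("number of Ps (GOMAXPROCS):", runtime.GOMAXPROCS(0))
}
```

The count includes the main goroutine, so it is at least n+1, while the number of Ps stays at GOMAXPROCS regardless of how many goroutines exist.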

2.3 Scheduling Policy

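// Query and set the number of Ps
runtime.GOMAXPROCS(0)    // returns the current value without changing it
runtime.GOMAXPROCS(4)    // sets it to 4 and returns the previous value
runtime.NumCPU()         // number of logical CPUs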

Scheduling opportunities (possible goroutine switching points):

| Switching point | Description |
| --- | --- |
| I/O operations | File and network reads/writes |
| Channel operations | When a send or receive blocks |
| select | Multiplexing |
| Waiting for locks | sync.Mutex, etc. |
| Function calls | The compiler inserts checkpoints at function entry |
| runtime.Gosched() | Manual yielding |
| GC | The stop-the-world (STW) phase of garbage collection |
| System calls | M and P separate when a syscall blocks |
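
Blocking channel operations are the easiest switching point to observe. As a small sketch (the function name pingPong is an assumption for this example), two goroutines below hand control back and forth over unbuffered channels; every send or receive that cannot complete immediately parks the current G and lets the scheduler run the other one.

```go
package main

import "fmt"

// pingPong alternates between the caller and a helper goroutine for the given
// number of rounds and returns the order in which the steps ran.
// (Hypothetical helper, written only for this illustration.)
func pingPong(rounds int) []string {
	ping, pong := make(chan int), make(chan int)
	done := make(chan struct{})
	var steps []string // only touched by whichever side currently holds the "turn"

	go func() {
		defer close(done)
		for i := range ping { // blocks until the caller sends
			steps = append(steps, fmt.Sprintf("pong %d", i))
			pong <- i // blocks until the caller receives
		}
	}()

	for i := 0; i < rounds; i++ {
		steps = append(steps, fmt.Sprintf("ping %d", i))
		ping <- i // hand the turn to the helper goroutine
		<-pong    // wait for it to hand the turn back
	}
	close(ping)
	<-done
	return steps
}

func main() {
	for _, s := range pingPong(3) {
		fmt.Println(s)
	}
}
```

Because the channels are unbuffered, the steps strictly alternate: ping 0, pong 0, ping 1, pong 1, and so on.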

2.4 Work Stealing

When a P’s local queue is empty:

  1. First, it tries to get Gs from the global queue.
  2. If the global queue is also empty → it picks another P at random and steals half of the Gs in that P’s queue.
  3. If still none → it checks the network poller.
  4. If still none → M sleeps.

    P1 [empty]        P2 [G5 G6 G7 G8]
       │                    │
       │    steal half!     │
       │ ◄───── steal ───── │
       │                    │
    P1 [G7 G8]        P2 [G5 G6]
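
The "steal half" step can be modeled with plain slices. This is only a toy model under stated assumptions: the real runtime uses a fixed-size lock-free ring buffer per P, and the function name stealHalf, the rounding rule, and the choice of which half moves (picked here to match the diagram) are all assumptions for illustration.

```go
package main

import "fmt"

// stealHalf moves half of the victim's queued items (rounded up) to the
// thief and returns both resulting queues. It copies the data, so the
// caller's slice is left untouched.
// (Toy model; not the runtime's actual implementation.)
func stealHalf(victim []string) (stolen, remaining []string) {
	n := (len(victim) + 1) / 2 // number of items to take
	cut := len(victim) - n
	stolen = append(stolen, victim[cut:]...)
	remaining = append(remaining, victim[:cut]...)
	return stolen, remaining
}

func main() {
	p2 := []string{"G5", "G6", "G7", "G8"}
	p1, p2 := stealHalf(p2)
	fmt.Println("P1", p1, "P2", p2) // prints: P1 [G7 G8] P2 [G5 G6]
}
```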

2.5 System Call Handling (Hand Off)

When a G executes a system call (e.g., file I/O) and blocks M:

  Normal:      M1 ──── P1 ──── G1 (blocks in syscall)

  Hand Off:    M1 ──── G1 (still blocked in syscall)
               M2 ──── P1 ──── G2 (P handed to another M)

P will be handed off to an idle M (or a new M will be created) to keep P busy.

2.6 Network Poller (netpoller)

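Network I/O does not block M; instead, the runtime waits for readiness through the operating system's I/O multiplexing facility (epoll on Linux, kqueue on BSD/macOS):

  1. G initiates a network call that cannot complete yet → G is parked and attached to the netpoller.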
  2. M is not blocked and continues to run other Gs.
  3. When the network is ready → G is put back into the runnable queue.

This is why Go can efficiently handle a large number of network connections.
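
From the programmer's side this machinery is invisible: code is written as ordinary blocking calls, one goroutine per connection. The sketch below assumes a loopback TCP port is available; the function name echoOnce and the message are made up for this example. Each Accept and Read that has to wait parks only the calling goroutine.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// echoOnce starts a throwaway TCP echo server on a loopback port chosen by
// the OS, sends msg to it, and returns the reply.
// (Hypothetical helper, written only for this illustration.)
func echoOnce(msg string) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()

	go func() {
		conn, err := ln.Accept() // waits for a client without blocking other goroutines
		if err != nil {
			return
		}
		defer conn.Close()
		io.Copy(conn, conn) // echo everything back until the client closes
	}()

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()

	if _, err := conn.Write([]byte(msg)); err != nil {
		return "", err
	}
	reply := make([]byte, len(msg))
	if _, err := io.ReadFull(conn, reply); err != nil {
		return "", err
	}
	return string(reply), nil
}

func main() {
	reply, err := echoOnce("hello")
	if err != nil {
		fmt.Println("echo failed:", err)
		return
	}
	fmt.Println(reply)
}
```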

3. Comparison of Goroutines and Threads

| Feature | Goroutine | OS thread |
| --- | --- | --- |
| Stack size | 2KB (dynamic growth) | 1-8MB (fixed) |
| Creation cost | ~0.3μs | ~30μs |
| Switching cost | ~0.2μs (user mode) | ~1μs (kernel mode) |
| Scheduling method | Cooperative + preemptive | Preemptive |
| Magnitude | Millions | Thousands |
| Communication method | channel | Shared memory + locks |

Preemptive Scheduling (Go 1.14+)

Before Go 1.14, preemption was cooperative: a goroutine could only be switched out at checkpoints such as function calls, so a CPU-intensive goroutine with no such checkpoint could occupy its M for a long time:

// Before Go 1.14, this would monopolize its thread
go func() {
    for {} // infinite loop that never yields
}()

Go 1.14 introduced asynchronous, signal-based preemption (using SIGURG), so a goroutine can be preempted even inside a loop that makes no function calls.
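
The effect can be checked with a small experiment. The sketch below assumes Go 1.14 or later on a platform that supports signal-based preemption; the function name sleepBesideBusyLoop is made up for this example. It limits the program to one P, starts a goroutine that spins forever, and then lets the calling goroutine sleep. The caller can only wake up again if the runtime preempts the spinning goroutine.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// sleepBesideBusyLoop runs a never-yielding goroutine on a single P and
// returns how long the caller's 50ms sleep actually took. On a runtime
// without asynchronous preemption this call would be expected to hang.
// (Hypothetical helper, written only for this illustration.)
func sleepBesideBusyLoop() time.Duration {
	prev := runtime.GOMAXPROCS(1) // one P, so the two goroutines must share it
	defer runtime.GOMAXPROCS(prev)

	go func() {
		for {
			// busy loop with no function calls; it keeps spinning
			// until the process exits
		}
	}()

	start := time.Now()
	time.Sleep(50 * time.Millisecond) // gives the P to the busy loop
	return time.Since(start)
}

func main() {
	fmt.Println("main resumed after", sleepBesideBusyLoop())
}
```

The measured duration is usually somewhat longer than 50ms, because the sleeping goroutine has to wait for the next preemption of the busy loop before it can run again.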

4. Observing the Scheduler in Practice

# Print scheduler state with GODEBUG
GODEBUG=schedtrace=1000 go run main.go
# Prints the scheduler state every 1000ms

# More detail
GODEBUG=schedtrace=1000,scheddetail=1 go run main.go

# Visualize with go tool trace

// Inspect from code
fmt.Println("goroutines:", runtime.NumGoroutine())
fmt.Println("CPU cores:", runtime.NumCPU())
fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))

5. Go Language Design Philosophy

Corresponding video Ch18: Understanding Go Language Design

5.1 Less is more

  • Only 25 keywords (C has 32, Java has 50+)
  • No classes, replaced by struct + method
  • No inheritance, replaced by composition (embedding)
  • No generics before Go 1.18, to keep the language simple
  • No exceptions; errors are returned as values (multiple return values + error) instead of try/catch
  • Only for loops, no while/do-while
  • No ternary operator ?:

5.2 Composition over Inheritance

// Composition, not inheritance
type Animal struct {
    Name string
}
func (a Animal) Speak() string { return a.Name + " speaks" }

type Dog struct {
    Animal  // embedded, not inherited
    Breed string
}
// Dog gets the Speak method automatically, but can define its own Speak to override it
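
A runnable version makes both behaviors visible. In the sketch below, Animal and Dog come from the snippet above, while LoudDog is a hypothetical type added for illustration: it embeds Animal too, but defines its own Speak, which shadows the promoted method and still reaches the embedded one explicitly.

```go
package main

import "fmt"

type Animal struct {
	Name string
}

func (a Animal) Speak() string { return a.Name + " speaks" }

// Dog embeds Animal, so Animal's fields and methods are promoted to Dog.
type Dog struct {
	Animal
	Breed string
}

// LoudDog is a hypothetical type for this example. Its own Speak method
// shadows the promoted Animal.Speak.
type LoudDog struct {
	Animal
}

func (d LoudDog) Speak() string {
	// The embedded method is still reachable through the field name.
	return d.Animal.Speak() + " loudly"
}

func main() {
	d := Dog{Animal: Animal{Name: "Rex"}, Breed: "Labrador"}
	fmt.Println(d.Speak()) // prints: Rex speaks

	l := LoudDog{Animal: Animal{Name: "Max"}}
	fmt.Println(l.Speak()) // prints: Max speaks loudly
}
```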

5.3 Implicit Interface Implementation

// No implements keyword is needed
type Stringer interface {
    String() string
}
// Any type with a String() string method automatically implements Stringer
// → decouples the interface's definer from its implementers
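
As a small illustration, the Celsius type below never mentions Stringer, yet it can be passed wherever a Stringer is expected. The names Celsius and describe are assumptions made for this example.

```go
package main

import "fmt"

type Stringer interface {
	String() string
}

// Celsius is a hypothetical type for this example. It satisfies Stringer
// simply by having a String() string method.
type Celsius float64

func (c Celsius) String() string { return fmt.Sprintf("%.1f°C", float64(c)) }

// Optional compile-time check that Celsius implements Stringer.
var _ Stringer = Celsius(0)

// describe accepts any value whose type implements Stringer.
func describe(s Stringer) string { return "value: " + s.String() }

func main() {
	fmt.Println(describe(Celsius(21.5))) // prints: value: 21.5°C
}
```

The blank-identifier assignment is a common idiom for catching a missing method at compile time without adding any runtime cost.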

5.4 Concurrency is a First-Class Citizen

  • go keyword to start goroutines (not a library function)
  • chan is a built-in type (not a Queue from a library)
  • select is language-level multiplexing
  • “Don’t communicate by sharing memory; share memory by communicating.”
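
The three language features fit together in a few lines. The following is a minimal sketch of a worker pool, with the function name sumSquares and the worker count chosen for illustration: go starts the workers, channels carry jobs and results, and select lets the collector wait on results and on the completion signal at the same time.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares squares every number using the given number of worker
// goroutines and returns the sum of the squares.
// (Hypothetical helper, written only for this illustration.)
func sumSquares(nums []int, workers int) int {
	jobs := make(chan int)
	results := make(chan int)
	done := make(chan struct{})

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := range jobs {
				results <- n * n
			}
		}()
	}

	// Feed the jobs, then signal that no more are coming.
	go func() {
		for _, n := range nums {
			jobs <- n
		}
		close(jobs)
	}()

	// Close done once every worker has delivered its last result.
	go func() {
		wg.Wait()
		close(done)
	}()

	total := 0
	for {
		select {
		case r := <-results:
			total += r
		case <-done:
			// results is unbuffered, so every send has already been received.
			return total
		}
	}
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4}, 2)) // prints: 30
}
```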

5.5 Toolchain Philosophy

| Tool | Purpose |
| --- | --- |
| gofmt | Standardizes code style; no style wars |
| go vet | Static analysis; finds common errors |
| go test | Built-in testing framework; no third-party library needed |
| go doc | Comments are the documentation |
| go build | Compiles into a single binary with no dependencies |
| go mod | Built-in dependency management |

5.6 Error Handling Philosophy

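// Handle every error explicitly so that none stays hidden
f, err := os.Open("file.txt")
if err != nil {
    // must be handled
}
// Verbose, but:
// 1. The error path is clearly visible
// 2. No crash because someone forgot a catch
// 3. It forces you to think about every possible failure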
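
Because errors are ordinary values, they can be wrapped with context and inspected later. The sketch below is one possible shape, not a prescribed pattern: the function names readConfig and loadOrDefault, the file name, and the "defaults" fallback are assumptions for this example. It uses %w to wrap the error and errors.Is to recognize a missing file.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// readConfig reads a file and adds context to any error it returns.
func readConfig(path string) (string, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		// %w keeps the original error available to errors.Is / errors.As.
		return "", fmt.Errorf("read config %q: %w", path, err)
	}
	return string(data), nil
}

// loadOrDefault falls back to a default when the file does not exist and
// passes every other error up to the caller.
func loadOrDefault(path string) (string, error) {
	cfg, err := readConfig(path)
	if errors.Is(err, fs.ErrNotExist) {
		return "defaults", nil
	}
	if err != nil {
		return "", err
	}
	return cfg, nil
}

func main() {
	cfg, err := loadOrDefault("no-such-config-file.txt")
	if err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Println("config:", cfg)
}
```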

5.7 Go Proverbs by Rob Pike

| Proverb | Meaning |
| --- | --- |
| Don’t communicate by sharing memory, share memory by communicating | Use channels instead of locks |
| Concurrency is not parallelism | Concurrency is structure, parallelism is execution |
| Channels orchestrate; mutexes serialize | Channels orchestrate flow, mutexes serialize access |
| The bigger the interface, the weaker the abstraction | Smaller interfaces are better |
| Make the zero value useful | The zero value should be directly usable |
| interface{} says nothing | The empty interface expresses no information |
| Gofmt’s style is no one’s favorite, yet gofmt is everyone’s favorite | A consistent style matters more than a pretty one |
| A little copying is better than a little dependency | Copying a little code beats introducing a dependency |
| Clear is better than clever | Clarity is better than cleverness |
| Errors are values | Errors are ordinary values and can be handled programmatically |
| Don’t just check errors, handle them gracefully | Handle errors gracefully; don’t just check them |
| Don’t panic | Don’t panic easily |
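
"Make the zero value useful" is the proverb that most directly shapes API design. As a minimal sketch (the Counter type is hypothetical), the struct below needs no constructor: the zero sync.Mutex is already an unlocked mutex, and the map is created lazily on first use.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is a hypothetical type whose zero value is ready to use.
type Counter struct {
	mu     sync.Mutex
	counts map[string]int
}

// Inc increments the count for key and returns the new value.
func (c *Counter) Inc(key string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.counts == nil {
		c.counts = make(map[string]int) // lazy initialization
	}
	c.counts[key]++
	return c.counts[key]
}

func main() {
	var c Counter // no constructor call needed
	c.Inc("hits")
	fmt.Println(c.Inc("hits")) // prints: 2
}
```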

6. Course Summary Mind Map

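Go language core
├── Basic syntax
│   ├── Variables, constants, types
│   ├── Control flow (if/for/switch)
│   └── Functions (multiple return values, closures, defer)
├── Object orientation
│   ├── struct + method (instead of class)
│   ├── Composition (instead of inheritance)
│   └── interface (implicit implementation, duck typing)
├── Functional programming
│   ├── Closures, higher-order functions
│   ├── Decorator/middleware pattern
│   └── Functional Options
├── Error handling
│   ├── error interface, multiple return values
│   ├── panic/recover (only for unrecoverable errors)
│   └── Unified error handling (errWrapper pattern)
├── Testing
│   ├── Table-driven tests
│   ├── Performance tests (Benchmark)
│   ├── Profiling with pprof
│   └── Example tests (documentation tests)
├── Concurrent programming
│   ├── goroutine (GMP scheduling model)
│   ├── channel (CSP communication model)
│   ├── select (multiplexing)
│   └── sync package (WaitGroup, Mutex)
├── Standard library
│   ├── net/http
│   ├── encoding/json
│   └── html/template
└── Engineering practice
    ├── Single-task → concurrent → distributed crawler
    ├── ElasticSearch integration
    ├── Docker containerization
    └── Distributed communication over RPC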