29.2. 記憶體管理

29.2.1. 具有 4 級頁表的完整虛擬記憶體對映

注意

  • 諸如 “-23 TB” 之類的負地址是位元組為單位的絕對地址,從 64 位地址空間的頂部向下計數。 以絕對地址和與頂部距離的符號來看佈局更容易理解。

    例如 0xffffe90000000000 == -23 TB,它比 64 位地址空間的頂部 (ffffffffffffffff) 低 23 TB。

    請注意,當我們接近地址空間的頂部時,符號會從 TB 更改為 GB,然後更改為 MB/KB。

  • “16M TB” 乍一看可能很奇怪,但它是一種比 “16 EB” 更容易視覺化大小符號的方式,很少有人會乍一看就認出 16 艾位元組。 它也很好地展示了 64 位地址空間是多麼的巨大。

========================================================================================================================
    Start addr    |   Offset   |     End addr     |  Size   | VM area description
========================================================================================================================
                  |            |                  |         |
 0000000000000000 |    0       | 00007fffffffefff | ~128 TB | user-space virtual memory, different per mm
 00007ffffffff000 | ~128    TB | 00007fffffffffff |    4 kB | ... guard hole
__________________|____________|__________________|_________|___________________________________________________________
                  |            |                  |         |
 0000800000000000 | +128    TB | 7fffffffffffffff |   ~8 EB | ... huge, almost 63 bits wide hole of non-canonical
                  |            |                  |         |     virtual memory addresses up to the -8 EB
                  |            |                  |         |     starting offset of kernel mappings.
                  |            |                  |         |
                  |            |                  |         | LAM relaxes canonicallity check allowing to create aliases
                  |            |                  |         | for userspace memory here.
__________________|____________|__________________|_________|___________________________________________________________
                                                            |
                                                            | Kernel-space virtual memory, shared between all processes:
__________________|____________|__________________|_________|___________________________________________________________
                  |            |                  |         |
 8000000000000000 |   -8    EB | ffff7fffffffffff |   ~8 EB | ... huge, almost 63 bits wide hole of non-canonical
                  |            |                  |         |     virtual memory addresses up to the -128 TB
                  |            |                  |         |     starting offset of kernel mappings.
                  |            |                  |         |
                  |            |                  |         | LAM_SUP relaxes canonicallity check allowing to create
                  |            |                  |         | aliases for kernel memory here.
____________________________________________________________|___________________________________________________________
                  |            |                  |         |
 ffff800000000000 | -128    TB | ffff87ffffffffff |    8 TB | ... guard hole, also reserved for hypervisor
 ffff880000000000 | -120    TB | ffff887fffffffff |  0.5 TB | LDT remap for PTI
 ffff888000000000 | -119.5  TB | ffffc87fffffffff |   64 TB | direct mapping of all physical memory (page_offset_base)
 ffffc88000000000 |  -55.5  TB | ffffc8ffffffffff |  0.5 TB | ... unused hole
 ffffc90000000000 |  -55    TB | ffffe8ffffffffff |   32 TB | vmalloc/ioremap space (vmalloc_base)
 ffffe90000000000 |  -23    TB | ffffe9ffffffffff |    1 TB | ... unused hole
 ffffea0000000000 |  -22    TB | ffffeaffffffffff |    1 TB | virtual memory map (vmemmap_base)
 ffffeb0000000000 |  -21    TB | ffffebffffffffff |    1 TB | ... unused hole
 ffffec0000000000 |  -20    TB | fffffbffffffffff |   16 TB | KASAN shadow memory
__________________|____________|__________________|_________|____________________________________________________________
                                                            |
                                                            | Identical layout to the 56-bit one from here on:
____________________________________________________________|____________________________________________________________
                  |            |                  |         |
 fffffc0000000000 |   -4    TB | fffffdffffffffff |    2 TB | ... unused hole
                  |            |                  |         | vaddr_end for KASLR
 fffffe0000000000 |   -2    TB | fffffe7fffffffff |  0.5 TB | cpu_entry_area mapping
 fffffe8000000000 |   -1.5  TB | fffffeffffffffff |  0.5 TB | ... unused hole
 ffffff0000000000 |   -1    TB | ffffff7fffffffff |  0.5 TB | %esp fixup stacks
 ffffff8000000000 | -512    GB | ffffffeeffffffff |  444 GB | ... unused hole
 ffffffef00000000 |  -68    GB | fffffffeffffffff |   64 GB | EFI region mapping space
 ffffffff00000000 |   -4    GB | ffffffff7fffffff |    2 GB | ... unused hole
 ffffffff80000000 |   -2    GB | ffffffff9fffffff |  512 MB | kernel text mapping, mapped to physical address 0
 ffffffff80000000 |-2048    MB |                  |         |
 ffffffffa0000000 |-1536    MB | fffffffffeffffff | 1520 MB | module mapping space
 ffffffffff000000 |  -16    MB |                  |         |
    FIXADDR_START | ~-11    MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset
 ffffffffff600000 |  -10    MB | ffffffffff600fff |    4 kB | legacy vsyscall ABI
 ffffffffffe00000 |   -2    MB | ffffffffffffffff |    2 MB | ... unused hole
__________________|____________|__________________|_________|___________________________________________________________

29.2.2. 具有 5 級頁表的完整虛擬記憶體對映

注意

  • 使用 56 位地址,使用者空間記憶體擴大了 512 倍,從 0.125 PB 增加到 64 PB。 所有核心對映都向下移動到 -64 PB 的起始偏移量,並且許多區域都會擴充套件以支援更大的物理記憶體。

========================================================================================================================
    Start addr    |   Offset   |     End addr     |  Size   | VM area description
========================================================================================================================
                  |            |                  |         |
 0000000000000000 |    0       | 00fffffffffff000 |  ~64 PB | user-space virtual memory, different per mm
 00fffffffffff000 |  ~64    PB | 00ffffffffffffff |    4 kB | ... guard hole
__________________|____________|__________________|_________|___________________________________________________________
                  |            |                  |         |
 0100000000000000 |  +64    PB | 7fffffffffffffff |   ~8 EB | ... huge, almost 63 bits wide hole of non-canonical
                  |            |                  |         |     virtual memory addresses up to the -8EB TB
                  |            |                  |         |     starting offset of kernel mappings.
                  |            |                  |         |
                  |            |                  |         | LAM relaxes canonicallity check allowing to create aliases
                  |            |                  |         | for userspace memory here.
__________________|____________|__________________|_________|___________________________________________________________
                                                            |
                                                            | Kernel-space virtual memory, shared between all processes:
____________________________________________________________|___________________________________________________________
 8000000000000000 |   -8    EB | feffffffffffffff |   ~8 EB | ... huge, almost 63 bits wide hole of non-canonical
                  |            |                  |         |     virtual memory addresses up to the -64 PB
                  |            |                  |         |     starting offset of kernel mappings.
                  |            |                  |         |
                  |            |                  |         | LAM_SUP relaxes canonicallity check allowing to create
                  |            |                  |         | aliases for kernel memory here.
____________________________________________________________|___________________________________________________________
                  |            |                  |         |
 ff00000000000000 |  -64    PB | ff0fffffffffffff |    4 PB | ... guard hole, also reserved for hypervisor
 ff10000000000000 |  -60    PB | ff10ffffffffffff | 0.25 PB | LDT remap for PTI
 ff11000000000000 |  -59.75 PB | ff90ffffffffffff |   32 PB | direct mapping of all physical memory (page_offset_base)
 ff91000000000000 |  -27.75 PB | ff9fffffffffffff | 3.75 PB | ... unused hole
 ffa0000000000000 |  -24    PB | ffd1ffffffffffff | 12.5 PB | vmalloc/ioremap space (vmalloc_base)
 ffd2000000000000 |  -11.5  PB | ffd3ffffffffffff |  0.5 PB | ... unused hole
 ffd4000000000000 |  -11    PB | ffd5ffffffffffff |  0.5 PB | virtual memory map (vmemmap_base)
 ffd6000000000000 |  -10.5  PB | ffdeffffffffffff | 2.25 PB | ... unused hole
 ffdf000000000000 |   -8.25 PB | fffffbffffffffff |   ~8 PB | KASAN shadow memory
__________________|____________|__________________|_________|____________________________________________________________
                                                            |
                                                            | Identical layout to the 47-bit one from here on:
____________________________________________________________|____________________________________________________________
                  |            |                  |         |
 fffffc0000000000 |   -4    TB | fffffdffffffffff |    2 TB | ... unused hole
                  |            |                  |         | vaddr_end for KASLR
 fffffe0000000000 |   -2    TB | fffffe7fffffffff |  0.5 TB | cpu_entry_area mapping
 fffffe8000000000 |   -1.5  TB | fffffeffffffffff |  0.5 TB | ... unused hole
 ffffff0000000000 |   -1    TB | ffffff7fffffffff |  0.5 TB | %esp fixup stacks
 ffffff8000000000 | -512    GB | ffffffeeffffffff |  444 GB | ... unused hole
 ffffffef00000000 |  -68    GB | fffffffeffffffff |   64 GB | EFI region mapping space
 ffffffff00000000 |   -4    GB | ffffffff7fffffff |    2 GB | ... unused hole
 ffffffff80000000 |   -2    GB | ffffffff9fffffff |  512 MB | kernel text mapping, mapped to physical address 0
 ffffffff80000000 |-2048    MB |                  |         |
 ffffffffa0000000 |-1536    MB | fffffffffeffffff | 1520 MB | module mapping space
 ffffffffff000000 |  -16    MB |                  |         |
    FIXADDR_START | ~-11    MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset
 ffffffffff600000 |  -10    MB | ffffffffff600fff |    4 kB | legacy vsyscall ABI
 ffffffffffe00000 |   -2    MB | ffffffffffffffff |    2 MB | ... unused hole
__________________|____________|__________________|_________|___________________________________________________________

架構定義了一個 64 位虛擬地址。 實現可以支援更少的位數。 目前支援 48 位和 57 位虛擬地址。 位 63 到最高有效實現位是符號擴充套件的。 如果將它們解釋為無符號數,這會導致使用者空間和核心地址之間出現漏洞。

直接對映覆蓋系統中所有記憶體,直到最高記憶體地址(這意味著在某些情況下它也可以包括 PCI 記憶體空洞)。

我們在 ‘efi_pgd’ PGD 中對映 EFI 執行時服務,位於 64GB 的大虛擬記憶體視窗中(此大小是任意的,如果需要,以後可以增加)。 這些對映不屬於任何其他核心 PGD,僅在 EFI 執行時呼叫期間可用。

請注意,如果啟用了 CONFIG_RANDOMIZE_MEMORY,則所有物理記憶體、vmalloc/ioremap 空間和虛擬記憶體對映的直接對映都會被隨機化。 它們的順序被保留,但它們的基礎將在啟動時提前偏移。

在更改此處的任何內容時,請非常小心 KASLR。 KASLR 地址範圍不得與任何內容重疊,除非 KASAN 影子區域,這是正確的,因為 KASAN 停用了 KASLR。

對於 4 級和 5 級佈局,最後 2MB 漏洞中的 STACKLEAK_POISON 值為:ffffffffffff4111