9.6. PCI NTB Function

Author:

Kishon Vijay Abraham I <kishon@ti.com>

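PCI Non-Transparent Bridges (NTB) allow two host systems to communicate with each other by exposing each host as a device to the other host. NTBs typically support generating interrupts on the remote machine, exposing memory ranges as BARs, and performing DMA. They also support scratchpads, which are areas of memory within the NTB that are accessible from both machines.

The PCI NTB function allows two different systems (or hosts) to communicate with each other by configuring the endpoint instances so that transactions from one system are routed to the other system.

In the diagram below, the PCI NTB function configures the SoC with multiple PCI Endpoint (EP) instances so that transactions from one EP controller are routed to the other EP controller. Once the PCI NTB function has configured the SoC with multiple EP instances, HOST1 and HOST2 can communicate with each other using the SoC as a bridge.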

   +-------------+                                   +-------------+
   |             |                                   |             |
   |    HOST1    |                                   |    HOST2    |
   |             |                                   |             |
   +------^------+                                   +------^------+
          |                                                 |
          |                                                 |
+---------|-------------------------------------------------|---------+
|  +------v------+                                   +------v------+  |
|  |             |                                   |             |  |
|  |     EP      |                                   |     EP      |  |
|  | CONTROLLER1 |                                   | CONTROLLER2 |  |
|  |             <----------------------------------->             |  |
|  |             |                                   |             |  |
|  |             |                                   |             |  |
|  |             |  SoC With Multiple EP Instances   |             |  |
|  |             |  (Configured using NTB Function)  |             |  |
|  +-------------+                                   +-------------+  |
+---------------------------------------------------------------------+

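9.6.1. Constructs used for Implementing NTB

  1. Config Region

  2. Self Scratchpad Registers

  3. Peer Scratchpad Registers

  4. Doorbell (DB) Registers

  5. Memory Window (MW)

9.6.1.1. Config Region:

The config region is a construct specific to NTB as implemented by the NTB endpoint function driver. The host-side and endpoint-side NTB function drivers exchange information with each other through this region. The config region contains control/status registers for configuring the endpoint controller. The host can write to this region to configure the outbound Address Translation Unit (ATU) and to indicate the link status. The endpoint can use this region to report the status of commands issued by the host. The endpoint can also use this region to tell the host the scratchpad offset and the number of memory windows.

The format of the config region is shown below. All fields are 32 bits wide.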

      +------------------------+
      |         COMMAND        |
      +------------------------+
      |         ARGUMENT       |
      +------------------------+
      |         STATUS         |
      +------------------------+
      |         TOPOLOGY       |
      +------------------------+
      |    ADDRESS (LOWER 32)  |
      +------------------------+
      |    ADDRESS (UPPER 32)  |
      +------------------------+
      |           SIZE         |
      +------------------------+
      |   NO OF MEMORY WINDOW  |
      +------------------------+
      |  MEMORY WINDOW1 OFFSET |
      +------------------------+
      |       SPAD OFFSET      |
      +------------------------+
      |        SPAD COUNT      |
      +------------------------+
      |      DB ENTRY SIZE     |
      +------------------------+
      |         DB DATA        |
      +------------------------+
      |            :           |
      +------------------------+
      |            :           |
      +------------------------+
      |         DB DATA        |
      +------------------------+
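The layout above can be mirrored by a C structure. The sketch below is illustrative only: the structure name `ntb_config_region`, its field names, the `NTB_MAX_DB_COUNT` macro, and the `ntb_db_data_offset()` helper are hypothetical, and the count of 32 DB DATA registers is taken from the DB DATA description in this section. An actual driver may lay the region out differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of DB DATA registers; the text states EPF NTB supports 32 interrupts. */
#define NTB_MAX_DB_COUNT 32

/*
 * Hypothetical mirror of the config region diagram. Every field is 32 bits
 * wide, so the structure has no padding on common ABIs.
 */
struct ntb_config_region {
	uint32_t command;
	uint32_t argument;
	uint32_t status;
	uint32_t topology;
	uint32_t addr_lower;     /* ADDRESS (LOWER 32) */
	uint32_t addr_upper;     /* ADDRESS (UPPER 32) */
	uint32_t size;
	uint32_t num_mws;        /* NO OF MEMORY WINDOW */
	uint32_t mw1_offset;     /* MEMORY WINDOW1 OFFSET */
	uint32_t spad_offset;
	uint32_t spad_count;
	uint32_t db_entry_size;
	uint32_t db_data[NTB_MAX_DB_COUNT];
};

/* Byte offset of the DB DATA register for doorbell 'index'. */
static inline size_t ntb_db_data_offset(unsigned int index)
{
	return offsetof(struct ntb_config_region, db_data) +
	       (size_t)index * sizeof(uint32_t);
}
```

With this layout, the twelve scalar fields occupy the first 48 bytes and the DB DATA array follows immediately after them.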


COMMAND:

      The NTB function supports three commands:

        CMD_CONFIGURE_DOORBELL (0x1): Command to configure the doorbell.
      Before invoking this command, the host should allocate and initialize
      MSI/MSI-X vectors (i.e., initialize the MSI/MSI-X Capability in the
      Endpoint). On receiving this command, the endpoint configures the
      outbound ATU such that transactions to the Doorbell BAR are routed
      to the MSI/MSI-X address programmed by the host. The ARGUMENT
      register should be populated with the number of DBs to configure (in
      the lower 16 bits) and whether MSI or MSI-X should be configured
      (BIT 16).

        CMD_CONFIGURE_MW (0x2): Command to configure a memory window (MW).
      The host invokes this command after allocating a buffer that can be
      accessed by the remote host. The allocated address should be
      programmed in the ADDRESS register (64-bit), the size should be
      programmed in the SIZE register, and the memory window index should
      be programmed in the ARGUMENT register. On receiving this command,
      the endpoint configures the outbound ATU such that transactions to
      the MW BAR are routed to the address provided by the host.

        CMD_LINK_UP (0x3): Command to indicate that an NTB application is
      bound to the EP device on the host side. Once the endpoint has
      received this command from both hosts, it raises a LINK_UP event
      to both hosts to indicate that the host NTB applications can start
      communicating with each other.

ARGUMENT:

      The value of this register is based on the commands issued in
      command register. See COMMAND section for more information.
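As an illustration of the ARGUMENT encoding for CMD_CONFIGURE_DOORBELL, the following minimal sketch packs the doorbell count into the lower 16 bits and uses BIT 16 as the MSI/MSI-X selector. The command values come from the COMMAND section; the macro names `NTB_ARG_DB_COUNT_MASK` and `NTB_ARG_MSIX` and both helper functions are hypothetical, and the text does not say which value of BIT 16 selects which mode, so "set means MSI-X" is an assumption.

```c
#include <stdint.h>

/* Command codes as listed in the COMMAND section. */
#define CMD_CONFIGURE_DOORBELL 0x1u
#define CMD_CONFIGURE_MW       0x2u
#define CMD_LINK_UP            0x3u

/* ARGUMENT layout for CMD_CONFIGURE_DOORBELL. */
#define NTB_ARG_DB_COUNT_MASK 0xffffu     /* lower 16 bits: number of DBs */
#define NTB_ARG_MSIX          (1u << 16)  /* ASSUMPTION: set = MSI-X, clear = MSI */

/* Build the ARGUMENT value the host writes before issuing the command. */
static inline uint32_t ntb_doorbell_argument(uint16_t db_count, int use_msix)
{
	uint32_t arg = db_count & NTB_ARG_DB_COUNT_MASK;

	if (use_msix)
		arg |= NTB_ARG_MSIX;
	return arg;
}

/* Extract the doorbell count, as the endpoint side would when decoding. */
static inline uint16_t ntb_argument_db_count(uint32_t argument)
{
	return (uint16_t)(argument & NTB_ARG_DB_COUNT_MASK);
}
```

For example, requesting 32 doorbells with the selector bit set yields an ARGUMENT value of 0x10020 under these assumptions.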

TOPOLOGY:

      Set to NTB_TOPO_B2B_USD for Primary interface
      Set to NTB_TOPO_B2B_DSD for Secondary interface

ADDRESS/SIZE:

      Address and Size to be used while configuring the memory window.
      See "CMD_CONFIGURE_MW" for more info.

MEMORY WINDOW1 OFFSET:

      Memory Window 1 and Doorbell registers are packed together in the
      same BAR. The initial portion of the region will have doorbell
      registers and the latter portion of the region is for memory window 1.
      This register will specify the offset of the memory window 1.

NO OF MEMORY WINDOW:

      Specifies the number of memory windows supported by the NTB device.

SPAD OFFSET:

      Self scratchpad region and config region are packed together in the
      same BAR. The initial portion of the region will have config region
      and the latter portion of the region is for self scratchpad. This
      register will specify the offset of the self scratchpad registers.

SPAD COUNT:

      Specifies the number of scratchpad registers supported by the NTB
      device.

DB ENTRY SIZE:

      Used to determine the offset within the DB BAR that should be written
      in order to raise a doorbell. EPF NTB can use either MSI or MSI-X to
      ring the doorbell (MSI-X support will be added later). MSI uses the
      same address for all the interrupts, while MSI-X can provide different
      addresses for different interrupts. The MSI/MSI-X address is provided
      by the host, and the address it gives depends on the MSI/MSI-X
      implementation supported by the host. For instance, an ARM platform
      using GIC ITS will have the same MSI-X address for all the interrupts.
      In order to support all the combinations and use the same mechanism
      for both MSI and MSI-X, EPF NTB allocates a separate region in the
      Outbound Address Space for each of the interrupts. This region will
      be mapped to the MSI/MSI-X address provided by the host. If a host
      provides the same address for all the interrupts, all the regions
      will be translated to the same address. If a host provides different
      addresses, the regions will be translated to different addresses. This
      ensures that the doorbell is raised the same way in both cases.

DB DATA:

      EPF NTB supports 32 interrupts, so there are 32 DB DATA registers.
      This holds the MSI/MSI-X data that has to be written to MSI address
      for raising doorbell interrupt. This will be populated by EPF NTB
      while invoking CMD_CONFIGURE_DOORBELL.
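The way DB ENTRY SIZE and DB DATA combine can be sketched as follows. This is a user-space illustration, not driver code: the helper names are hypothetical, `db_bar` is an ordinary buffer standing in for the mapped doorbell BAR, and the offset formula (doorbell index multiplied by DB ENTRY SIZE) is an assumption based on the description of one outbound region per interrupt.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ASSUMPTION: doorbell 'index' occupies one DB ENTRY SIZE slot in the DB BAR. */
static inline size_t ntb_db_offset(uint32_t db_entry_size, unsigned int index)
{
	return (size_t)db_entry_size * index;
}

/*
 * Ring doorbell 'index' by storing its DB DATA value at the matching slot.
 * 'db_bar' stands in for the mapped doorbell BAR; a real driver would use
 * an MMIO write accessor instead of memcpy().
 */
static inline void ntb_ring_doorbell(uint8_t *db_bar, uint32_t db_entry_size,
				     const uint32_t *db_data, unsigned int index)
{
	memcpy(db_bar + ntb_db_offset(db_entry_size, index),
	       &db_data[index], sizeof(db_data[index]));
}
```

Because every doorbell has its own slot, the writer does not need to know whether the host supplied one shared MSI address or distinct MSI-X addresses; the outbound translation decides where each write lands.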

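9.6.1.2. Scratchpad Registers:

Each host has its own scratchpad register space allocated in the memory of the NTB endpoint controller. The scratchpad registers are readable and writable from both sides of the bridge. They are used by applications built on top of NTB and can be used to pass control and status information between the two sides of the device.

Scratchpad registers have two parts:
  1. Self Scratchpad: the host's own register space

  2. Peer Scratchpad: the remote host's register space.

9.6.1.3. Doorbell Registers:

Doorbell registers are used by the hosts to interrupt each other.

9.6.1.4. Memory Window:

The actual transfer of data between the two hosts happens through the memory window.

9.6.2. Modeling Constructs:

Five or more distinct regions (config, self scratchpad, peer scratchpad, doorbell, and one or more memory windows) have to be modeled to achieve NTB functionality. At least one memory window is required, and more than one is permitted. All these regions should be mapped to BARs so that the hosts can access them.

If one 32-bit BAR is allocated for each of these regions, the scheme would look like this: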

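+--------+-----------------+
| BAR NO | CONSTRUCTS USED |
+--------+-----------------+
| BAR0   | Config Region   |
| BAR1   | Self Scratchpad |
| BAR2   | Peer Scratchpad |
| BAR3   | Doorbell        |
| BAR4   | Memory Window 1 |
| BAR5   | Memory Window 2 |
+--------+-----------------+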

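However, if a separate BAR is allocated for each region, there would not be enough BARs for all the regions on a platform that supports only 64-bit BARs, because each 64-bit BAR occupies two of the six BAR registers and leaves at most three usable BARs.

To be supported by most platforms, the regions should be packed and mapped to BARs in a way that provides NTB functionality and also ensures that a host cannot access any region it is not supposed to.

The following scheme is used in the EPF NTB function: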

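+--------+---------------------------------+
| BAR NO | CONSTRUCTS USED                 |
+--------+---------------------------------+
| BAR0   | Config Region + Self Scratchpad |
| BAR1   | Peer Scratchpad                 |
| BAR2   | Doorbell + Memory Window 1      |
| BAR3   | Memory Window 2                 |
| BAR4   | Memory Window 3                 |
| BAR5   | Memory Window 4                 |
+--------+---------------------------------+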

With this scheme, 3 BARs should be sufficient for basic NTB functionality.
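The BAR budget behind this choice can be checked with a few lines of arithmetic. The sketch below is a hypothetical helper set, not part of any driver: it counts the BARs each scheme needs for a given number of memory windows (assumed to be at least one) and compares that with the BARs a platform can offer, using the fact that a 64-bit BAR consumes two of the six BAR registers.

```c
#include <stdbool.h>

#define PCI_BAR_SLOTS 6 /* BAR registers in a type 0 PCI configuration header */

/* BARs available to the function: a 64-bit BAR occupies two slots. */
static inline unsigned int ntb_usable_bars(bool only_64bit)
{
	return only_64bit ? PCI_BAR_SLOTS / 2 : PCI_BAR_SLOTS;
}

/* One BAR per region: config, self scratchpad, peer scratchpad, doorbell, MWs. */
static inline unsigned int ntb_bars_unpacked(unsigned int num_mws)
{
	return 4 + num_mws;
}

/*
 * Packed scheme: config + self scratchpad, peer scratchpad, doorbell + MW1,
 * then one BAR for each additional memory window.
 */
static inline unsigned int ntb_bars_packed(unsigned int num_mws)
{
	return 2 + num_mws;
}
```

With a single memory window, the one-region-per-BAR layout needs five BARs, which exceeds the three available on a 64-bit-only platform, while the packed layout needs exactly three.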

9.6.2.1. Modeling Config/Scratchpad Region:

+-----------------+------->+------------------+        +-----------------+
|       BAR0      |        |  CONFIG REGION   |        |       BAR0      |
+-----------------+----+   +------------------+<-------+-----------------+
|       BAR1      |    |   |SCRATCHPAD REGION |        |       BAR1      |
+-----------------+    +-->+------------------+<-------+-----------------+
|       BAR2      |            Local Memory            |       BAR2      |
+-----------------+                                    +-----------------+
|       BAR3      |                                    |       BAR3      |
+-----------------+                                    +-----------------+
|       BAR4      |                                    |       BAR4      |
+-----------------+                                    +-----------------+
|       BAR5      |                                    |       BAR5      |
+-----------------+                                    +-----------------+
  EP CONTROLLER 1                                        EP CONTROLLER 2

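The diagram above shows the config region + scratchpad region for HOST1 (connected to EP controller 1) allocated in local memory. HOST1 can access the config region and the scratchpad region (self scratchpad) using BAR0 of EP controller 1. The peer host (HOST2, connected to EP controller 2) can also access this scratchpad region (peer scratchpad) using BAR1 of EP controller 2. The diagram shows the case where the config region and scratchpad region are allocated for HOST1, but the same applies to HOST2.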
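The two views of the same scratchpad memory can be expressed as a pair of offset calculations. This is a minimal sketch with hypothetical helper names. It assumes that each scratchpad register is 32 bits wide and that the peer's BAR1 maps the scratchpad region starting at offset 0; neither detail is stated in the text.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Self scratchpad 'idx' as seen through BAR0: the scratchpads follow the
 * config region, starting at the value read from SPAD OFFSET.
 * ASSUMPTION: each scratchpad register is 32 bits wide.
 */
static inline size_t ntb_self_spad_offset(uint32_t spad_offset, unsigned int idx)
{
	return (size_t)spad_offset + (size_t)idx * sizeof(uint32_t);
}

/* ASSUMPTION: through the peer's BAR1, the scratchpads start at offset 0. */
static inline size_t ntb_peer_spad_offset(unsigned int idx)
{
	return (size_t)idx * sizeof(uint32_t);
}
```

Under these assumptions, a value that HOST1 writes at the self offset in BAR0 of EP controller 1 is the value HOST2 reads at the peer offset in BAR1 of EP controller 2, because both offsets resolve to the same location in local memory.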

9.6.2.2. Modeling Doorbell/Memory Window 1:

+-----------------+    +----->+----------------+-----------+-----------------+
|       BAR0      |    |      |   Doorbell 1   +-----------> MSI-X ADDRESS 1 |
+-----------------+    |      +----------------+           +-----------------+
|       BAR1      |    |      |   Doorbell 2   +---------+ |                 |
+-----------------+----+      +----------------+         | |                 |
|       BAR2      |           |   Doorbell 3   +-------+ | +-----------------+
+-----------------+----+      +----------------+       | +-> MSI-X ADDRESS 2 |
|       BAR3      |    |      |   Doorbell 4   +-----+ |   +-----------------+
+-----------------+    |      +----------------+     | |   |                 |
|       BAR4      |    |      |                |     | |   +-----------------+
+-----------------+    |      |      MW1       +---+ | +-->+ MSI-X ADDRESS 3 |
|       BAR5      |    |      |                |   | |     +-----------------+
+-----------------+    +----->+----------------+   | |     |                 |
  EP CONTROLLER 1             |                |   | |     +-----------------+
                              |                |   | +---->+ MSI-X ADDRESS 4 |
                              +----------------+   |       +-----------------+
                               EP CONTROLLER 2     |       |                 |
                                 (OB SPACE)        |       |                 |
                                                   +------->      MW1        |
                                                           |                 |
                                                           |                 |
                                                           +-----------------+
                                                           |                 |
                                                           |                 |
                                                           |                 |
                                                           |                 |
                                                           |                 |
                                                           +-----------------+
                                                            PCI Address Space
                                                            (Managed by HOST2)

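The diagram above shows how the doorbell and memory window 1 are mapped so that HOST1 can raise a doorbell interrupt on HOST2, and how HOST1 can access buffers exposed by HOST2 using memory window 1 (MW1). Here the doorbell and memory window 1 regions are allocated in the outbound (OB) address space of EP controller 2. The BARs for the doorbell and memory window 1 are allocated and configured during the initialization phase of the NTB endpoint function driver. The mapping from the OB space of EP controller 2 to the PCI address space is done when HOST2 sends CMD_CONFIGURE_MW/CMD_CONFIGURE_DOORBELL.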
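The parameters a host supplies with CMD_CONFIGURE_MW can be sketched as below. The structure `ntb_mw_request` and the function `ntb_fill_mw_request()` are hypothetical and exist only to show how a 64-bit buffer address splits across the two 32-bit ADDRESS registers. The structure is a plain in-memory record, not a register map, and writing COMMAND last is an assumed ordering rather than a documented requirement.

```c
#include <stdint.h>

#define CMD_CONFIGURE_MW 0x2u /* value listed in the COMMAND section */

/* Hypothetical record of the values written for CMD_CONFIGURE_MW. */
struct ntb_mw_request {
	uint32_t command;
	uint32_t argument;    /* memory window index */
	uint32_t addr_lower;  /* ADDRESS (LOWER 32) */
	uint32_t addr_upper;  /* ADDRESS (UPPER 32) */
	uint32_t size;        /* SIZE */
};

/*
 * Fill in the parameters first and the command last, so that whoever
 * consumes the request sees complete arguments (assumed ordering).
 */
static inline void ntb_fill_mw_request(struct ntb_mw_request *req,
				       unsigned int mw_index,
				       uint64_t pci_addr, uint32_t size)
{
	req->addr_lower = (uint32_t)(pci_addr & 0xffffffffu);
	req->addr_upper = (uint32_t)(pci_addr >> 32);
	req->size = size;
	req->argument = mw_index;
	req->command = CMD_CONFIGURE_MW;
}
```

On receiving such a request, the endpoint would program its outbound ATU so that accesses to the corresponding MW BAR reach the buffer at `pci_addr` in the requesting host's PCI address space.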

9.6.2.3. Modeling Optional Memory Windows:

This is modeled the same way as MW1, but each additional memory window is mapped to its own BAR.