
Power system and NUMA

Symmetric multiprocessing (SMP) architecture allows a system to scale beyond one
processor. Each processor is connected to the same bus (also known as a crossbar switch) to
access the main memory. This scaling is not unlimited, however: because every processor
shares the same memory bus, access to the main memory is serialized. With this limitation,
this kind of architecture scales only up to four to eight processors (depending on the
hardware).


The Non-Uniform Memory Access (NUMA) architecture is a way to partially solve the SMP
scalability issue by reducing pressure on the memory bus.
As opposed to the SMP system, NUMA adds the notion of multiple memory subsystems
called NUMA nodes:
  • Each node is composed of processors sharing the same bus to access memory (a node
    can be seen as an SMP system).
  • NUMA nodes are connected using a special “interlink bus” to provide processor data
    coherency across the entire system.
Each processor can access the entire memory of the system, but access to this
memory is not uniform (Figure 2-2 on page 11):
  • Access to memory located in the same node (local memory) is direct, with a very low
    latency.
  • Access to memory located in another node goes through the interlink bus, with a
    higher latency.
By limiting the number of processors that directly access the entire memory, performance is
improved compared to an SMP system because the queue of requests on each memory domain
is much shorter.
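
The non-uniform access cost described above can be observed on a running system. The following is a minimal sketch, assuming a Linux system that exposes its NUMA topology through sysfs under /sys/devices/system/node; the function name read_numa_distances is chosen here for illustration and is not part of the original text. The kernel reports each node's relative distance to every online node, where a node's distance to itself is the smallest value in its row.

```python
from pathlib import Path


def read_numa_distances(base="/sys/devices/system/node"):
    """Return {node_id: [relative distance to each online node, in node order]}.

    Assumes the Linux sysfs layout nodeN/distance, where the file holds
    one whitespace-separated row of integers.
    """
    distances = {}
    for node_dir in Path(base).glob("node[0-9]*"):
        distance_file = node_dir / "distance"
        if not distance_file.is_file():
            continue  # skip entries that do not describe a NUMA node
        node_id = int(node_dir.name[len("node"):])
        distances[node_id] = [int(x) for x in distance_file.read_text().split()]
    return distances
```

Calling read_numa_distances() with no argument reads the live system. On a machine with a single NUMA node the result contains only one row, and on a system without this sysfs directory it is empty.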




The architecture design of the Power platform is mostly NUMA with three levels:
  • Each POWER7 chip has its own memory DIMMs. Access to these DIMMs has a very low
    latency and is named local.
  • Up to four POWER7 chips can be connected to each other in the same CEC (or node) by
    using the X, Y, Z buses from POWER7. Access to memory owned by another POWER7 chip
    in the same CEC is called near or remote. Near or remote memory access has a higher
    latency than local memory access.
  • Up to eight CECs can be connected through the A, B buses from a POWER7 chip (only on
    high-end systems). Access to memory owned by another POWER7 chip in another CEC (or
    node) is called far or distant. Far or distant memory access has a higher latency than
    remote memory access.
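
The three levels above reduce to a simple rule: compare the CEC first, then the chip. The sketch below illustrates that rule only; the Location type and the classify_access function are hypothetical names introduced for this example, and the integer indexes are an assumed way of identifying CECs and chips.

```python
from typing import NamedTuple


class Location(NamedTuple):
    cec: int   # index of the CEC (node) that holds the chip
    chip: int  # index of the POWER7 chip within that CEC


def classify_access(cpu: Location, memory: Location) -> str:
    """Classify a memory access as 'local', 'near', or 'far'."""
    if cpu.cec != memory.cec:
        return "far"    # another CEC, reached through the A, B buses
    if cpu.chip != memory.chip:
        return "near"   # same CEC, other chip, reached through the X, Y, Z buses
    return "local"      # DIMMs attached to the chip itself
```

For example, classify_access(Location(0, 1), Location(0, 3)) returns "near", because both chips sit in CEC 0 but the memory belongs to a different chip.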



Summary: Power Systems can have up to three different memory access latencies
(Figure 2-3). The memory access time depends on the memory location relative to the
processor.
Latency (from lowest to highest): local < near or remote < far or distant.
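
To see why this ordering matters for placement, the average access time of a workload can be estimated as a weighted sum over the three levels. This is a rough sketch under stated assumptions: the function name average_latency is hypothetical, and the latency values in the usage note are made-up relative units for illustration, not measurements of any POWER7 system.

```python
import math


def average_latency(latency, fraction):
    """Weighted average latency over the access levels.

    latency:  {level: cost of one access}, e.g. keys 'local', 'near', 'far'
    fraction: {level: share of accesses at that level}; shares must sum to 1
    """
    if set(latency) != set(fraction):
        raise ValueError("latency and fraction must use the same levels")
    if not math.isclose(sum(fraction.values()), 1.0):
        raise ValueError("fractions must sum to 1")
    return sum(latency[level] * fraction[level] for level in latency)
```

With placeholder costs of 1.0, 2.0, and 3.0 for local, near, and far, a workload that makes half of its accesses locally and a quarter at each of the other levels averages 1.75, while a fully local workload averages 1.0. Keeping memory close to the processor that uses it lowers the result.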

Posted by lemon (Sales Management, 北京长得万众信息技术有限公司), IT distribution/resale, 2013-10-23
