2022-10-14
Xeon E5-2600 v2 Server Microprocessor: Uncore Performance Monitoring
UNCORE PER-SOCKET PERFORMANCE MONITORING CONTROL
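To manage the large number of counter registers distributed across many units and to collect event data efficiently, this section describes the hierarchical technique for starting, stopping, and restarting event counting that a software agent may need to perform during a monitoring session.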
Counter Overflow
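If a counter in a box overflows, the box can send an overflow message to the global PMON manager (the UBox). To do so, the box with the overflowing counter must be allowed to broadcast the overflow message (.ov_en in the individual counter's control register must be set to 1). The overflow is then picked up, and the box that sent it is recorded in the UBox. Each box in the uncore that has performance monitors can be configured to respond to this overflow with the following two basic actions: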
Freezing on Counter Overflow
PMI on Counter Overflow
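After receiving an overflow message, the UBox can send a PMI signal to the core executing the monitoring software. To do so, the U_MSR_PMON_GLOBAL_CTL.pmi_core_sel field must be set to point to the core on which the monitoring software will execute.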
Setting up a Monitoring Session
At hardware reset, all counters are disabled. Enabling is hierarchical, so the following steps must be taken to set up a monitoring session, including programming the event control registers and allowing the counters to begin collecting events. Section 2.1.3 covers the steps to stop and restart counter registers during a monitoring session.

Global settings in the UBox:
a) Freeze all the uncore counters by setting U_MSR_PMON_GLOBAL_CTL.frz_all to 1.
OR (if box-level freeze control is preferred):
a) Freeze the box's counters while setting up the monitoring session, e.g. set Cn_MSR_PMON_BOX_CTL.frz to 1.

For each event to be measured within each box:
b) Enable counting for each monitor, e.g. set C0_MSR_PMON_CTL2.en to 1.
c) Select the event to monitor if the event control register hasn't been programmed: program the .ev_sel and .umask bits in the control register with the encodings necessary to capture the requested event, along with any signal conditioning bits (.thresh/.edge_det) used to qualify the event, e.g. set C0_MSR_PMON_CTL2.{ev_sel, umask} to {0x03, 0x1} to capture LLC_VICTIMS.M_STATE in CBo 0's C0_MSR_PMON_CTR2.

Back to the box level:
d) Reset the counters in each box so that no stale values remain from previous sessions. Resetting the control registers, particularly those that won't be used, is also recommended, if for no other reason than to prevent errant overflows. To reset both the counters and control registers, write the following registers:
• For each CBo, set Cn_MSR_PMON_BOX_CTL[1:0] to 0x3.
• Set HA_PCI_PMON_BOX_CTL[1:0] to 0x3.
• For each Intel® QPI Port, set Q_Py_PCI_PMON_BOX_CTL[1:0] to 0x3.
• For each DRAM Channel, set MC_CHy_PCI_PMON_BOX_CTL[1:0] to 0x3.
• Set PCU_MSR_PMON_BOX_CTL[1:0] to 0x3.
• For each Link, set R3_Ly_PCI_PMON_BOX_CTL[1:0] to 0x3.
• Set R2_PCI_PMON_BOX_CTL[1:0] to 0x3.

Monitoring:
e) Select how to gather data. If polling, skip to f). If sampling: to set up a sample interval, software can pre-program the data register with a value of 2^N - (sample interval length), where N is the register bit width (up to 48). Doing so allows software, through the PMI mechanism, to be notified when the number of events in the sample has been captured.
Capturing a performance monitoring sample every 'X cycles' (the fixed counter in the UBox counts uncore clock cycles) is a common use of this mechanism. For example, to stop counting and receive notification when the 1,000,000th idle flit is transmitted from QPI on Port 0:
• Set Q_P0_PCI_PMON_CTR1 to (2^48 - 1,000,000).
• Set Q_P0_PCI_PMON_CTL1.ev_sel to 0x0.
• Set Q_P0_PCI_PMON_CTL1.umask to 0x1.
• Set U_MSR_PMON_GLOBAL_CTL.pmi_core_sel to the core on which the monitoring thread is executing.
f) Enable counting at the global level by setting the U_MSR_PMON_GLOBAL_CTL.unfrz_all bit to 1.
OR
f) Enable counting at the box level by unfreezing the counters in each box, e.g. set Cn_MSR_PMON_BOX_CTL.frz to 0.
And with that, counting will begin.
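The ordering of steps a) through f) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: FakeRegs and its dotted "register.field" write interface are hypothetical (real access goes through MSRs or PCI configuration space with bit masks), the sketch assumes the QPI control register has the same .en field that the text shows for the CBo, and the encoding of pmi_core_sel is not given here, so the value is passed through untouched.

```python
COUNTER_WIDTH = 48  # the text gives counter widths of up to 48 bits


def preload_value(sample_interval, width=COUNTER_WIDTH):
    """Return the value that makes a counter overflow after sample_interval events."""
    if not 0 < sample_interval < (1 << width):
        raise ValueError("sample interval must fit in the counter width")
    return (1 << width) - sample_interval


class FakeRegs:
    """Hypothetical register backend: it only records writes in order."""

    def __init__(self):
        self.log = []

    def write(self, field, value):
        self.log.append((field, value))


def setup_qpi_port0_sample(regs, sample_interval, pmi_core):
    """Issue the writes for one sampled event on QPI port 0, counter 1."""
    regs.write("U_MSR_PMON_GLOBAL_CTL.frz_all", 1)        # a) freeze everything
    regs.write("Q_P0_PCI_PMON_CTL1.en", 1)                # b) enable the monitor (assumed field)
    regs.write("Q_P0_PCI_PMON_CTL1.ov_en", 1)             #    allow the overflow message
    regs.write("Q_P0_PCI_PMON_CTL1.ev_sel", 0x0)          # c) select the event
    regs.write("Q_P0_PCI_PMON_CTL1.umask", 0x1)
    regs.write("Q_P0_PCI_PMON_BOX_CTL.rst_ctrs", 1)       # d) clear stale counts
    regs.write("Q_P0_PCI_PMON_CTR1", preload_value(sample_interval))  # e) arm the sample
    regs.write("U_MSR_PMON_GLOBAL_CTL.pmi_core_sel", pmi_core)        #    encoding not specified here
    regs.write("U_MSR_PMON_GLOBAL_CTL.unfrz_all", 1)      # f) start counting


regs = FakeRegs()
setup_qpi_port0_sample(regs, sample_interval=1000, pmi_core=0)
for field, value in regs.log:
    print(f"{field} <- {value:#x}")
```

The sketch resets only the counters in step d) rather than writing 0x3, so the control register programmed in steps b) and c) is left intact; a session that also resets control registers would need to do so before programming them.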
Reading the Sample Interval
Software can poll the counters whenever it chooses, or wait to be notified that a counter has overflowed (by receiving a PMI).
a) Polling: before reading, it is recommended that software freeze the counters in each box with active counters (by setting PMON_BOX_CTL.frz to 1). After reading the event counts from the counter registers, the monitoring agent can either reset the event counts to avoid event-count wrap-around, or resume the counters without resetting their values. The latter choice requires the monitoring agent to check and adjust for potential wrap-around.
b) Frozen counters: if software set the counters to freeze on overflow and send notification when that happens, the next question is: who caused the freeze? Overflow bits are stored hierarchically within the uncore. First, software should read the U_MSR_PMON_GLOBAL_STATUS.ov bits to determine which box(es) sent an overflow. Then read each such box's *_PMON_BOX_STATUS.ov field to find the overflowing counter.
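Both cases can be illustrated with a short Python sketch. It assumes a 48-bit counter that wraps at most once between two reads, and it assumes that bit i of a box's overflow field corresponds to counter i. The function names and the box labels are hypothetical, and decoding U_MSR_PMON_GLOBAL_STATUS into box names is left out because its bit layout is not given here.

```python
COUNTER_WIDTH = 48  # assumed counter width in bits


def counter_delta(previous, current, width=COUNTER_WIDTH):
    """Events counted between two reads, correct if the counter wrapped at most once."""
    return (current - previous) & ((1 << width) - 1)


def find_overflows(flagged_boxes, box_ov_fields):
    """Walk the two-level hierarchy and list (box, counter index) pairs.

    flagged_boxes: boxes reported by the global status register.
    box_ov_fields: box name -> integer value of that box's overflow field.
    """
    hits = []
    for box in flagged_boxes:
        ov = box_ov_fields[box]
        counter = 0
        while ov:
            if ov & 1:
                hits.append((box, counter))
            ov >>= 1
            counter += 1
    return hits


# Polling without a reset: the second read is numerically smaller after a wrap.
print(counter_delta(2**48 - 10, 5))

# Frozen counters: only boxes flagged at the global level are inspected.
print(find_overflows(["QPI_P1"], {"QPI_P1": 0b1000, "CBo0": 0b0000}))
```

Masking the difference to the counter width is what makes the wrap-around adjustment work; if a counter could wrap more than once between reads, the polling period would have to be shortened instead.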
Enabling a New Sample Interval from Frozen Counters
a) Clear all uncore counters: for each box in which counting occurred, set PMON_BOX_CTL.rst_ctrs to 1.
b) Clear all overflow bits. This includes clearing U_MSR_PMON_GLOBAL_STATUS.ov as well as any *_BOX_STATUS registers that have their overflow bits set. For example, if counter 3 in QPI Port 1 overflowed, software should set Q_P1_PCI_PMON_BOX_STATUS.ov[3] to 1 to clear the overflow bit.
c) Create the next sample: reinitialize the sample by setting the monitoring data register to (2^48 - sample_interval), or set up a new sample interval as outlined in Section 2.1.2, "Setting up a Monitoring Session".
d) Re-enable counting: set U_MSR_PMON_GLOBAL_CTL.unfrz_all to 1.
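The re-arm cycle can be tried out on a toy model. The sketch below is a simulation, not hardware access: SimCounter is a hypothetical class that models a single 48-bit counter configured to freeze itself on overflow, and rearm follows steps a) through d) on that model.

```python
WIDTH = 48  # assumed counter width in bits
MASK = (1 << WIDTH) - 1


class SimCounter:
    """Toy model of one counter that freezes itself when it overflows."""

    def __init__(self):
        self.value = 0
        self.overflow = False  # stands in for this counter's bit in a box status register
        self.frozen = True     # counters do not count until they are unfrozen

    def tick(self):
        """Count one event unless the counter is frozen."""
        if self.frozen:
            return
        self.value = (self.value + 1) & MASK
        if self.value == 0:    # wrapped past 2^48 - 1
            self.overflow = True
            self.frozen = True


def rearm(counter, sample_interval):
    """Steps a) to d) applied to the model."""
    counter.value = 0                               # a) clear the counter
    counter.overflow = False                        # b) clear the overflow bit
    counter.value = (1 << WIDTH) - sample_interval  # c) preload the next sample
    counter.frozen = False                          # d) re-enable counting


def events_until_freeze(counter, limit):
    """Feed events until the counter freezes or limit is reached; return the count."""
    n = 0
    while not counter.frozen and n < limit:
        counter.tick()
        n += 1
    return n


c = SimCounter()
rearm(c, 3)
first = events_until_freeze(c, 100)
rearm(c, 5)
second = events_until_freeze(c, 100)
print(first, second)  # prints: 3 5
```

Each sample ends exactly when the preloaded counter wraps to zero, which is why the preload is the counter range minus the desired interval.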
Global Performance Monitors
Global PMON Global Control/Status Registers
The following registers represent state governing all PMUs in the uncore, both to exert global control and collect box-level information. U_MSR_PMON_GLOBAL_CTL contains a bit that can freeze (.frz_all) all the uncore counters. If an overflow is detected in any of the uncore’s PMON registers, it will be summarized in U_MSR_PMON_GLOBAL_STATUS. This register accumulates overflows sent to it from the other uncore boxes. To reset these overflow bits, a user must set the corresponding bits in U_MSR_PMON_GLOBAL_STATUS to 1, which will act to clear them.
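The write-1-to-clear behavior described for the overflow bits can be modeled in a few lines. This is a minimal sketch of the semantics only: W1CStatus is a hypothetical class, and the 32-bit width and the bit positions used in the example are assumptions, not the actual layout of U_MSR_PMON_GLOBAL_STATUS.

```python
class W1CStatus:
    """Toy write-1-to-clear status register."""

    def __init__(self, width=32):  # width is an assumption for the model
        self._mask = (1 << width) - 1
        self._bits = 0

    def record_overflow(self, bit):
        """Hardware side: accumulate an overflow reported by a box."""
        self._bits |= (1 << bit) & self._mask

    def read(self):
        return self._bits

    def write(self, value):
        """Software side: every 1 written clears that bit, every 0 leaves it alone."""
        self._bits &= ~value & self._mask


status = W1CStatus()
status.record_overflow(1)
status.record_overflow(4)
status.write(1 << 1)            # clear only bit 1
print(f"{status.read():#b}")    # bit 4 is still set
status.write(0)                 # writing zeros changes nothing
print(f"{status.read():#b}")
```

A practical consequence is that software can write back exactly the value it just read to acknowledge those overflows without disturbing bits that were set in the meantime.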