Non-Maskable Interrupt (NMI) or Peripheral Component Interconnect (PCI) error occurs when Intel Xeon Phi Coprocessor 5110P is Installed - IBM iDataPlex



Source

RETAIN tip: H21425

Symptom

The Intel Xeon Phi Coprocessor 5110P hangs after exiting the PC3 or DPC3 power state.

This behavior may last from a few hours to a few days. The system records the following Non-Maskable Interrupt (NMI) or Peripheral Component Interconnect (PCI) errors in the Integrated Management Module (IMM) log:

  148 05/12/2013 04:13:18 Group (not a physical entity) 130 (Slot/Connector - All PCI Error): Assertion: Fault Status asserted.

  149 05/12/2013 04:13:18 PCI Bus 4 (Slot/Connector - PCI 4): Assertion: Fault Status asserted.

  150 05/12/2013 04:13:20 System chassis 1 (Critical Interrupt - NMI State): Assertion: Software NMI.
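
To check whether a system has hit this symptom, the log entries above can be searched for in a saved text copy of the IMM event log. The following is a minimal sketch only: the function name and the file name 'imm_event_log.txt' are hypothetical, and it assumes the log has already been exported to a plain-text file in the format shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original tip): print the lines of a
# saved text export of the IMM event log that match the three event types
# quoted in this tip. The log file path is supplied by the caller.
find_phi_fault_events() {
    log_file="$1"
    grep -E 'All PCI Error|Slot/Connector - PCI [0-9]+|Critical Interrupt - NMI State' "$log_file"
}

# Example (assumed file name):
#   find_phi_fault_events imm_event_log.txt
```

A match on its own does not prove that the coprocessor is the cause. It only shows that events of the same type as those listed in this tip were logged.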

Affected configurations

The system may be any of the following IBM servers:

  • iDataPlex dx360 M4 2U chassis, type 7913, any model
  • iDataPlex dx360 M4 server, type 7912, any model

This tip is not software specific.

This tip is not option specific.

Solution

Avoid the hang by disabling the Package C3 (PC3) and Deep Package C3 (DPC3) power management states.

Edit the '/etc/sysconfig/mic/micN.conf' file for each coprocessor,

  where 'N' is the coprocessor number in the system (starting with 0).

For example, to disable Power Management (PM) on 'mic0', edit the 'PowerManagement' line of the '/etc/sysconfig/mic/mic0.conf' file as follows:

  PowerManagement "CPUfreq_on;corec6_off;pc3_off;pc6_off"
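
On systems with several coprocessors, the same edit has to be made in each 'micN.conf' file. The following is a minimal sketch of one way to script it, not an IBM or Intel supplied tool. The function name 'set_mic_pm_off' and the '.bak' backup copies are illustrative assumptions. The sketch only rewrites a 'PowerManagement' line that already exists and leaves files without one unchanged.

```shell
#!/bin/sh
# Sketch (assumption, not part of the original tip): write the
# PowerManagement setting from this tip into every micN.conf file found in
# a directory. The directory defaults to /etc/sysconfig/mic.
set_mic_pm_off() {
    conf_dir="${1:-/etc/sysconfig/mic}"
    pm_line='PowerManagement "CPUfreq_on;corec6_off;pc3_off;pc6_off"'

    for conf in "$conf_dir"/mic[0-9]*.conf; do
        # If nothing matches, the glob stays literal and is skipped here.
        [ -f "$conf" ] || continue

        # Keep a copy of the original file next to it.
        cp -p "$conf" "$conf.bak"

        # Replace any existing (uncommented) PowerManagement line.
        sed "s|^[[:space:]]*PowerManagement[[:space:]].*|$pm_line|" \
            "$conf.bak" > "$conf"
    done
}

# Example (run as root):
#   set_mic_pm_off /etc/sysconfig/mic
```

Review the resulting files before restarting MPSS. The original contents remain available in the '.bak' copies.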

Then restart the Intel(R) Many Integrated Core (MIC) Platform Software Stack (MPSS):

  # sudo service mpss unload
  # sudo service mpss start


Document id:  MIGR-5093210
Last modified:  2013-07-01
Copyright © 2014 IBM Corporation