IBM X3650 M4 reports a PCI slot error. What is the cause?

The PCI indicator on the operator panel is lit. The event log shows the following:
Index,Severity,Service State,Source,Date,Sequence #,Event ID,Message,AuxLog
1,E,3,System,06/19/2017 03:00:20.36,416,806F002130010902,"Fault in slot 2 on system System x3650 M4.",[S.2018001] An uncorrected PCIe error has occurred on Bus 0020 Device 00 Function 00. The Vendor ID for the device is 10DF and the Device ID is F0E5.
2,E,3,System,06/19/2017 03:00:19.943,415,806F002125820900,"Fault in slot All PCI Error on system System x3650 M4.",[S.68005] An error has been detected by the IIO core logic on Bus 00. The Global Fatal Error Status register contains 00000000. The Global Non-Fatal Error Status register contains 00004000. Please check error logs for the presence of additional downstream device error data.
3,E,3,System,06/19/2017 03:00:15.975,414,806F08132582FFFF,"An Uncorrectable Bus Error has occurred on bus PCIs.",
4,E,2,System,06/19/2017 03:00:15.842,413,806F03131701FFFF,"A software NMI has occurred on system System x3650 M4.",
5,E,3,System,06/13/2017 10:03:42.524,388,806F002130010902,"Fault in slot 2 on system System x3650 M4.",[S.2018001] An uncorrected PCIe error has occurred on Bus 0020 Device 00 Function 00. The Vendor ID for the device is 10DF and the Device ID is F0E5.
6,E,3,System,06/13/2017 10:03:42.414,387,806F002125820900,"Fault in slot All PCI Error on system System x3650 M4.",[S.68005] An error has been detected by the IIO core logic on Bus 00. The Global Fatal Error Status register contains 00000000. The Global Non-Fatal Error Status register contains 00004000. Please check error logs for the presence of additional downstream device error data.
7,E,3,System,06/13/2017 10:03:38.474,386,806F08132582FFFF,"An Uncorrectable Bus Error has occurred on bus PCIs.",
8,E,2,System,06/13/2017 10:03:38.248,385,806F03131701FFFF,"A software NMI has occurred on system System x3650 M4.",
9,E,2,Power,06/13/2017 09:10:39.587,368,800B01081301FFFF,"Redundancy Lost for Power Unit has asserted.",
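
The log is plain CSV, so the failing PCIe device can be pulled out of the AuxLog column mechanically. Below is a minimal Python sketch, assuming the export keeps the header row shown above and that the AuxLog wording matches entry 1. The function name `pcie_faults` and the regular expression are illustrative, not part of any IBM tool.

```python
import csv
import io
import re

# Header plus entries 1 and 3, copied from the log above.
LOG = '''Index,Severity,Service State,Source,Date,Sequence #,Event ID,Message,AuxLog
1,E,3,System,06/19/2017 03:00:20.36,416,806F002130010902,"Fault in slot 2 on system System x3650 M4.",[S.2018001] An uncorrected PCIe error has occurred on Bus 0020 Device 00 Function 00. The Vendor ID for the device is 10DF and the Device ID is F0E5.
3,E,3,System,06/19/2017 03:00:15.975,414,806F08132582FFFF,"An Uncorrectable Bus Error has occurred on bus PCIs.",
'''

# Assumed wording of the AuxLog text, based on entry 1 only.
PCIE_ERROR = re.compile(
    r"Bus (?P<bus>[0-9A-F]+) Device (?P<device>[0-9A-F]+) "
    r"Function (?P<function>[0-9A-F]+)\. The Vendor ID for the device is "
    r"(?P<vendor_id>[0-9A-F]{4}) and the Device ID is (?P<device_id>[0-9A-F]{4})"
)


def pcie_faults(log_text):
    """Return one dict per log row whose AuxLog names a PCIe device."""
    faults = []
    for row in csv.DictReader(io.StringIO(log_text)):
        match = PCIE_ERROR.search(row.get("AuxLog") or "")
        if match:
            faults.append({"date": row["Date"], **match.groupdict()})
    return faults


if __name__ == "__main__":
    for fault in pcie_faults(LOG):
        print(fault)
```

Run against the full log, this should list the same device (bus 0020, vendor 10DF, device F0E5) on both 06/13/2017 and 06/19/2017, which shows the fault recurs on a single adapter. Vendor ID 10DF should correspond to Emulex in the public pci.ids list; it is worth confirming there.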

报错1.jpg

报错2.jpg

What is causing this, and how should it be handled?

3 answers

YTXYGG, System Architect, 北京亿通新月

Disable the card's ROM in the BIOS. If that does not help, replace the card.

 2018-03-28
Views: 8566
lipeng9239, System Operations Engineer, 北京智控美信

Open Hardware Information to see which adapter is installed in slot 2, then remove the slot 2 adapter and reboot.
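
If the server runs Linux, the same adapter can also be looked up from the operating system. The sketch below is an assumption-laden helper, not something taken from the thread: it assumes the `lspci` utility from pciutils is available and that the bus number in the IMM log matches the one the OS enumerates. The function names are hypothetical. It only builds and prints the commands; it does not run them.

```python
def lspci_address(bus, device, function):
    """Convert the hex strings from the IMM log to lspci's bus:device.function form."""
    return f"{int(bus, 16):02x}:{int(device, 16):02x}.{int(function, 16):x}"


def lspci_commands(bus, device, function, vendor_id, device_id):
    """Build two lookups: one by PCI address, one by vendor:device ID."""
    return [
        ["lspci", "-nn", "-s", lspci_address(bus, device, function)],
        ["lspci", "-nn", "-d", f"{vendor_id.lower()}:{device_id.lower()}"],
    ]


if __name__ == "__main__":
    # Values taken from the AuxLog text of the slot 2 fault.
    for cmd in lspci_commands("0020", "00", "00", "10DF", "F0E5"):
        print(" ".join(cmd))
```

Looking up by vendor and device ID is the more robust of the two, since it does not depend on bus numbering being identical between the firmware log and the OS.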

 2018-03-28
Views: 8412
  • Slot 2 holds an HBA card.
    2018-03-28
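  • lipeng9239 replying to lp0618
    Pull the HBA card. If the system then boots and the IMM reports no errors, the card is faulty. Also try reinstalling it in a different slot; if the error is still reported, suspect the I/O board, the riser card, and other parts inside the chassis.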
    2018-03-28
songdeyong, System Engineer, 北京翰海五洲公司

Check whether the card in slot 2 is the problem.

 2018-03-28
Views: 8737

Asked by

lp0618, System Engineer, 山西中天云泰有限公司

Question status

  • Posted: 2018-03-28
  • Followers: 4
  • Question views: 12327
  • Latest answer: 2018-03-28