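E850 host with three HBA cards. Two of them connect through an EMC 8510 switch to Vmax40K and EMC 400F storage.
One day one of the HBA cards started reporting errors, as follows: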
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
DC73C03A 0816223816 T S fscsi0 SOFTWARE PROGRAM ERROR
DC73C03A 0816223816 T S fscsi0 SOFTWARE PROGRAM ERROR
DC73C03A 0816223816 T S fscsi0 SOFTWARE PROGRAM ERROR
DC73C03A 0816223816 T S fscsi0 SOFTWARE PROGRAM ERROR
ECCE4018 0816210116 T S fcs2 SOFTWARE PROGRAM ERROR
DCB47997 0816132016 T H hdisk96 DISK OPERATION ERROR
DCB47997 0815115716 T H hdisk94 DISK OPERATION ERROR
DCB47997 0815115616 T H hdisk50 DISK OPERATION ERROR
DC73C03A 0812205716 T S fscsi0 SOFTWARE PROGRAM ERROR
DC73C03A 0812205716 T S fscsi0 SOFTWARE PROGRAM ERROR
DC73C03A 0812205616 T S fscsi0 SOFTWARE PROGRAM ERROR
DC73C03A 0812205616 T S fscsi0 SOFTWARE PROGRAM ERROR
DC73C03A 0812205616 T S fscsi0 SOFTWARE PROGRAM ERROR
DC73C03A 0812205616 T S fscsi0 SOFTWARE PROGRAM ERROR
Detailed view of the error:
LABEL: FCP_ERR6
IDENTIFIER: DC73C03A
Date/Time: Tue Aug 16 22:38:40 CST 2016
Sequence Number: 1733
Machine Id: 00FA17FA4C00
Node Id: sop01
Class: S
Type: TEMP
WPAR: Global
Resource Name: fscsi0
Description
SOFTWARE PROGRAM ERROR
Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES
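I called the 400 support hotline to open a case and collected all sorts of logs: host, SAN switch, and storage. IBM and EMC kept passing the buck, each insisting the other side was the cause. After a round of back-and-forth, it was finally decided that IBM would replace the HBA card. After the replacement, fscsi0 did indeed stop reporting errors.
Unexpectedly, a few days later another HBA card connected to the storage joined in: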
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
DC73C03A 0916202416 T S fscsi2 SOFTWARE PROGRAM ERROR
DC73C03A 0916202416 T S fscsi2 SOFTWARE PROGRAM ERROR
DC73C03A 0916202316 T S fscsi2 SOFTWARE PROGRAM ERROR
DC73C03A 0916202316 T S fscsi2 SOFTWARE PROGRAM ERROR
DC73C03A 0916202016 T S fscsi2 SOFTWARE PROGRAM ERROR
DC73C03A 0916202016 T S fscsi2 SOFTWARE PROGRAM ERROR
DC73C03A 0916201916 T S fscsi2 SOFTWARE PROGRAM ERROR
DC73C03A 0916201916 T S fscsi2 SOFTWARE PROGRAM ERROR
DC73C03A 0916201316 T S fscsi2 SOFTWARE PROGRAM ERROR
DC73C03A 0916201316 T S fscsi2 SOFTWARE PROGRAM ERROR
DC73C03A 0916201316 T S fscsi2 SOFTWARE PROGRAM ERROR
DC73C03A 0916201316 T S fscsi2 SOFTWARE PROGRAM ERROR
DC73C03A 0916194716 T S fscsi2 SOFTWARE PROGRAM ERROR
DC73C03A 0916194716 T S fscsi2 SOFTWARE PROGRAM ERROR
Details:
LABEL: FCP_ERR6
IDENTIFIER: DC73C03A
Date/Time: Fri Sep 16 20:24:11 CST 2016
Sequence Number: 3436
Machine Id: 00FA17FA4C00
Node Id: sop01
Class: S
Type: TEMP
WPAR: Global
Resource Name: fscsi2
Description
SOFTWARE PROGRAM ERROR
Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES
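I opened a case with IBM, and they want to replace the card again...
Could an expert please advise? I bought eight E850s in this batch, so why is it only this one that has two cards failing? The odds of that seem off. Could something else be causing it? This host runs critical workloads and taking it down to swap a card is a lot of trouble. I'm worried that after this replacement the third HBA card will start reporting errors too... What kind of error is DC73C03A exactly?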
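Was the first card replacement initiated by IBM, or was it requested by the bank? If IBM initiated it, did the vendor explain what the problem was? If it were a batch problem, all of the cards should have been replaced.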
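You are even running the EMC 400F. The E850 and the 400F are both recently released products, so a compatibility issue cannot be ruled out. Keep working with the vendors to resolve it.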
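
Before going back to the vendors, it may help to tally the errpt summaries by resource, identifier, and day, to see whether the errors cluster on one path or at particular times. Below is a minimal Python sketch of that, not an official tool. The function names (`parse_errpt_summary`, `count_by_resource_and_day`) are made up for illustration, the sample lines are copied from the listings in this thread, and the code assumes the TIMESTAMP column is MMDDhhmmYY with years in the 2000s.

```python
from collections import Counter
from datetime import datetime

# A few lines copied from the errpt summaries in this thread.
SAMPLE = """IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
DC73C03A 0816223816 T S fscsi0 SOFTWARE PROGRAM ERROR
DC73C03A 0816223816 T S fscsi0 SOFTWARE PROGRAM ERROR
ECCE4018 0816210116 T S fcs2 SOFTWARE PROGRAM ERROR
DCB47997 0816132016 T H hdisk96 DISK OPERATION ERROR
DC73C03A 0916202416 T S fscsi2 SOFTWARE PROGRAM ERROR
"""


def parse_errpt_summary(text):
    """Parse errpt summary lines into dicts (hypothetical helper).

    The header line and any line that does not look like a summary
    entry are skipped rather than raising an error.
    """
    entries = []
    for line in text.splitlines():
        # Five whitespace-separated fields, then the free-text description.
        parts = line.split(None, 5)
        if len(parts) < 6 or parts[0] == "IDENTIFIER":
            continue
        ident, stamp, etype, eclass, resource, desc = parts
        if len(stamp) != 10 or not stamp.isdigit():
            continue
        # Assumption: TIMESTAMP is MMDDhhmmYY and the century is 20xx.
        mm, dd, hh, mi, yy = (int(stamp[i:i + 2]) for i in range(0, 10, 2))
        try:
            when = datetime(2000 + yy, mm, dd, hh, mi)
        except ValueError:
            continue
        entries.append({
            "identifier": ident,
            "when": when,
            "type": etype,
            "class": eclass,
            "resource": resource,
            "description": desc.strip(),
        })
    return entries


def count_by_resource_and_day(entries):
    """Count entries per (resource, identifier, ISO date)."""
    return Counter(
        (e["resource"], e["identifier"], e["when"].date().isoformat())
        for e in entries
    )


if __name__ == "__main__":
    counts = count_by_resource_and_day(parse_errpt_summary(SAMPLE))
    for (resource, ident, day), n in sorted(counts.items()):
        print(resource, ident, day, n)
```

Feeding it the full errpt output from all eight E850s would show whether DC73C03A only ever appears on this host's fscsi devices, which is useful evidence when arguing hardware versus compatibility with IBM and EMC.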