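Hi everyone,
I have a system: P595, AIX 5306, active/standby HA 5.4, with HDS VSP storage. Yesterday there was a problem: the primary node failed over because of a SAN network issue, and two stale paths (HDLM multipath) appeared on the primary node, one of which could not be recovered. So I deleted all the disks, removed the stale paths, and rebooted the host. The path status is now normal everywhere.
But an even stranger problem has appeared: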
#lspv
hdisk0 00c7b19f0313f4eb rootvg active
hdisk1 00c7b19f2dd00770 rootvg active
hdisk2 none None
hdisk3 none None
hdisk4 none None
hdisk5 none None
hdisk6 none None
hdisk7 none None
hdisk8 none None
hdisk9 none None
hdisk10 none None
hdisk11 none None
hdisk12 none None
hdisk13 none None
hdisk14 none None
hdisk15 none None
hdisk16 none None
hdisk17 none None
hdisk18 none None
hdisk19 none None
hdisk20 none None
hdisk21 none None
hdisk22 none None
hdisk23 none None
hdisk24 none None
hdisk25 none None
hdisk26 none None
hdisk27 none None
hdisk28 none None
hdisk29 none None
hdisk30 none None
hdisk31 none None
hdisk32 none None
hdisk33 none None
dlmfdrv0 none None
dlmfdrv1 none None
dlmfdrv2 none None
dlmfdrv3 none None
dlmfdrv4 none None
dlmfdrv5 none None
dlmfdrv6 none None
dlmfdrv7 none None
dlmfdrv8 none None
dlmfdrv9 none None
dlmfdrv10 none None
dlmfdrv11 none None
dlmfdrv12 none None
dlmfdrv13 none None
dlmfdrv14 none None
dlmfdrv15 none None
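For a quick summary of the listing above, a sketch like the following could count how many devices report no PVID. It only parses text in the `lspv` format shown in this post; the function name `count_no_pvid` is made up for illustration.

```shell
# Hypothetical helper: count devices whose PVID column is "none"
# in lspv-style output read from stdin.
count_no_pvid() {
    awk '$2 == "none" { n++ } END { print n + 0 }'
}

# On the AIX host this would presumably be used as:
#   lspv | count_no_pvid
```

On the failed node this would count every hdisk and dlmfdrv device except the two rootvg disks.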
#cd /usr/D*/bin
#./dlnkmgr view -lu
Product : USP
SerialNumber : 0017109
LUs : 16
iLU HDevName Device PathID Status
0010 dlmfdrv0 hdisk2 000000 Online
hdisk18 000001 Online
0011 dlmfdrv1 hdisk3 000002 Online
hdisk19 000003 Online
0012 dlmfdrv2 hdisk4 000004 Online
hdisk20 000005 Online
0013 dlmfdrv3 hdisk5 000006 Online
hdisk21 000007 Online
0014 dlmfdrv4 hdisk6 000008 Online
hdisk22 000009 Online
0015 dlmfdrv5 hdisk7 000010 Online
hdisk23 000011 Online
0016 dlmfdrv6 hdisk8 000012 Online
hdisk24 000013 Online
0017 dlmfdrv7 hdisk9 000014 Online
hdisk25 000015 Online
0018 dlmfdrv8 hdisk10 000016 Online
hdisk26 000017 Online
0019 dlmfdrv9 hdisk11 000018 Online
hdisk27 000019 Online
001A dlmfdrv10 hdisk12 000020 Online
hdisk28 000021 Online
001B dlmfdrv11 hdisk13 000022 Online
hdisk29 000023 Online
001C dlmfdrv12 hdisk14 000024 Online
hdisk30 000025 Online
001D dlmfdrv13 hdisk15 000026 Online
hdisk31 000027 Online
001E dlmfdrv14 hdisk16 000028 Online
hdisk32 000029 Online
001F dlmfdrv15 hdisk17 000030 Online
hdisk33 000031 Online
KAPL01001-I The HDLM command completed normally. Operation name = view, completion time = 2015/04/23 11:20:37
#
Here is the problem.
#chdev -l dlmfdrv0 -a pv=yes
Method error (/usr/lib/methods/chgdlmfdrv):
0514-010 Error returned from odm_run_method.
pv
pv
nmykc2#lquerypv -h /dev/dlmfdrv0
nmykc2#lquerypv -h /dev/hdisk0 --- system disk
00000000 C9C2D4C1 00000000 00000000 00000000 |................|
00000010 00000000 00000000 00000000 00000000 |................|
00000020 00000000 00000000 00000000 00000000 |................|
00000030 00000000 00000000 00000000 00000000 |................|
00000040 00000000 00000000 00000000 00000000 |................|
00000050 00000000 00000000 00000000 00000000 |................|
00000060 00000000 00000000 00000000 00000000 |................|
00000070 00000000 00000000 00000000 00000000 |................|
00000080 00C7B19F 0313F4EB 00000000 00000000 |................|
00000090 00000000 00000000 00000000 00000000 |................|
000000A0 00000000 00000000 00000000 00000000 |................|
000000B0 00000000 00000000 00000000 00000000 |................|
000000C0 00000000 00000000 00000000 00000000 |................|
000000D0 00000000 00000000 00000000 00000000 |................|
000000E0 00000000 00000000 00000000 00000000 |................|
000000F0 00000000 00000000 00000000 00000000 |................|
#lquerypv -h /dev/hdisk2
#
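In the hdisk0 dump above, the PVID sits at offset 0x80 (00C7B19F 0313F4EB, the same value `lspv` shows for hdisk0), while the same command against dlmfdrv0 and hdisk2 prints nothing at all. A minimal sketch for pulling the PVID out of such a dump, assuming the layout matches the hdisk0 output; `pvid_from_dump` is a hypothetical name:

```shell
# Hypothetical helper: extract the PVID from `lquerypv -h` output on stdin.
# Assumes the PVID is the first 8 bytes on the line at offset 0x80,
# as in the hdisk0 dump in this post.
pvid_from_dump() {
    awk '$1 == "00000080" { print tolower($2 $3); found = 1 }
         END { if (!found) print "no-header" }'
}

# On the AIX host this would presumably be used as:
#   lquerypv -h /dev/hdisk0 | pvid_from_dump
```

With empty input, as returned for dlmfdrv0 and hdisk2 here, the sketch prints `no-header` instead of a PVID.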
=====================================Healthy node===================================
The workload is currently running on the other node, and the application is working normally.
dlmfdrv0 00c7b18f47a6c899 archive_vg active
dlmfdrv1 00c7b18f47b1649a ykcdata_vg01 active
dlmfdrv2 00c7b18f47b165fc ykcdata_vg01 active
dlmfdrv3 00c7b18f47efacd1 ykcdata_vg02 active
dlmfdrv4 00c7b18f47efae54 ykcdata_vg02 active
dlmfdrv5 00c7b18f4805bd51 onlinebackup_vg active
dlmfdrv6 00c7b18f4805bf99 onlinebackup_vg active
dlmfdrv7 00c7b18f4805c1c9 onlinebackup_vg active
dlmfdrv8 none None
dlmfdrv9 none None
dlmfdrv10 none None
dlmfdrv11 none None
hdisk28 none None
hdisk29 none None
hdisk26 none None
hdisk27 none None
dlmfdrv12 00c7b18f9ab73d16 testvg
dlmfdrv13 00c7b18f9ab75803 testvg
hdisk30 none None
hdisk31 none None
hdisk32 none None
hdisk33 none None
hdisk34 none None
hdisk35 none None
hdisk36 none None
hdisk37 none None
dlmfdrv14 00c7b19f766ff818 onlinebackup_vg active
dlmfdrv15 00c7b19f76718518 onlinebackup_vg active
dlmfdrv16 00c7b19f7672e945 onlinebackup_vg active
dlmfdrv17 00c7b19f76730dac archive_vg active
VGDA on the healthy node:
#lqueryvg -Atp dlmfdrv14
Max LVs: 512
PP Size: 28
Free PPs: 415
LV count: 2
PV count: 6
Total VGDAs: 6
Conc Allowed: 0
MAX PPs per PV 32768
MAX PVs: 1024
Quorum (disk): 0
Quorum (dd): 0
Auto Varyon ?: 0
Conc Autovaryo 0
Varied on Conc 0
Logical: 00c7b18f00004c00000001244805d5e0.1 lvonlinebak 1
00c7b18f00004c00000001244805d5e0.2 loglv01 1
Physical: 00c7b18f4805bd51 1 0
00c7b18f4805bf99 1 0
00c7b18f4805c1c9 1 0
00c7b19f766ff818 1 0
00c7b19f76718518 1 0
00c7b19f7672e945 1 0
Total PPs: 3216
LTG size: 128
HOT SPARE: 0
AUTO SYNC: 0
VG PERMISSION: 0
SNAPSHOT VG: 0
IS_PRIMARY VG: 0
PSNFSTPP: 7680
VARYON MODE: 0
VG Type: 2
Max PPs: 32768
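The `Physical:` section of the VGDA above lists the PVIDs the volume group expects, and all six of them appear in the healthy node's device listing for onlinebackup_vg. A rough sketch for extracting that list from `lqueryvg -Atp` output so it can be compared against `lspv`; it assumes the output format shown here, and `vgda_pvids` is a hypothetical name:

```shell
# Hypothetical helper: print the PVIDs listed under "Physical:" in
# `lqueryvg -Atp` output read from stdin.
# Assumes each PVID is a 16-digit hex string in the first column of the
# continuation lines, as in the output shown in this post.
vgda_pvids() {
    awk '
        $1 == "Physical:" { in_pv = 1; print $2; next }
        in_pv && $1 ~ /^[0-9a-fA-F]+$/ && length($1) == 16 { print $1; next }
        { in_pv = 0 }
    '
}

# On the healthy node this would presumably be used as:
#   lqueryvg -Atp dlmfdrv14 | vgda_pvids
```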
----Failed node
#lqueryvg -Atp dlmfdrv14
0516-304 lqueryvg: Unable to find device id dlmfdrv14 in the Device
Configuration Database.
0516-024 lqueryvg: Unable to open physical volume.
Either PV was not configured or could not be opened. Run
diagnostics.