Basic approach to troubleshooting vSAN issues
1. Log in to vCenter (VC)
Log in through the browser-based vSphere Web Client.
2. Check vSAN health
In vCenter, select the vSAN cluster -> Monitor -> Virtual SAN -> Health, and check whether any errors are reported.
3. Check the logs
grep -iE 'error|reset|congestion' /var/log/vmkernel.log
grep -iE 'error|reset|congestion' /var/log/vobd.log
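To see at a glance which keyword dominates a log, the matches can be counted per pattern. The following is a minimal sketch: the helper name `count_log_hits` is hypothetical (not an ESXi command), and it assumes a POSIX shell with `grep`, such as the ESXi shell.

```shell
# count_log_hits FILE: print how many lines of FILE match each keyword,
# case-insensitively. Hypothetical helper, not part of ESXi.
count_log_hits() {
  file=$1
  for kw in error reset congestion; do
    # grep -c prints the number of matching lines. It exits non-zero when
    # the count is 0 or the file is unreadable, so the status is ignored.
    n=$(grep -ci -- "$kw" "$file" 2>/dev/null || true)
    echo "$kw: ${n:-0}"
  done
}
```

Example call: count_log_hits /var/log/vmkernel.log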
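4. Check the network
Use the following commands to rule out network problems (replace vmk1 and 9.2.8.154 with the vSAN VMkernel interface and the vSAN IP of a peer host in your environment):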
esxcfg-nics -l
esxcfg-vmknic -l
esxcli network ip interface ipv4 get
vmkping -I vmk1 9.2.8.154
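When a cluster has several hosts, the vmkping check can be repeated for every peer. The following is a sketch under stated assumptions: `check_vsan_peers` is a hypothetical helper, the interface and IP addresses are placeholders, and the `PING_CMD` variable exists only so the function can be tried off-host (it defaults to `vmkping`).

```shell
# check_vsan_peers IFACE IP...: ping each peer through the given VMkernel
# interface and print OK or FAIL per address. Hypothetical helper.
# PING_CMD defaults to vmkping; overriding it is for off-host testing only.
check_vsan_peers() {
  iface=$1
  shift
  failed=0
  for ip in "$@"; do
    # -I selects the VMkernel interface, -c limits the number of pings.
    if ${PING_CMD:-vmkping} -I "$iface" -c 3 "$ip" >/dev/null 2>&1; then
      echo "OK   $ip"
    else
      echo "FAIL $ip"
      failed=1
    fi
  done
  # Return non-zero if at least one peer did not answer.
  return $failed
}
```

Example call: check_vsan_peers vmk1 9.2.8.154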
5. Check vSAN storage
esxcli vsan cluster get
esxcli vsan storage list | grep CMMD
esxcli vsan storage list | grep SSD
vdq -iH
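Rather than reading the grep output by eye, the disks that have dropped out of CMMDS can be listed by name. This is a minimal sketch: `not_in_cmmds` is a hypothetical helper, and it assumes that `esxcli vsan storage list` prints a "Device:" line and an "In CMMDS:" line for each disk.

```shell
# not_in_cmmds: read `esxcli vsan storage list` output on stdin and print
# the device name of every disk whose "In CMMDS" field is not "true".
# Hypothetical helper; the "Device:" / "In CMMDS:" layout is an assumption.
not_in_cmmds() {
  awk '
    # Remember the device name of the disk currently being described.
    $1 == "Device:" { dev = $2 }
    # Report the remembered device when its CMMDS flag is not true.
    $1 == "In" && $2 == "CMMDS:" && $3 != "true" { print dev }
  '
}
```

Example call: esxcli vsan storage list | not_in_cmmds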
6. Check overall vSAN congestion
for ssd in $(localcli vsan storage list | grep "Group UUID" | awk '{print $5}' | sort -u); do
  echo $ssd
  vsish -e get /vmkModules/vsan/lsom/disks/$ssd/info | grep Congestion
done
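On a host with many disk groups, most congestion counters are zero and only the non-zero ones matter. The filter below is a sketch, not a verified parser: `nonzero_congestion` is a hypothetical helper, and it assumes that vsish prints each counter as a `name:value` pair on its own line.

```shell
# nonzero_congestion: read vsish "info" output on stdin and print only the
# Congestion counters whose value is greater than zero.
# Hypothetical helper; the "name:value" line format is an assumption.
nonzero_congestion() {
  awk -F: '/Congestion/ && $2 + 0 > 0 {
    sub(/^[ \t]+/, "")   # drop leading indentation
    print
  }'
}
```

Example call: vsish -e get /vmkModules/vsan/lsom/disks/$ssd/info | nonzero_congestion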
7. Check vSAN congestion details
for ssd in $(localcli vsan storage list | grep "Group UUID" | awk '{print $5}' | sort -u); do
  llogTotal=$(vsish -e get /vmkModules/vsan/lsom/disks/$ssd/info | grep "Log space consumed by LLOG" | awk -F: '{print $2}')
  plogTotal=$(vsish -e get /vmkModules/vsan/lsom/disks/$ssd/info | grep "Log space consumed by PLOG" | awk -F: '{print $2}')
  llogGib=$(echo $llogTotal | awk '{print $1 / 1073741824}')
  plogGib=$(echo $plogTotal | awk '{print $1 / 1073741824}')
  allGibTotal=$(expr $llogTotal \+ $plogTotal | awk '{print $1 / 1073741824}')
  echo $ssd
  echo " LLOG consumption: $llogGib"
  echo " PLOG consumption: $plogGib"
  echo " Total log consumption: $allGibTotal"
done
8. Unmount the faulty disk group
localcli vsan storage diskgroup unmount --ssd=naa.55cdddddffffffff
9. Reboot ESXi
Decide from the nature of the fault whether the ESXi host needs to be rebooted.
10. Verify
Confirm that the ESXi host and the VMs have recovered.