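I had been focusing entirely on the heartbeat, but it looks like other factors are involved. Please take a look at the errors from the two hosts below. The cause may be the scheduled exp job:
When I maintained this system a year ago it crashed at irregular intervals, and fixing the heartbeat resolved it at the time. I have just taken it over again and was told that over the last two months it has started rebooting frequently. Here are the logs:
Node 1: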
[rcy55a01][root][/oraexp]#errpt
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
A6DF45AA 0616022013 I O RMCdaemon The daemon is started.
2BFA76F6 0616021813 T S SYSPROC SYSTEM SHUTDOWN BY USER
9DBCFDEE 0616021913 T O errdemon ERROR LOGGING TURNED ON
EC0BCCD4 0614151113 T H ent1 ETHERNET DOWN
F3931284 0614150913 I H ent1 ETHERNET NETWORK RECOVERY MODE
F3931284 0614150713 I H ent3 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0614150713 T H ent3 ETHERNET DOWN
EC0BCCD4 0614145713 T H ent1 ETHERNET DOWN
F3931284 0614145713 I H ent1 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0614145713 T H ent1 ETHERNET DOWN
F3931284 0614145613 I H ent1 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0614145413 T H ent1 ETHERNET DOWN
F3931284 0614145413 I H ent1 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0614145413 T H ent1 ETHERNET DOWN
F3931284 0614145313 I H ent1 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0614145213 T H ent1 ETHERNET DOWN
F3931284 0614145213 I H ent1 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0614145213 T H ent1 ETHERNET DOWN
F3931284 0614145113 I H ent1 ETHERNET NETWORK RECOVERY MODE
F3931284 0614144913 I H ent3 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0614144813 T H ent3 ETHERNET DOWN
A6DF45AA 0614022213 I O RMCdaemon The daemon is started.
EC0BCCD4 0614022213 T H ent1 ETHERNET DOWN
2BFA76F6 0614022013 T S SYSPROC SYSTEM SHUTDOWN BY USER
9DBCFDEE 0614022213 T O errdemon ERROR LOGGING TURNED ON
Node 2:
[rcy55a02][root][/]#errpt
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
F3931284 0616021913 I H ent0 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0616021913 T H ent0 ETHERNET DOWN
F3931284 0616021713 I H ent0 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0616021713 T H ent0 ETHERNET DOWN
F3931284 0616021713 I H ent0 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0616021713 T H ent0 ETHERNET DOWN
F3931284 0616021713 I H ent0 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0616021713 T H ent0 ETHERNET DOWN
F3931284 0616021713 I H ent0 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0616021713 T H ent0 ETHERNET DOWN
F3931284 0616021713 I H ent0 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0616021713 T H ent0 ETHERNET DOWN
F3931284 0614150713 I H ent3 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0614150713 T H ent3 ETHERNET DOWN
F3931284 0614144913 I H ent3 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0614144813 T H ent3 ETHERNET DOWN
F3931284 0614022413 I H ent3 ETHERNET NETWORK RECOVERY MODE
F3931284 0614022413 I H ent0 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0614022413 T H ent3 ETHERNET DOWN
EC0BCCD4 0614022413 T H ent0 ETHERNET DOWN
F3931284 0614022213 I H ent0 ETHERNET NETWORK RECOVERY MODE
F3931284 0614022213 I H ent3 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0614022213 T H ent0 ETHERNET DOWN
F3931284 0614022213 I H ent0 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0614022213 T H ent0 ETHERNET DOWN
F3931284 0614022213 I H ent0 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0614022213 T H ent0 ETHERNET DOWN
F3931284 0614022213 I H ent0 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0614022213 T H ent0 ETHERNET DOWN
F3931284 0614022213 I H ent0 ETHERNET NETWORK RECOVERY MODE
EC0BCCD4 0614022213 T H ent3 ETHERNET DOWN
EC0BCCD4 0614022213 T H ent0 ETHERNET DOWN
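Shortly after 02:00, the time seen in the logs above, a scheduled job runs: an Oracle EXP followed by GZIP. The crashes happened exactly during the GZIP step. The job runs only on Mondays, Fridays, and Sundays, and those are exactly the days with crashes.
I have only just taken this system over again, so I will keep watching it. I have removed the GZIP step from the scheduled job and will observe for a while. Hopefully that solves the problem.
If that really does not work, I am considering running the heartbeat through a switch, though I am not sure whether that would work any better.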
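To double-check that the shutdowns line up with the job's schedule, here is a minimal Python sketch that decodes the errpt TIMESTAMP column (MMDDhhmmYY) and prints the weekday. The two timestamps are the SYSTEM SHUTDOWN BY USER entries copied from the node 1 log above. The function names are my own, purely for illustration.

```python
from datetime import datetime

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def parse_errpt_timestamp(stamp: str) -> datetime:
    """Decode an errpt TIMESTAMP field, which is laid out as MMDDhhmmYY."""
    return datetime.strptime(stamp, "%m%d%H%M%y")

def weekday_of(stamp: str) -> str:
    """Return a locale-independent weekday abbreviation for the timestamp."""
    return WEEKDAYS[parse_errpt_timestamp(stamp).weekday()]

# SYSTEM SHUTDOWN BY USER entries from the node 1 errpt output
shutdowns = ["0616021813", "0614022013"]

for stamp in shutdowns:
    when = parse_errpt_timestamp(stamp)
    print(stamp, when.strftime("%Y-%m-%d %H:%M"), weekday_of(stamp))
```

Both shutdowns fall a little after 02:00 on a Sunday and a Friday, which is consistent with the job's schedule. Note that the ETHERNET DOWN flapping on 06/14 between 14:48 and 15:11 is outside that window, so the job may not explain everything.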