
WebSphere cluster node cannot start after being stopped

My WAS environment:
Name: IBM WebSphere Application Server Network Deployment
Version: 8.5.5.1


One node in the cluster was restarted yesterday. Since the restart, SystemOut.log keeps reporting the messages below. SystemErr.log is empty, with no errors.
[6/5/14 12:56:05:669 CST] 00000050 Peer          I   ODCF8531I: Added neighbor ip=172.16.4.251 udp=11017 tcp=11018 ID=ba5ba277e25db81569ca2e04106d34f18cebca4e version=0;cellName=sezhun_was01Cell01;bridgedCells=[];structuredGateway=false;properties={inOdc=1, epoch=1401865634974, MEMBER_STARTUP_TIME=1401865632491, memberName=sezhun_was01Cell01sezhun_was01Node01main2511, MEMBER_VERSION=4}, neighbor set size is now 5.
[6/5/14 12:56:15:268 CST] 00000053 NGUtil$Server I   ASND0002I: Detected server common83 started on node 915envNode01
[6/5/14 12:56:15:578 CST] 00000053 NGUtil$Server I   ASND0002I: Detected server main2511 started on node sezhun_was01Node01
[6/5/14 12:56:15:579 CST] 00000053 NGUtil$Server I   ASND0002I: Detected server nodeagent started on node 915envNode01
[6/5/14 12:56:15:580 CST] 00000053 NGUtil$Server I   ASND0002I: Detected server customer2511 started on node sezhun_was01Node01
[6/5/14 12:56:29:069 CST] 00000053 NGUtil$Server I   ASND0002I: Detected server nodeagent started on node sezhun_was01Node01
[6/5/14 12:56:29:318 CST] 0000003b MbuRmmAdapter I   DCSV1032I: DCS Stack DefaultCoreGroup at Member sezhun_was01Cell01915envNode01house83: Connected a defined member sezhun_was01Cell01sezhun_was01CellManager01dmgr.
[6/5/14 12:56:36:417 CST] 00000052 NGUtil$Server I   ASND0002I: Detected server dmgr started on node sezhun_was01CellManager01
[6/5/14 12:57:00:890 CST] 0000004a RLSHAGroupCal W   CWRLS0030W: Waiting for HAManager to activate recovery processing for local WebSphere server
[6/5/14 12:58:00:894 CST] 0000004a RLSHAGroupCal W   CWRLS0030W: Waiting for HAManager to activate recovery processing for local WebSphere server
[6/5/14 12:59:00:898 CST] 00000048 RLSHAGroupCal W   CWRLS0030W: Waiting for HAManager to activate recovery processing for local WebSphere server
[6/5/14 13:00:00:903 CST] 0000004b RLSHAGroupCal W   CWRLS0030W: Waiting for HAManager to activate recovery processing for local WebSphere server
[6/5/14 13:01:00:907 CST] 0000004a RLSHAGroupCal W   CWRLS0030W: Waiting for HAManager to activate recovery processing for local WebSphere server
[6/5/14 13:02:00:910 CST] 0000004b RLSHAGroupCal W   CWRLS0030W: Waiting for HAManager to activate recovery processing for local WebSphere server
[6/5/14 13:03:00:913 CST] 0000004b RLSHAGroupCal W   CWRLS0030W: Waiting for HAManager to activate recovery processing for local WebSphere server
[6/5/14 13:04:00:919 CST] 00000049 RLSHAGroupCal W   CWRLS0030W: Waiting for HAManager to activate recovery processing for local WebSphere server
[6/5/14 13:05:00:923 CST] 0000004b RLSHAGroupCal W   CWRLS0030W: Waiting for HAManager to activate recovery processing for local WebSphere server
[6/5/14 13:05:56:411 CST] 0000004b CoreGroupMemb I   DCSV0008I: DCS Stack DefaultCoreGroup at Member sezhun_was01Cell01915envNode01house83: General information message. Internal details: Failed to Form Inital View. Number of tries= 4 [sezhun_was01Cell01915envNode01batch_server1 sezhun_was01Cell01915envNode01common83 sezhun_was01Cell01915envNode01house83 sezhun_was01Cell01915envNode01nodeagent sezhun_was01Cell01sezhun_was01CellManager01dmgr sezhun_was01Cell01sezhun_was01Node01customer2511 sezhun_was01Cell01sezhun_was01Node01main2511 sezhun_was01Cell01sezhun_was01Node01nodeagent]
[6/5/14 13:06:00:926 CST] 0000004b RLSHAGroupCal W   CWRLS0030W: Waiting for HAManager to activate recovery processing for local WebSphere server
[6/5/14 13:07:00:929 CST] 0000004b RLSHAGroupCal W   CWRLS0030W: Waiting for HAManager to activate recovery processing for local WebSphere server
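To read a log like the one above more quickly, a small script can count the CWRLS0030W warnings and compare the members listed in the "Failed to Form Inital View" message with the servers reported as started by ASND0002I. This is only a rough sketch: the function name `summarize` is made up for illustration, and it assumes that a core group member name is the cell name, node name, and server name concatenated, which is inferred from the log lines above rather than taken from documentation. A member that was never detected may simply be stopped, so the result is a hint for where to look, not a diagnosis.

```python
import re

# "ASND0002I: Detected server <server> started on node <node>"
STARTED_RE = re.compile(r"ASND0002I: Detected server (\S+) started on node (\S+)")
# The member that wrote the DCS message, e.g. "... at Member <name>:"
LOCAL_RE = re.compile(r"DCS Stack \S+ at Member (\S+?):")
# Bracketed member list at the end of the message ("Inital" is spelled as in the log)
VIEW_RE = re.compile(r"Failed to Form Inital View\..*\[([^\]]*)\]")


def summarize(lines, cell_name):
    """Return (CWRLS0030W count, sorted members in the view list never detected)."""
    started, defined, local = set(), set(), set()
    warnings = 0
    for line in lines:
        if "CWRLS0030W" in line:
            warnings += 1
        match = STARTED_RE.search(line)
        if match:
            server, node = match.groups()
            # Assumption: member name = cell name + node name + server name
            started.add(cell_name + node + server)
        match = LOCAL_RE.search(line)
        if match:
            local.add(match.group(1))
        match = VIEW_RE.search(line)
        if match:
            defined.update(match.group(1).split())
    # The local member never reports itself as "detected", so exclude it
    return warnings, sorted(defined - started - local)
```

It could be called on an open log file, for example `summarize(open("SystemOut.log"), "sezhun_was01Cell01")`, where the cell name is the `cellName` value shown in the ODCF8531I line.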


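I repeatedly restarted the dmgr, the node agents, and the servers, and resynchronized the nodes, all with no effect. I then found http://www-01.ibm.com/support/docview.wss?uid=swg21226704, which describes the same symptoms, but after checking according to the steps there, I found no conflicting ports.
I hope an expert can offer some guidance. It would be even better to hear from someone who has run into a similar problem.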
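The port check mentioned above can also be done mechanically. The sketch below scans the contents of one node's serverindex.xml and reports any port that is used by more than one endpoint. It assumes the usual layout of that file (`serverEntries` elements containing `specialEndpoints` with a nested `endPoint` that has a `port` attribute) and that one file covers one node. The function name `find_port_conflicts` is made up for illustration, and this is not the procedure from the linked IBM page.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def find_port_conflicts(serverindex_xml):
    """Map each port used more than once to the (server, endpoint name) pairs using it."""
    root = ET.fromstring(serverindex_xml)
    users = defaultdict(list)
    for entry in root.iter("serverEntries"):
        server = entry.get("serverName")
        for special in entry.iter("specialEndpoints"):
            name = special.get("endPointName")
            for endpoint in special.iter("endPoint"):
                users[endpoint.get("port")].append((server, name))
    # Keep only ports claimed by two or more endpoints
    return {port: uses for port, uses in users.items() if len(uses) > 1}
```

Passing the file contents as bytes, for example `find_port_conflicts(open(path, "rb").read())`, returns an empty dict when no port is duplicated within that node.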

1 answer

fhqsse220 · Systems operations engineer, Lianjia
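I tried a series of fixes and none of them worked.
Resolution: I had previously deleted a cluster member from a test cluster, and this error is cluster-related, so I deleted that test cluster and its test applications entirely. After a restart, the problem was gone.

There is not much to learn from this incident beyond one conclusion: follow the rules from now on, and do not change the configuration without a reason.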
Internet services · 2014-06-05