Urgent help needed: strange behavior when configuring HACMP 6.1

Hi all,
While deploying HACMP 6.1 recently I ran into the following problem.
Environment: 2 x Power 720 + AIX 7.1 + HACMP 6.1 + DS5020
IP plan (/etc/hosts):

16.0.0.1    aglzdb1_boot1
16.0.0.2    aglzdb2_boot1
15.0.0.1    aglzdb1_boot2
15.0.0.2    aglzdb2_boot2
10.11.31.1  aglzdb1_per  aglzdb1
10.11.31.2  aglzdb2_per  aglzdb2
10.11.31.3  aglzdb_svc
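As a quick sanity check on the plan above, the addresses can be grouped by network. This is only a sketch: the post does not state the netmasks, so the /24 prefix length is an assumption, and `group_by_subnet` is a helper name made up for this example.

```python
import ipaddress
from collections import defaultdict

# The /etc/hosts entries exactly as posted.
HOSTS = """\
16.0.0.1    aglzdb1_boot1
16.0.0.2    aglzdb2_boot1
15.0.0.1    aglzdb1_boot2
15.0.0.2    aglzdb2_boot2
10.11.31.1  aglzdb1_per  aglzdb1
10.11.31.2  aglzdb2_per  aglzdb2
10.11.31.3  aglzdb_svc
"""


def group_by_subnet(hosts_text, prefix_len=24):
    """Map each network (at an ASSUMED prefix length) to the labels on it."""
    groups = defaultdict(list)
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        addr, *names = line.split()
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[str(net)].append(names[0])  # first name is the IP label
    return dict(groups)


if __name__ == "__main__":
    for net, labels in group_by_subnet(HOSTS).items():
        print(net, labels)
```

Under that assumption, the boot1 and boot2 labels of each node land on different networks, and aglzdb_svc shares a network with the two persistent labels.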


After HACMP was configured, synchronization completed OK.
The displayed HACMP configuration is:
Cluster Name: aglz_cluster
Cluster Connection Authentication Mode: Standard
Cluster Message Authentication Mode: None
Cluster Message Encryption: None
Use Persistent Labels for Communication: No
There are 2 node(s) and 2 network(s) defined

NODE aglzdb1:
        Network net_ether_01   
                aglzdb_svc      10.11.31.3
                aglzdb1_boot2   15.0.0.1
                aglzdb1_boot1   16.0.0.1
        Network net_rs232_01   
                aglzdb1_tty0    /dev/tty0

NODE aglzdb2:
        Network net_ether_01   
                aglzdb_svc      10.11.31.3
                aglzdb2_boot1   16.0.0.2
                aglzdb2_boot2   15.0.0.2
        Network net_rs232_01   
                aglzdb2_tty0    /dev/tty0

Resource Group aglz_resource_group
        Startup Policy   Online On Home Node Only
        Fallover Policy  Fallover To Next Priority Node In The List
        Fallback Policy  Fallback To Higher Priority Node In The List
        Participating Nodes      aglzdb1 aglzdb2
        Service IP Label                 aglzdb_svc
        
Total Heartbeats Missed:        0
Cluster Topology Start Time:    10/12/2013 02:22:07

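Symptoms when starting the cluster: startup on both nodes appears to complete OK, with only the NFS warning. On the primary node the service IP 10.11.31.3 (aglzdb_svc) comes up, but within half a minute the VIP disappears and is present on neither node. datavg is never varied on at any point.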

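The logs are below. Someone suggested the VG might be at fault, so I removed the app server and the VG from the resource group, leaving only aglzdb_svc. The cluster then started normally and the VIP did not disappear. Suspecting the VG, I carved a new LUN from the storage, created a new VG on it, and added that VG to the resource group. The VIP again disappeared after a short while, and the new VG was never varied on. With the VG removed, the VIP is fine. The app server was left unconfigured for the rest of the testing.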

Any suggestions would be appreciated.

Part of the startup log from hacmp.out:
HACMP: Additional messages will be logged here as the cluster events are run

:check_for_site_up[+54] [[ high = high ]]
:check_for_site_up[+54] version=1.4
:check_for_site_up[+55] :check_for_site_up[+55] cl_get_path
HA_DIR=es
:check_for_site_up[+57] STATUS=0
:check_for_site_up[+59] set +u
:check_for_site_up[+61] [ ]
:check_for_site_up[+72] exit 0

Oct 12 09:57:27 EVENT START: node_up aglzdb1

:node_up[+137] [[ high = high ]]
:node_up[+137] version=1.10.11.24
:node_up[+139] NFS_CROSS_MOUNT_ENABLE_VAR=
:node_up[+141] export NODENAME=aglzdb1
:node_up[+143] HPS_CMD=/usr/es/sbin/cluster/events/utils/cl_HPS_init
:node_up[+144] typeset -i STATUS=0
:node_up[+145] typeset -i RC=0
:node_up[+148] [[ -z  ]]
:node_up[+150] EMULATE=REAL
:node_up[+153] set -u
:node_up[+155] ((  1 < 1  ))
:node_up[+167] [[ aglzdb1 = aglzdb1 ]]
:node_up[+169] rm -f /usr/es/sbin/cluster/etc/ha_nodehalt.lock
:node_up[+173] [[ 1 -eq 2 ]]
:node_up[+187] [[ REAL == REAL ]]
:node_up[+187] /usr/sbin/rsct/bin/dms/startdms -s topsvcs
Dead Man Switch Enabled
:node_up[+192] [[ FALSE = FALSE ]]
:node_up[+194] echo RG_DEPENDENCY is set to FALSE
RG_DEPENDENCY is set to FALSE
:node_up[+199] set -a
:node_up[+200] clsetenvgrp aglzdb1 node_up
:clsetenvgrp[+50] [[ high = high ]]
:clsetenvgrp[+50] version=1.16
:clsetenvgrp[+52] usingVer=clSetenvgrp
:clsetenvgrp[+57] clSetenvgrp aglzdb1 node_up
executing clSetenvgrp
clSetenvgrp completed successfully
:clsetenvgrp[+58] exit 0
:node_up[+200] eval FORCEDOWN_GROUPS="" RESOURCE_GROUPS="" HOMELESS_GROUPS="" HOMELESS_FOLLOWER_GROUPS="" ERRSTATE_GROUPS="" PRINCIPAL_ACTIONS="" ASSOCIATE_ACTIONS="" AUXILLIARY_ACTIONS=""
:node_up[+200] FORCEDOWN_GROUPS= RESOURCE_GROUPS= HOMELESS_GROUPS= HOMELESS_FOLLOWER_GROUPS= ERRSTATE_GROUPS= PRINCIPAL_ACTIONS= ASSOCIATE_ACTIONS= AUXILLIARY_ACTIONS=
:node_up[+201] RC=0
:node_up[+202] set +a
:node_up[+203] : exit status of clsetenvgrp aglzdb1 node_up is: 0
:node_up[+204] ((  0 != 0  ))
:node_up[+212] rm -f /tmp/.RPCLOCKDSTOPPED
:node_up[+218] process_resources FENCE
:process_resources[2423] [[ high == high ]]
:process_resources[2423] version=1.125
:process_resources[2425] STATUS=0
:process_resources[2426] sddsrv_off=FALSE
:process_resources[2428] true
:process_resources[2430] : call rgpa, and it will tell us what to do next
:process_resources[2432] set -a
:process_resources[2433] clRGPA FENCE
:clRGPA[+49] [[ high = high ]]
:clRGPA[+49] version=1.16
:clRGPA[+51] usingVer=clrgpa
:clRGPA[+56] clrgpa FENCE
2013-10-12T09:57:27.669456 clrgpa
:clRGPA[+57] exit 0
:process_resources[2433] eval JOB_TYPE=NONE
:process_resources[1] JOB_TYPE=NONE
:process_resources[2434] RC=0
:process_resources[2435] set +a
:process_resources[2437] (( 0 != 0 ))
:process_resources[2443] RESOURCE_GROUPS=''
:process_resources[2444] GROUPNAME=''
:process_resources[2444] export GROUPNAME
:process_resources[2748] break
:process_resources[2759] : If sddsrv was turned off above, turn it back on again
:process_resources[2761] [[ FALSE == TRUE ]]
:process_resources[2767] exit 0
:node_up[+228] [[ aglzdb1 = aglzdb1 ]]
:node_up[+228] [[ REAL = EMUL ]]
:node_up[+240] rm -f /usr/es/sbin/cluster/etc/.hacmp_wlm_config_changed
:node_up[+243] cl_wlm_reconfig node_up
:node_up[+243] EMULATE=REAL
:cl_wlm_reconfig[+297] [[ high = high ]]
:cl_wlm_reconfig[+297] version=1.14
:cl_wlm_reconfig[+298] :cl_wlm_reconfig[+298] cl_get_path
HA_DIR=es
:cl_wlm_reconfig[+299] SCD=/usr/es/sbin/cluster/etc/objrepos/stage
:cl_wlm_reconfig[+300] ACD=/usr/es/sbin/cluster/etc/objrepos/active
:cl_wlm_reconfig[+302] EMULATE=REAL
:cl_wlm_reconfig[+304] CALLING_EVENT=node_up
:cl_wlm_reconfig[+306] HA_WLM_CLASSES=
:cl_wlm_reconfig[+308] :cl_wlm_reconfig[+308] awk BEGIN { FS = ":" } $1 !~ /^#.*/ { print $1 }
:cl_wlm_reconfig[+308] /usr/es/sbin/cluster/utilities/clwlmruntime -l -d /usr/es/sbin/cluster/etc/objrepos/active
HA_WLM_CONFIG=HA_WLM_config
:cl_wlm_reconfig[+309] [[ -z HA_WLM_config ]]
:cl_wlm_reconfig[+318] WLM_CONFIG_FILES=classes limits shares rules
:cl_wlm_reconfig[+321] [[ reconfig_resources = node_up ]]
:cl_wlm_reconfig[+326] build_class_list
:cl_wlm_reconfig[build_class_list+4] PRIMARY=
:cl_wlm_reconfig[build_class_list+5] SECONDARY=
:cl_wlm_reconfig[build_class_list+8] GROUP=
:cl_wlm_reconfig[build_class_list+9] NODES=
:cl_wlm_reconfig[build_class_list+10] STARTUP_PREF=
:cl_wlm_reconfig[build_class_list+11] FALLOVER_PREF=
:cl_wlm_reconfig[build_class_list+12] FALLBACK_PREF=
:cl_wlm_reconfig[build_class_list+13] /usr/es/sbin/cluster/utilities/clgetgrp -c
:cl_wlm_reconfig[build_class_list+14] read line
:cl_wlm_reconfig[build_class_list+13] grep -v -E ^#
:cl_wlm_reconfig[build_class_list+16] :cl_wlm_reconfig[build_class_list+16] cut -d: -f1
:cl_wlm_reconfig[build_class_list+16] echo aglz_resource_group::ignore:aglzdb1 aglzdb2:OHN:FNPN:FBHPN: :
GROUP=aglz_resource_group
:cl_wlm_reconfig[build_class_list+17] :cl_wlm_reconfig[build_class_list+17] cut -d: -f4
:cl_wlm_reconfig[build_class_list+17] echo aglz_resource_group::ignore:aglzdb1 aglzdb2:OHN:FNPN:FBHPN: :
NODES=aglzdb1 aglzdb2
:cl_wlm_reconfig[build_class_list+18] :cl_wlm_reconfig[build_class_list+18] cut -d: -f5
:cl_wlm_reconfig[build_class_list+18] echo aglz_resource_group::ignore:aglzdb1 aglzdb2:OHN:FNPN:FBHPN: :
STARTUP_PREF=OHN
:cl_wlm_reconfig[build_class_list+19] :cl_wlm_reconfig[build_class_list+19] cut -d: -f6
:cl_wlm_reconfig[build_class_list+19] echo aglz_resource_group::ignore:aglzdb1 aglzdb2:OHN:FNPN:FBHPN: :
FALLOVER_PREF=FNPN
:cl_wlm_reconfig[build_class_list+20] :cl_wlm_reconfig[build_class_list+20] cut -d: -f7
:cl_wlm_reconfig[build_class_list+20] echo aglz_resource_group::ignore:aglzdb1 aglzdb2:OHN:FNPN:FBHPN: :
FALLBACK_PREF=FBHPN
:cl_wlm_reconfig[build_class_list+20] [[ -z aglz_resource_groupaglzdb1 aglzdb2OHNFNPNFBHPN ]]
:cl_wlm_reconfig[build_class_list+20] [[ OHN = OHN ]]
:cl_wlm_reconfig[build_class_list+20] [[ aglzdb1 = aglzdb1 ]]
:cl_wlm_reconfig[build_class_list+35] PRIMARY= aglz_resource_group
:cl_wlm_reconfig[build_class_list+14] read line
:cl_wlm_reconfig[build_class_list+68] :cl_wlm_reconfig[build_class_list+68] odmget -q group = aglz_resource_group and name = 'WLM_PRIMARY' HACMPresource
:cl_wlm_reconfig[build_class_list+68] sed s/"//g
:cl_wlm_reconfig[build_class_list+68] awk $1 = /value/ { print $3 }
WLM_PRIMARY=
:cl_wlm_reconfig[build_class_list+68] [[ -n  ]]
:cl_wlm_reconfig[+327] [[ -z  ]]
:cl_wlm_reconfig[+329] exit 3
:node_up[+244] WLM_STATUS=3
:node_up[+247] ((  0 == 3  ))
:node_up[+265] :node_up[+265] cl_rrmethods2call ss_load
:cl_rrmethods2call[+49] [[ high = high ]]
:cl_rrmethods2call[+49] version=1.17
:cl_rrmethods2call[+50] :cl_rrmethods2call[+50] cl_get_path
HA_DIR=es
:cl_rrmethods2call[+76] RRMETHODS=
:cl_rrmethods2call[+77] NEED_RR_ENV_VARS=no
:cl_rrmethods2call[+79] [[ aglzdb1 = aglzdb1 ]]
:cl_rrmethods2call[+99] NEED_RR_ENV_VARS=yes
:cl_rrmethods2call[+114] [[ yes = yes ]]
:cl_rrmethods2call[+116] cllsres
:cl_rrmethods2call[+116] 2> /dev/null
:cl_rrmethods2call[+116] eval APPLICATIONS="aglz_server" FILESYSTEM="" FORCED_VARYON="false" FSCHECK_TOOL="fsck" FS_BEFORE_IPADDR="false" RECOVERY_METHOD="sequential" SERVICE_LABEL="aglzdb_svc" SSA_DISK_FENCING="false" VG_AUTO_IMPORT="false" VOLUME_GROUP="datavg"
:cl_rrmethods2call[+116] APPLICATIONS=aglz_server FILESYSTEM= FORCED_VARYON=false FSCHECK_TOOL=fsck FS_BEFORE_IPADDR=false RECOVERY_METHOD=sequential SERVICE_LABEL=aglzdb_svc SSA_DISK_FENCING=false VG_AUTO_IMPORT=false VOLUME_GROUP=datavg
:cl_rrmethods2call[+120] [[ -n  ]]
:cl_rrmethods2call[+125] [[ -n  ]]
:cl_rrmethods2call[+130] [[ -n  ]]
:cl_rrmethods2call[+135] [[ -n  ]]
:cl_rrmethods2call[+140] [[ -n  ]]
:cl_rrmethods2call[+145] echo
:cl_rrmethods2call[+146] exit 0
METHODS=
:node_up[+281] :node_up[+281] odmget -qnodename = aglzdb1 HACMPadapter
:node_up[+281] grep hps
:node_up[+281] grep type
SP_SWITCH=
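The trace above is mostly shell debug output, so a small filter that keeps only the event marker lines makes it easier to see where the sequence stops. This is a rough sketch: only the `EVENT START:` line format is taken from the excerpt; the `COMPLETED` and `FAILED` keywords are assumed to follow the same layout, and `event_lines` is a helper name made up for this example.

```python
import re

# "Oct 12 09:57:27 EVENT START: node_up aglzdb1" is the form seen in the excerpt.
# COMPLETED and FAILED are ASSUMED to use the same layout.
EVENT_RE = re.compile(
    r"^(?P<when>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+"
    r"EVENT\s+(?P<kind>START|COMPLETED|FAILED):\s*(?P<detail>.*)$"
)


def event_lines(log_text):
    """Return (timestamp, kind, detail) for each event marker line."""
    events = []
    for line in log_text.splitlines():
        m = EVENT_RE.match(line.strip())
        if m:
            events.append((m["when"], m["kind"], m["detail"].strip()))
    return events


# A few lines copied from the excerpt above.
SAMPLE = """\
:check_for_site_up[+72] exit 0

Oct 12 09:57:27 EVENT START: node_up aglzdb1

:node_up[+137] [[ high = high ]]
"""

if __name__ == "__main__":
    for when, kind, detail in event_lines(SAMPLE):
        print(when, kind, detail)
```

The excerpt stops inside node_up, so on the full hacmp.out the later events are probably where the release of the service IP and the missing varyon of datavg would show up.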

16 replies from peers:

airbender (Systems Engineer, 太极)
Reply to #15 abit2007: I applied the latest one, SP12.
System Integration · 2013-10-17
abit2007 (Systems Engineer, outsourced maintenance)
The SP09 patch level is stable; strongly recommended.
Internet Services · 2013-10-16
zealotddv (Pre-sales Technical Support, 四川久远银海)
Reply to #13 airbender: You could give them to me, haha.
Software Development · 2013-10-16
airbender (Systems Engineer, 太极)
Who should I give the gold coins to? Such a dilemma.
System Integration · 2013-10-16
满天星 (Systems Engineer, ********)
I suggest exporting and re-importing the VG, and checking that the resource group is configured correctly. If all of that is fine, apply the HA patches; you can go up to SP07.
System Integration · 2013-10-16
airbender (Systems Engineer, 太极)
Reply to #4 q5894201: I pasted the wrong thing. I did not change that setting; it is at the default, no.
System Integration · 2013-10-16
airbender (Systems Engineer, 太极)
Solved.
System Integration · 2013-10-16
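airbender (Systems Engineer, 太极)
It was an HACMP 6.1 patch problem after all. When I last checked, I had not ticked the required SP12 fix, and I was told that the patch was already included in the system.
A reminder to everyone: always apply the patches after installing HACMP 6.1, otherwise all sorts of odd problems will show up.
System Integration · 2013-10-16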
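Since the cause turned out to be a service pack that had not actually been applied, comparing installed fileset levels against the intended level may catch this earlier. The sketch below assumes colon-separated output in the style of `lslpp -Lc`, with the fileset name in the second field and the level in the third. The sample lines are simplified to four columns and their levels are invented for illustration, not taken from this thread. Treating `6.1.0.12` as the level for SP12 is also an assumption, and `parse_levels` and `below` are made-up helper names.

```python
def parse_levels(lslpp_colon_output):
    """Return {fileset: (v, r, m, f)} from colon-separated lslpp-style output."""
    levels = {}
    for line in lslpp_colon_output.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header line
        fields = line.split(":")
        if len(fields) < 3:
            continue
        fileset, level = fields[1].strip(), fields[2].strip()
        try:
            levels[fileset] = tuple(int(part) for part in level.split("."))
        except ValueError:
            continue  # level field was not purely numeric
    return levels


def below(levels, minimum):
    """List the filesets whose level is lower than the wanted minimum."""
    return sorted(name for name, lvl in levels.items() if lvl < minimum)


# HYPOTHETICAL sample input; the levels are not from the thread.
SAMPLE = """\
#Package Name:Fileset:Level:State
cluster.es.server:cluster.es.server.rte:6.1.0.0:C
cluster.es.server:cluster.es.server.events:6.1.0.12:C
"""

if __name__ == "__main__":
    # (6, 1, 0, 12) is an assumed target level for SP12.
    print(below(parse_levels(SAMPLE), (6, 1, 0, 12)))
```

Comparing the levels as integer tuples avoids the string-comparison trap where "6.1.0.9" would sort after "6.1.0.12".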
abit2007 (Systems Engineer, outsourced maintenance)
The original poster has not come back to close the thread yet; waiting for that.
Internet Services · 2013-10-16
arthuryi (Systems Engineer, DXC)
I do not know this.
Internet Services · 2013-10-14

Asked by: airbender (Systems Engineer, 太极)

Posted: 2013-10-13
Last reply: 2013-10-17