AIX RAC其中一个实例的listener异常处理

故障现象:
现象一:crs_stat -t观察RAC数据库其中一个实例的listener无法打开
$ crs_stat -t
Name           Type           Target    State     Host      
------------------------------------------------------------
ora....0A.lsnr application    ONLINE    OFFLINE         
ora.p670a.gsd  application    ONLINE    ONLINE    p670a         
ora.p670a.ons  application    ONLINE    ONLINE    p670a      
ora.p670a.vip  application    ONLINE    ONLINE    p670a      
ora....0B.lsnr application    ONLINE    ONLINE    p670b      
ora.p670b.gsd  application    ONLINE    ONLINE    p670b      
ora.p670b.ons  application    ONLINE    ONLINE    p670b      
ora.p670b.vip  application    ONLINE    ONLINE    p670b      
ora.zhjport.db application    ONLINE    ONLINE    p670b      
ora....port.cs application    ONLINE    ONLINE    p670b      
ora....rt1.srv application    ONLINE    ONLINE    p670a      
ora....rt2.srv application    ONLINE    ONLINE    p670b      
ora....t1.inst application    ONLINE    ONLINE    p670a      
ora....t2.inst application    ONLINE    ONLINE    p670b

现象二:查实例ALERT日志发现实例正常OPEN,本地可以查询,断定是listener监听的问题
Wed Dec  8 08:43:29 2010
Completed: ALTER DATABASE OPEN

处理过程:
一、重启监听
$ lsnrctl stop
……

$ lsnrctl start

LSNRCTL for IBM/AIX RISC System/6000: Version 10.2.0.3.0 - Production on 08-DEC-2010 12:25:19

Copyright (c) 1991, 2006, Oracle.  All rights reserved.

Starting /home/oracle/database/bin/tnslsnr: please wait...

TNSLSNR for IBM/AIX RISC System/6000: Version 10.2.0.3.0 - Production
System parameter file is /home/oracle/database/network/admin/listener.ora
Log messages written to /home/oracle/database/network/log/listener.log
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=p670a)(PORT=1521)))

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for IBM/AIX RISC System/6000: Version 10.2.0.3.0 - Production
Start Date                08-DEC-2010 12:25:19
Uptime                    0 days 0 hr. 0 min. 0 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      ON
Listener Parameter File   /home/oracle/database/network/admin/listener.ora
Listener Log File         /home/oracle/database/network/log/listener.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=p670a)(PORT=1521)))
The listener supports no services
The command completed successfully

发现问题依旧
$ crs_stat -t
Name           Type           Target    State     Host      
------------------------------------------------------------
ora....0A.lsnr application    ONLINE    OFFLINE         
ora.p670a.gsd  application    ONLINE    ONLINE    p670a         
ora.p670a.ons  application    ONLINE    ONLINE    p670a      
ora.p670a.vip  application    ONLINE    ONLINE    p670a      
ora....0B.lsnr application    ONLINE    ONLINE    p670b      
ora.p670b.gsd  application    ONLINE    ONLINE    p670b      
ora.p670b.ons  application    ONLINE    ONLINE    p670b      
ora.p670b.vip  application    ONLINE    ONLINE    p670b      
ora.zhjport.db application    ONLINE    ONLINE    p670b      
ora....port.cs application    ONLINE    ONLINE    p670b      
ora....rt1.srv application    ONLINE    ONLINE    p670a      
ora....rt2.srv application    ONLINE    ONLINE    p670b      
ora....t1.inst application    ONLINE    ONLINE    p670a      
ora....t2.inst application    ONLINE    ONLINE    p670b

二、确认/home/oracle/database/network/admin/listener.ora文件
1、对比67、56两个节点的lsnrctl status
---------------------------------------67------------------------------------------
$ lsnrctl status

LSNRCTL for IBM/AIX RISC System/6000: Version 10.2.0.3.0 - Production on 08-DEC-2010 12:27:39

Copyright (c) 1991, 2006, Oracle.  All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for IBM/AIX RISC System/6000: Version 10.2.0.3.0 - Production
Start Date                08-DEC-2010 12:25:19
Uptime                    0 days 0 hr. 2 min. 19 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      ON
Listener Parameter File   /home/oracle/database/network/admin/listener.ora
Listener Log File         /home/oracle/database/network/log/listener.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=p670a)(PORT=1521)))
Services Summary...
Service "zhjport" has 2 instance(s).
  Instance "zhjport1", status READY, has 2 handler(s) for this service...
  Instance "zhjport2", status READY, has 1 handler(s) for this service...
Service "zhjportXDB" has 2 instance(s).
  Instance "zhjport1", status READY, has 1 handler(s) for this service...
  Instance "zhjport2", status READY, has 1 handler(s) for this service...
Service "zhjport_XPT" has 2 instance(s).
  Instance "zhjport1", status READY, has 2 handler(s) for this service...
  Instance "zhjport2", status READY, has 1 handler(s) for this service...
The command completed successfully

---------------------------------------56------------------------------------------
$ lsnrctl status

LSNRCTL for IBM/AIX RISC System/6000: Version 10.2.0.3.0 - Production on 08-DEC-2010 13:18:51

Copyright (c) 1991, 2006, Oracle.  All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_P670B
Version                   TNSLSNR for IBM/AIX RISC System/6000: Version 10.2.0.3.0 - Production
Start Date                08-DEC-2010 09:08:25
Uptime                    0 days 4 hr. 10 min. 25 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      ON
Listener Parameter File   /home/oracle/database/network/admin/listener.ora
Listener Log File         /home/oracle/database/network/log/listener_p670b.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.58)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.56)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)))
Services Summary...
Service "zhjport" has 2 instance(s).
  Instance "zhjport1", status READY, has 1 handler(s) for this service...
  Instance "zhjport2", status READY, has 2 handler(s) for this service...
Service "zhjportXDB" has 2 instance(s).
  Instance "zhjport1", status READY, has 1 handler(s) for this service...
  Instance "zhjport2", status READY, has 1 handler(s) for this service...
Service "zhjport_XPT" has 2 instance(s).
  Instance "zhjport1", status READY, has 1 handler(s) for this service...
  Instance "zhjport2", status READY, has 2 handler(s) for this service...
The command completed successfully
$

2、检查发现Listening Endpoints Summary...存在差异
---------------------------------------67------------------------------------------
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=p670a)(PORT=1521)))
---------------------------------------56------------------------------------------
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.58)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.56)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)))

3、对比67、56两个节点的listener.ora
---------------------------------------67------------------------------------------
p670a:/home/oracle/database/network/admin#more listener.ora
�? listener.ora.p670a Network Configuration File: /home/oracle/database/network/admin/listener.ora.p670a
# Generated by Oracle configuration tools.

SID_LIST_LISTENER_P670A =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = /home/oracle/database)
      (PROGRAM = extproc)
    )
    (SID_DESC =
      (GLOBAL_DBNAME = zhjport1)
      (ORACLE_HOME = /home/oracle/database)
      (SID_NAME = zhjport1)
    )
  )

LISTENER_P670A =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = p670a_vip)(PORT = 1521))
      )
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.67)(PORT = 1521))
      )
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
      )
    )
  )

INBOUND_CONNECT_TIMEOUT_LISTENER_P670A =0
SUBSCRIBE_FOR_NODE_DOWN_EVENT_LISTENER_P670A=OFF

---------------------------------------56------------------------------------------
p670b:/home/oracle/database/network/admin#more listener.ora

# listener.ora.p670b Network Configuration File: /home/oracle/database/network/admin/listener.ora.p670b
# Generated by Oracle configuration tools.

SID_LIST_LISTENER_P67B =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = /home/oracle/database)
      (PROGRAM = extproc)
    )
    (SID_DESC =
      (GLOBAL_DBNAME = zhjport2)
      (ORACLE_HOME = /home/oracle/database)
      (SID_NAME = zhjport2)
    )
  )


LISTENER_P670B =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = p670b_vip)(PORT = 1521)(IP = FIRST))
      )
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.56)(PORT = 1521)(IP = FIRST))
      )
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
      )
    )
  )

INBOUND_CONNECT_TIMEOUT_LISTENER_P670B =0
SUBSCRIBE_FOR_NODE_DOWN_EVENT_LISTENER_P670A=OFF

3、对比67、56两个节点的listener.ora
---------------------------------------67------------------------------------------
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = p670a_vip)(PORT = 1521))
      )
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.67)(PORT = 1521))
      )
   SUBSCRIBE_FOR_NODE_DOWN_EVENT_LISTENER_P670A=OFF
---------------------------------------56------------------------------------------
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = p670b_vip)(PORT = 1521)(IP = FIRST))
      )
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.56)(PORT = 1521)(IP = FIRST))
      )
  SUBSCRIBE_FOR_NODE_DOWN_EVENT_LISTENER_P670A=OFF

4、修正后的67的listener.ora
SID_LIST_LISTENER_P67A =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = /home/oracle/database)
      (PROGRAM = extproc)
    )
    (SID_DESC =
      (GLOBAL_DBNAME = zhjport1)
      (ORACLE_HOME = /home/oracle/database)
      (SID_NAME = zhjport1)
    )
  )


LISTENER_p670a =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = p670a_vip)(PORT = 1521)(IP = FIRST))
      )
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.67)(PORT = 1521)(IP = FIRST))
      )
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
      )
    )
  )

INBOUND_CONNECT_TIMEOUT_LISTENER_p670a =0
SUBSCRIBE_FOR_NODE_DOWN_EVENT_LISTENER_P670B=OFF

5、重启CRS(root下)
p670a:/#init.crs stop
Shutting down Oracle Cluster Ready Services (CRS):
Stopping resources. This could take several minutes.
Successfully stopped CRS resources.
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.
Shutdown has begun. The daemons should exit soon.

p670a:/#init.crs start
Startup will be queued to init within 30 seconds.

6、观察CRS状态正常
$ crs_stat -t
Name           Type           Target    State     Host      
------------------------------------------------------------
ora....0A.lsnr application    ONLINE    ONLINE    p670a      
ora.p670a.gsd  application    ONLINE    ONLINE    p670a      
ora.p670a.ons  application    ONLINE    ONLINE    p670a      
ora.p670a.vip  application    ONLINE    ONLINE    p670a      
ora....0B.lsnr application    ONLINE    ONLINE    p670b      
ora.p670b.gsd  application    ONLINE    ONLINE    p670b      
ora.p670b.ons  application    ONLINE    ONLINE    p670b      
ora.p670b.vip  application    ONLINE    ONLINE    p670b      
ora.zhjport.db application    ONLINE    ONLINE    p670b      
ora....port.cs application    ONLINE    ONLINE    p670b      
ora....rt1.srv application    ONLINE    ONLINE    p670a      
ora....rt2.srv application    ONLINE    ONLINE    p670b      
ora....t1.inst application    ONLINE    ONLINE    p670a      
ora....t2.inst application    ONLINE    ONLINE    p670b

7、客户端SQLPLUS连接正常


问题根源分析:RAC中监听配置需要IP=FIRST
RAC自动配置的监听器(listener)一般都可以看到使用了IP=FIRST选项,以使得监听在所给出主机名的相关ip端点。默认情况下不使用IP=FIRST选项,监听器总是在所有该主机的网络接口上监听(listen)。从8i开始,监听器在绑定Ip地址的方式已经改变。一般的规则是“在指定主机上监听所有端口(listen on all interfaces if a hostname is specified)。然而这仅在大多数情况下准确,在某些特定情况监听器可能并非如此表现。你可能需要强制监听器仅绑定到特定的IP地址(即便在指定主机名的情况下),通过在监听配置文件LISTENER.ORA中配置IP=FIRST语句。

由于监听绑定端点对于数据库实例注册到监听尤为重要,所以我们有必要了解监听器具体如何决定绑定到TCP端点(endpoint)的算法。

常规算法
你可以尝试使用以下模式解释监听器如何绑定到TCP协议地址的表现:
(1) 如果你提供一个IP 地址,则监听器始终在该IP地址上监听。
(2) 若你提供一个主机名:
(a) Oracle 对 gethostbyname()函数 输入该主机名,可能返回多个IP地址。
gethostname 库函数会查询DNS服务器,/etc/hosts配置文件,和NIS服务以及其他方式,基于系统实际配置。具体如何工作依赖于操作系统类型,但一般来说/etc/host.conf,/etc/nsswitch.conf 和/etc/resolv.conf配置文件在UNIX平台上指导过程。’
你可以同过ping工具找出相关的IP(如例:ping 活着 ping -s 即可以看到主机名相关的IP)。 不要使用host,nslookup 或dig 工具,以上工具仅查询DNS服务可能返回错误结果。
(b) Oracle 将使用 gethostname函数得出当地系统配置的主机名。
gethostname() 库函数将返回本地系统的标准主机名。
你也可以通过hostname命令得到该值。
(c) Oracle 对 gethostbyname() 函数输入之前一步2b中得到的主机名。
(d)  Oracle 比较 2a 中与 2c中返回的IP, 若没有匹配的IP项,则监听器绑定到2a中返回的ip地址上。
或者
(e) 若有匹配项存在,则监听器绑定至所有工作的网络接口上。
你可以通过lsnrctl命令的输出非常容易地判断监听器是如何绑定到网络接口的,如下例:
$ lsnrctl status
LSNRCTL for Linux: Version 10.2.0.1.0 – Production on 07-MAY-2007 15:29:48
Copyright
(c) 1991, 2005, Oracle.  All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC1)))
STATUS
of the LISTENER
————————
Alias                     LISTENER
Version                   TNSLSNR for Linux:
Version 10.2.0.1.0 – Production
Start Date                13-APR-2007 14:11:16
Uptime                    24 days 1 hr. 18
min. 32 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP
OFF
Listener Parameter File   /opt/oracle/product/10.2.0/db_1/network/admin/listener.ora
Listener Log File         /opt/oracle/product/10.2.0/db_1/network/log/listener.log
Listening
Endpoints Summary…
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.10)(PORT=1527)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=myhost.ro.oracle.com)(PORT=1521)))
在以上情况中,监听器绑定值指定IP10.10.10.10的1527端口上同时也在所有接口的1521端口上。
由于RAC情况中对于监听器监听范围的要求,故在给出主机名的情况下配置工具netca等,总是会在配置文件中加上IP=FIRST语句保证监听器绑定到2a中返回的IP地址而非所有网络接口。
参与0

0同行回答

“答”则兼济天下,请您为题主分忧!

提问者

yinxin
系统管理员银信长远
擅长领域: 服务器存储虚拟化

相关问题

相关资料

相关文章

问题状态

  • 发布时间:2010-12-14
  • 关注会员:0 人
  • 问题浏览:2605
  • X社区推广