Author: 刘胜涛 · 2014-03-03 15:05
Database Administrator · Volkswagen Group China

DB2 + TSA: Applying Fix Packs Resolved the IBM.RecoveryRM Failure in TSA on the Standby Server

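TSA failed, and it strikes me as a rather fragile product. On the standby server the lssam command would not run normally and reported that IBM.RecoveryRM was not started. Tivoli L2 support in Malaysia tried a number of approaches without success and finally escalated to L3, whose advice was to rebuild the domain. After weighing the options, I decided to upgrade the DB2 and TSA fix packs instead. The steps I followed are listed below.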

1. Back up the dbm and db cfg information
For each instance, log in as the instance owner and issue the following command
$> db2 get dbm cfg > <instance name>_cfg.txt
For each database, issue the command
$> db2 get db cfg for <db name> > <instance name>_<db name>_cfg.txt
2. Back up the database package information
For each instance, log in as the instance owner and issue the following commands
$> db2 connect to <db name>
$> db2 "list packages for all show detail" > <instance name>_<db name>_pkg.txt
$> db2 terminate
3. Back up the database DDL
For each instance, log in as the instance owner and, for each database, issue the following command
$> db2look -d <db name> -e -l -x -o <instance name>_<db name>_ddl.sql
4. Stop the WAS application
5. Copy the fix pack to a temporary directory and uncompress it
#> tar -xzvf v9.7fp8_linuxx64_server.tar.gz
#> tar -xzvf v9.7fp8_linuxx64_nlpack.tar.gz
#> tar -xvf 3.2.2-TIV-SAMP-Linux-FP0007.tar
6. Check the DB2 instance and database information
#> db2level
#> db2ilist
#> db2 list database directory
7. Stop the DB2 instances, export the TSA sampolicy configuration, and stop TSA on server db2prd01
     For instance db2inst1, log in as root
     #> rgreq -o stop <DB2 resource group>
     #> rgreq -o stop <DB2 cluster node>
     #> stoprpdomain hadr_domain
     For instance db2inst2, log in as the instance owner
     #> db2 force applications all
     #> db2 terminate
     #> db2stop
8. Take snapshots to back up all database servers
9. Install the DB2 fix pack on server db2prd01 and server db2prd02
     Log in as root and change to the directory that contains the fix pack image
     #> ./installFixPack -b /opt/ibm/db2/V9.7
10. Install the TSA fix pack on server db2prd01 and server db2prd02
     Log in as root and change to the directory that contains the fix pack image
     #> ./installSAM -b /opt/ibm/db2/V9.7
11. Update the instances to use the new DB2 level on server db2prd01 and server db2prd02
     For each instance, issue the following command
     #> db2iupdt <instance name>
     Update the DB2 Administration Server (DAS)
     #> dasupdt
12. Update the system catalog objects in the databases to support the fix pack on server db2prd01
     For each instance, log in as the instance owner
     For each database, issue the command
     #> db2updv97 -d <db name>
13. Restart the DB2 instances, the DAS, and TSA
     For instance db2inst1, log in as root on server db2prd01
     #> startrpdomain hadr_domain
     #> rgreq -o start <DB2 cluster node>
     #> rgreq -o start <DB2 resource group>
     For instance db2inst2, log in as the instance owner
     #> db2start
     Log in as the DAS owner
     #> db2admin start
14. Check the database version on server db2prd01
     For each instance and each database
     #> db2 connect to <db name>
     #> db2 "select * from sysibm.sysversions"
15. Check the TSA version on server db2prd01 and server db2prd02
     #> samversion
     #> lsrpdomain
     #> lsrpnode
16. Check the DB2 log
17. Check the TSA status
     #> lssam 
18. Start WAS applications       
19. Check WAS applications 
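
The backup commands in steps 1 to 3 lend themselves to a small wrapper script. What follows is only a sketch under assumptions: the names backup_instance, run and DRY_RUN are illustrative and not part of DB2, the instance and database names are passed in by the caller, and the output file names follow the patterns used in the steps above. It is meant to be run as the instance owner, and by default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch of steps 1-3 for one instance. Run as the instance owner.
# backup_instance, run and DRY_RUN are illustrative names, not DB2 tools.
# DRY_RUN=1 (the default) only prints the commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$1"
    else
        # eval runs the command in the current shell, so the connection
        # opened by CONNECT is still available to the next db2 call
        eval "$1"
    fi
}

backup_instance() {
    inst="$1"
    shift
    # Step 1: database manager configuration, once per instance
    run "db2 get dbm cfg > ${inst}_cfg.txt"
    for db in "$@"; do
        # Step 1: database configuration
        run "db2 get db cfg for ${db} > ${inst}_${db}_cfg.txt"
        # Step 2: package list (needs a connection)
        run "db2 connect to ${db}"
        run "db2 \"list packages for all show detail\" > ${inst}_${db}_pkg.txt"
        run "db2 terminate"
        # Step 3: DDL
        run "db2look -d ${db} -e -l -x -o ${inst}_${db}_ddl.sql"
    done
}

# Usage: sh backup_cfg.sh <instance name> <db name> [<db name> ...]
if [ "$#" -ge 2 ]; then
    backup_instance "$@"
fi
```

Saved under a hypothetical name such as backup_cfg.sh, `sh backup_cfg.sh db2inst1 ITIM` prints six commands; `DRY_RUN=0 sh backup_cfg.sh db2inst1 ITIM` runs them.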
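After the fix packs were applied, I found that some packages had not been bound successfully and had to be rebound manually. My original plan was to use

db2rbind dbname -l logfile all

but it ran for a long time, and while it was running I could not create a table after connecting to the database. The output of db2pd -d XXX -wlock was as follows: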

Locks being waited on :
AppHandl [nod-index] TranHdl    Lockname                   Type       Mode Conv Sts CoorEDU    AppName  AuthID   AppID                          
366      [000-00366] 51         04000C00050000000000000052 Row        ..X       G   326        db2jcc_a DB2INST1 10.120.16.42.56246.140112080349 
367      [000-00367] 53         04000C00050000000000000052 Row        ..X       W   327        db2jcc_a DB2INST1 10.120.16.41.35929.140112080350
361      [000-00361] 60         5359534C564C3031DDECEF2841 Internal P ..S       G   323        db2jcc_a DB2INST1 10.120.16.41.35864.140112080340
363      [000-00363] 58         5359534C564C3031DDECEF2841 Internal P ..S       G   325        db2jcc_a DB2INST1 10.120.16.42.56179.140112080342
362      [000-00362] 59         5359534C564C3031DDECEF2841 Internal P ..S       G   324        db2jcc_a DB2INST1 10.120.16.42.56178.140112080341
7324     [000-07324] 48         5359534C564C3031DDECEF2841 Internal P ..S  ..X  C   20         db2rbind DB2INST1 *LOCAL.db2inst1.140113023130
7382     [000-07382] 52         5359534C564C3031DDECEF2841 Internal P ..S       W   232        DB2ATS   DB2INST1 *LOCAL.db2inst1.140113023651
372      [000-00372] 56         5359534C564C3031DDECEF2841 Internal P ..S       W   330        db2jcc_a DB2INST1 10.120.16.42.56275.140112080539
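
Output like this is easier to read when reduced to the lock name, the status (G granted, C converting, W waiting), the application handle and the application name. The filter below is a rough sketch, assuming the column layout shown above; summarize_wlock is a hypothetical helper name. Because the Type column can contain a space ("Internal P") and the Conv column can be empty, it takes the status and application name by counting fields from the right-hand end of each row.

```shell
#!/bin/sh
# Sketch: condense "db2pd -d <db name> -wlock" output read from stdin.
# summarize_wlock is an illustrative name, not a DB2 tool.
# Prints: <Lockname> <Sts> <AppHandl> <AppName>
summarize_wlock() {
    # Data rows start with a numeric AppHandl; header lines are skipped.
    # Sts is the 5th field from the end, AppName the 3rd from the end.
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 {
        printf "%s %s %s %s\n", $4, $(NF-4), $1, $(NF-2)
    }'
}

# Usage (as the instance owner):
#   db2pd -d <db name> -wlock | summarize_wlock | sort
```

Sorting the result groups the rows by lock name, which makes it plain that db2rbind (status C) sits in the same queue as the waiting db2jcc_a and DB2ATS applications.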

At that point db2diag.log showed:

2014-01-13-10.50.22.083449+480 E439412E582         LEVEL: Warning
PID     : 4965                 TID  : 46912753362688  PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000          DB   : ITIM
EDUID   : 60                   EDUNAME: db2dlock (ITIM) 0
FUNCTION: DB2 UDB, lock manager, sqlpldl, probe:1280
MESSAGE : ADM1838W  An application is waiting for a lock held by an indoubt
          transaction.  This will cause the application to wait indefinitely. 
          Use the LIST INDOUBT TRANSACTIONS command to investigate and resolve
          the indoubt transactions.

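Because this is a production environment, I interrupted the db2rbind process with Ctrl+C and forced off the corresponding applications. After that, db2pd -d XXX -wlock no longer showed any lock waits. I then used the following command to find the packages that had not been bound correctly: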

db2 -x "SELECT 'db2 rebind package '||TRIM(pkgschema)||'.'||TRIM(pkgname)||' resolve any' FROM syscat.packages WHERE VALID != 'Y'" > tmp.sh

Then simply run the generated script with sh; there is no need to rebind every package.
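
As a variation on the same idea, the script can be generated from a plain list of package names so that it carries its own connection. This is a sketch only, with assumptions: make_rebind_script is a hypothetical helper, the database name is supplied by the caller, and package names are assumed to contain no spaces or lowercase characters that would need quoting. The CONNECT and TERMINATE lines are added on the assumption that a script started with sh may not share the connection of the interactive session.

```shell
#!/bin/sh
# Sketch: turn "SCHEMA.PACKAGE" lines on stdin into a rebind script.
# make_rebind_script is an illustrative name, not a DB2 tool.
# $1 = database name to connect to
make_rebind_script() {
    awk -v db="$1" '
        BEGIN { print "db2 connect to " db }
        # NF skips blank lines; $1 drops the padding that db2 adds
        NF    { printf "db2 rebind package %s resolve any\n", $1 }
        END   { print "db2 terminate" }
    '
}

# Usage (as the instance owner, connected to the database):
#   db2 -x "SELECT TRIM(pkgschema)||'.'||TRIM(pkgname)
#           FROM syscat.packages WHERE valid <> 'Y'" \
#     | make_rebind_script <db name> > tmp.sh
#   sh tmp.sh
```

Reviewing tmp.sh before running it is a cheap safeguard on a production system, since it shows exactly which packages will be rebound.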
