OS:AIX 6.1
DB2:9.7.0
日志报类似错误LOCK TIMEOUT:
2012-03-07-11.45.44.882200+480 E40328185A916 LEVEL: Warning
PID : 3735778 TID : 44231 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : XXXXXXX
APPHDL : 0-17838 APPID: 10.0.1.125.36149.120305223001
AUTHID : DB2INST1
EDUID : 44231 EDUNAME: db2agent (XXXXXX) 0
FUNCTION: DB2 UDB, database monitor, sqmLockEvents::collectLockEvent, probe:267
MESSAGE : ADM5506I "Lock timeout" event has occurred on lock
"00020F00000000000022007D52" at timestamp
"2012-03-07-11.45.44.882004" with event ID "2445". The affected
application is named "db2jcc_application", and is associated with the
workload name "SYSDEFAULTUSERWORKLOAD" and application ID
"10.0.1.125.36149.120305223001" at member "0". The role that this
application plays with respect to this lock is: "Requestor".
2012-03-07-11.45.44.882798+480 E40329102A912 LEVEL: Warning
PID : 3735778 TID : 44231 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : XXXXXXX
APPHDL : 0-17838 APPID: 10.0.1.125.36149.120305223001
AUTHID : DB2INST1
EDUID : 44231 EDUNAME: db2agent (XXXXXXX) 0
FUNCTION: DB2 UDB, database monitor, sqmLockEvents::collectLockEvent, probe:267
MESSAGE : ADM5506I "Lock timeout" event has occurred on lock
"00020F00000000000022007D52" at timestamp
"2012-03-07-11.45.44.882004" with event ID "2445". The affected
application is named "db2jcc_application", and is associated with the
workload name "SYSDEFAULTUSERWORKLOAD" and application ID
"10.0.1.125.38181.120302210003" at member "0". The role that this
application plays with respect to this lock is: "Owner".
因为这个TIMEOUT是有规律的出现,所以比较容易通过DB2PD捕捉到相关的信息如下:
命令如下:
db2pd -d XXXXXX -locks showlocks wait -tra -app -dyn >locks.out
跟锁相关的信息如下:
Database Partition 0 -- Database XXXXXXX -- Active -- Up 16 days 07:49:53
Locks:
Address TranHdl Lockname Type Mode Sts Owner Dur HoldCount Att ReleaseFlg rrIID
0x07000000D294F480 14 00020F00000000000022007D52 Row ..U W 159 1 0 0x00000000 0x00000020 0
TbspaceID 2 TableID 3840 PartitionID 0 Page 34 Slot 125
0x07000000F2409C80 159 00020F00000000000022007D52 Row ..U G 159 1 0 0x00000000 0x00000020 0
TbspaceID 2 TableID 3840 PartitionID 0 Page 34 Slot 125
Database Partition 0 -- Database XXXXXXX -- Active -- Up 16 days 07:49:53
Transactions:
Address AppHandl [nod-index] TranHdl Locks State Tflag Tflag2 Firstlsn Lastlsn LogSpace SpaceReserved TID AxRegCnt GXID ClientUserID ClientWrkstnName ClientApplName ClientAccntng
0x07000000206FD280 27296 [000-27296] 14 4 READ 0x00000000 0x00000000 0x0000000000000000 0x0000000000000000 0 0 0x000025E826E9
0x07000000E9A4D180 55995 [000-55995] 159 4 READ 0x00000000 0x00000000 0x0000000000000000 0x0000000000000000 0 0 0x000025A4C064 1 0 n/a ec-app-srv5 n/a n/a
Database Partition 0 -- Database XXXXXXXX -- Active -- Up 16 days 07:49:53
Applications:
Address AppHandl [nod-index] NumAgents CoorEDUID Status C-AnchID C-StmtUID L-AnchID L-StmtUID Appid WorkloadID WorkloadOccID
0x0780000000ED0080 27296 [000-27296] 1 21065 Lock-wait 635 98 635 98 10.0.1.125.47681.120306230002 1 116317
0x07800000010B6600 55995 [000-55995] 1 5142 UOW-Waiting 0 0 961 146 10.0.1.125.38181.120302210003 1 90337
External Connection Attributes
Address AppHandl [nod-index] ClientIPAddress EncryptionLvl SystemAuthID
0x0780000000ED0080 27296 [000-27296] 10.0.1.125 None DB2INST1
0x07800000010B6600 55995 [000-55995] 10.0.1.125 None DB2INST1
Trusted Connection Attributes
Address AppHandl [nod-index] TrustedContext ConnTrustType RoleInherited
0x0780000000ED0080 27296 [000-27296] n/a non trusted n/a
0x07800000010B6600 55995 [000-55995] n/a non trusted n/a
Database Partition 0 -- Database XXXXXXX -- Active -- Up 16 days 07:49:53
Dynamic Cache:
Current Memory Used 65689615
Total Heap Size 65691566
Cache Overflow Flag 0
Number of References 964816319
Number of Statement Inserts 369556
Number of Statement Deletes 365186
Number of Variation Inserts 289460
Number of Statements 4370
Dynamic SQL Statements:
Address AnchID StmtUID NumEnv NumVar NumRef NumExe Text
0x0700000071C0F000 635 98 1 1 2946 2946 SELECT T1.MSGID,
T1.MESSAGEINDEX, T1.TRANSPORT_ID, T1.RETRIES, T1.STOREENT_ID, T1.EXPIRY, T1.OPTCOUNTER, T1.MESSAGE FROM MSGSTORE T1 WHERE
MSGID = ? AND RETRIES = ? FOR UPDATE WITH RS
0x070000006086FDE0 961 146 1 1 369664 369664 SELECT T1.LANGUAGE_ID,
T1.COUNTRYABBR, T1.NAME, T1.OPTCOUNTER, T1.CALLINGCODE FROM COUNTRY T1 WHERE (LANGUAGE_ID = ?) AND ((NAME = ?) OR
(COUNTRYABBR = ?))
Dynamic SQL Environments:
Address AnchID StmtUID EnvID Iso QOpt Blk
0x0700000061A99CA0 635 98 10 CS 5 B
0x07000000E52CF660 961 146 2 CS 5 B
Dynamic SQL Variations:
Address AnchID StmtUID EnvID VarID NumRef Typ Lockname Val Insert Time Sect Size
0x07000000614F5160 635 98 10 1 55 3 000000620000000A00014F6056 Y 2012-03-07-04.35.01.422227 8016
0x07000000E52DA2A0 961 146 2 1 36005 6 00000092000000020001782056 Y 2012-03-06-18.40.32.715567 10872
由上可以看到是AppHandl 55995持有资源,没有释放,AppHandl 27296没办得到资源的U锁而TIMEOUT。
问题如下:
今天观察了一天,一直都是55995持有U锁资源,但它一直都是UOW-Waiting状态。我原来认为,这个APP执行了一些操作,持有了MSGSTORE某些记录的U锁,但之后它一直没有COMMIT或ROLLBACK,所以一直持有这个U锁。后来我通过,db2 get snapshot for applications on trendyec >app.out,查看55995这个APP的相关信息如下 :
Application handle = 55995
Application status = UOW Waiting
Status change time = 03/03/2012 14:29:59.852275
Application code page = 1208
Application country/region code = 0
DUOW correlation token = 10.0.1.125.38181.120302210003
Application name = db2jcc_application
Application ID = 10.0.1.125.38181.120302210003
Sequence number = 01052
TP Monitor client user ID =
TP Monitor client workstation name = ec-app-srv5
TP Monitor client application name =
TP Monitor client accounting string =
Connection request start timestamp = 03/03/2012 05:00:00.067114
Connect request completion timestamp = 03/03/2012 05:00:00.076829
Application idle time = 4 days 1 hour 49 minutes 17 seconds
CONNECT Authorization ID = DB2INST1
……
Previous UOW completion timestamp = 03/03/2012 14:29:59.838759
Elapsed time of last completed uow (sec.ms)= 0.002753
UOW start timestamp = 03/03/2012 14:29:59.839205
……
Most recent operation = Close
Cursor name = SQL_CURSH201C60
Most recent operation start timestamp = 03/03/2012 14:29:59.852194
Most recent operation stop timestamp = 03/03/2012 14:29:59.852274
可以看到55995早在03/03/2012 14:29:59.852274执行过操作就已完成了。
但它为什么还会持有锁呢?
哪个大虾可以帮我解答一下啊。是不是我自己理解错了
收起