生产环境锁等待问题定位。信贷管理两个月发生了两次锁等待,时间较长。
DB2 version:9.7.5
相关表查询分四个频度:
日,查询一天DB2ADMIN.ftp_res_acct_REAL 的数据。
月,是查询DB2ADMIN.ftp_res_acct_REAL_his 月末的数据。
季;是查询DB2ADMIN.ftp_res_acct_REAL_his 三个月的数据。
年;是查询DB2ADMIN.ftp_res_acct_REAL_his 一年的数据。数据是按照分区查询的,每天100w。
锁等待业务逻辑如下:
有3个SQL(具体如附所示),第一个SQL每天都会执行(6个关联表),第二个SQL月末执行完后(add partition),执行第三个SQL导入数据(insert into tablename select from)。当执行执行第3个SQL时,等待第二个SQL,第二个SQL又等待第一个SQL(锁等待一般发生在月末)。
(1) DB2ADMIN.ftp_res_acct_REAL表为月表,每天会增加表分区;生产中有700多个分区,由于历史原因,没有detach以前的表分区。怀疑分区较多导致创建分区时间较长,测试机测试(开始时数据量较少),执行第二个SQL语句(增加表分区),统计时间如下:
* Timestamp: Fri Dec 13 2013 11:24:10 CST ---------------------------------------------
* SQL Statement Number 1:
ALTER TABLE ftp_res_acct_real ADD PARTITION 20131202 starting from '20131202' INCLUSIVE ending at '20131202' INCLUSIVE;
* Elapsed Time is: 0.033721 seconds
* Summary Table:
Type Number Repetitions Total Time (s) Min Time (s) Max Time (s) Arithmetic Mean Geometric Mean Row(s) Fetched Row(s) Output --------- ----------- ----------- -------------- -------------- -------------- --------------- -------------- -------------- ------------- Statement 1 1 0.033721 0.033721 0.033721 0.033721 0.033721 0 0
* Total Entries: 1 * Total Time: 0.033721 seconds * Minimum Time: 0.033721 seconds * Maximum Time: 0.033721 seconds * Arithmetic Mean Time: 0.033721 seconds * Geometric Mean Time: 0.033721 seconds |
Total Time: 0.033721 seconds,时间很短排除因表分区过多导致创建表分区时时间过长。
(2)测试机插入6天数据(磁盘空间不足)。重新模拟生产逻辑过程,依次执行3个SQL ,在第一个SQL和第2个SQL存在Internal V的锁等待。
[db2admin@TSTZCFZEQ01 test]$ db2pd -d ncdb -wlock
Database Partition 0 -- Database NCDB -- Active -- Up 0 days 02:47:02 -- Date 12/13/2013 13:44:09
Locks being waited on : AppHandl [nod-index] TranHdl Lockname Type Mode Conv Sts CoorEDU AppName AuthID AppID 1508 [000-01508] 13 03000000020000000100006B56 Internal V ..S G 174 db2bp DB2ADMIN *LOCAL.db2inst1.131213054040 1522 [000-01522] 16 03000000020000000100006B56 Internal V ..X W 194 db2batch DB2ADMIN *LOCAL.db2inst1.131213054356 |
因第1个sql语句执行时间过长导致,第2个sql处于锁等待,第3个sql又等第2个sql。所以其中精力关注第1个sql的执行效率。
(2.1)第一个SQL语句Runstats前
表统计信息如下:
[db2admin@TSTZCFZEQ01 test]$ db2 "select CARD,FPAGES,STATS_TIME from syscat.tables where tabname='FTP_RES_ACCT_REAL'" CARD FPAGES STATS_TIME -------------------- -------------------- -------------------------- 31952 3104 2013-12-13-11.11.23.246495
|
第二个SQL语句等待第一个SQL的时间
* SQL Statement Number 1:
ALTER TABLE ftp_res_acct_real ADD PARTITION 20131202 starting from '20131202' INCLUSIVE ending at '20131202' INCLUSIVE;
* Elapsed Time is: 53.639665 seconds
* Summary Table:
Type Number Repetitions Total Time (s) Min Time (s) Max Time (s) Arithmetic Mean Geometric Mean Row(s) Fetched Row(s) Output --------- ----------- ----------- -------------- -------------- -------------- --------------- -------------- -------------- ------------- Statement 1 1 53.639665 53.639665 53.639665 53.639665 53.639665 0 0
* Total Entries: 1 * Total Time: 53.639665 seconds * Minimum Time: 53.639665 seconds * Maximum Time: 53.639665 seconds * Arithmetic Mean Time: 53.639665 seconds * Geometric Mean Time: 53.639665 seconds |
(2.2)Runstats后再测试一次。
db2 "select CARD,FPAGES,STATS_TIME from syscat.tables where tabname='FTP_RES_ACCT_REAL'"
CARD FPAGES STATS_TIME -------------------- -------------------- -------------------------- 8066044 426534 2013-12-13-14.56.37.159249
1 record(s) selected. |
第二个SQL语句等待第一个SQL的时间
db2 "select CARD,FPAGES,STATS_TIME from syscat.tables where tabname='FTP_RES_ACCT_REAL'" CARD FPAGES STATS_TIME -------------------- -------------------- -------------------------- 8066044 426534 2013-12-13-14.56.37.159249 1 record(s) selected. * Timestamp: Fri Dec 13 2013 14:59:11 CST --------------------------------------------- * SQL Statement Number 1:
ALTER TABLE ftp_res_acct_real ADD PARTITION 20131202 starting from '20131202' INCLUSIVE ending at '20131202' INCLUSIVE;
* Elapsed Time is: 48.092492 seconds
* Summary Table:
Type Number Repetitions Total Time (s) Min Time (s) Max Time (s) Arithmetic Mean Geometric Mean Row(s) Fetched Row(s) Output --------- ----------- ----------- -------------- -------------- -------------- --------------- -------------- -------------- ------------- Statement 1 1 48.092492 48.092492 48.092492 48.092492 48.092492 0 0
* Total Entries: 1 * Total Time: 48.092492 seconds * Minimum Time: 48.092492 seconds * Maximum Time: 48.092492 seconds * Arithmetic Mean Time: 48.092492 seconds * Geometric Mean Time: 48.092492 seconds |
通过上面实验,说明在没有索引情况下统计信息准确度对执行时间影响不是很大。
第一个sql导出访问计划:
Optimizer Plan:
Rows Operator (ID) Cost
0.1 n/a RETURN ( 1) 274256 | 0.1 n/a FILTER ( 2) 274256 | 1 n/a TBSCAN ( 3) 274256 | 1 n/a SORT ( 4) 274256 | 9.01708 n/a MSJOIN ( 5) 274256 /------/ ------------ 0.225427 * n/a | TBSCAN 1000 ( 6) n/a 273755 TBSCAN | (39) 0.225427 500.839 n/a | SORT 1000 ( 7) n/a 273755 SORT | (40) 0.225427 500.839 n/a | MSJOIN 1000 ( 8) n/a 273755 TBSCAN /-----------/ ----------------- (41) 0.225427 * 500.477 n/a | | TBSCAN 1000 1000 ( 9) n/a n/a 272979 TBSCAN Table: | (35) DB2ADMIN 0.225427 775.873 RM_COA n/a | SORT 1000 (10) n/a 272979 SORT | (36) 0.225427 775.873 n/a | NLJOIN 1000 (11) n/a 272979 TBSCAN /-------------------------/ --------------------- (37) 8318.03 2.7101e-05 775.511 n/a n/a | MSJOIN TBSCAN 1000 (12) (23) n/a 270495 2474.99 Table: /-------/ ------------- | DB2ADMIN 8318.03 * 0.0016 RM_CUSTOMER n/a | n/a TBSCAN 1000 TEMP (13) n/a (24) 270009 TBSCAN 2474.98 | (17) | 8318.03 409.083 0.0016 n/a | n/a SORT 1000 NLJOIN (14) n/a (25) 269873 SORT 2474.97 | (18) /----/ -- 8318.03 409.024 0.04 0.04 n/a | n/a n/a TBSCAN 1000 FETCH FILTER (15) n/a (26) (28) 268885 HSJOIN 50.017 2424.95 | (19) / | 8.06604e+06 408.501 1 1000 40 n/a /---/ -- n/a n/a n/a Table: 1000 1000 IXSCAN Table: HSJOIN DB2ADMIN n/a n/a (27) DB2ADMIN (29) FTP_RES_ACCT_REAL FETCH TBSCAN 25.0134 BD_CORP 2424.95 (20) (22) | / 257.873 150.434 0 40 40 / | Index: n/a n/a 1000 1000 1000 DB2ADMIN HSJOIN TBSCAN n/a n/a n/a I_BD_CORP_1 (30) (33) IXSCAN Table: Table: 1616.62 808.317 (21) DB2ADMIN DB2ADMIN / | 75.4251 BD_CURRTYPE RM_CURRENCY_MAPPING 1000 40 1000 | n/a n/a n/a 0 TBSCAN TBSCAN Table: Index: (31) (32) DB2ADMIN DB2ADMIN 808.212 808.317 BD_DEPTDOC I_BD_CURRTYPE_1 | | 1000 1000 n/a n/a Table: Table: DB2ADMIN DB2ADMIN BD_DEPTDOC BD_DEPTDOC |
cost 值为274256,并且有多次排序,通过db2advis工具推荐多个索引,其中创建一条索引,效率可提高90%以上:
CREATE INDEX "DB2ADMIN"."FTP_RES_ACCT_REAL_index" ON "DB2ADMIN"."FTP_RES_ACCT_REAL" ("DATA_DATE" ASC, "COA_CODE" ASC, "REAL_INCOME" ASC, "ITEM_ID" ASC, "CUST_CODE" ASC, "ISSUE_DT" ASC, "FINAL_INCOME" ASC, "FINAL_RATE" ASC, "CUST_MANAGER_CODE" ASC, "ACCT_NO" ASC, "PK_DEPTDOC" ASC, "CUR_RATE" ASC, "CUR_BAL" ASC, "CURRENCY" ASC, "ASST_LIAB" ASC, "ACCT_TERM" ASC) NOT PARTITIONED ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS |
执行runstats后导出执行计划
Operator (ID) Cost
2.39273 n/a RETURN ( 1) 2160.2 | 2.39273 n/a FILTER ( 2) 2160.2 | 23.9273 n/a TBSCAN ( 3) 2160.19 | 23.9273 n/a SORT ( 4) 2160.19 | 23.9273 n/a HSJOIN ( 5) 2160.15 /---------/ -------- 16.0988 975 n/a n/a HSJOIN TBSCAN ( 6) (31) 1934.64 225.433 /--------------/ -------------- | 16.0988 0 975 n/a n/a n/a NLJOIN TBSCAN Table: ( 7) (30) DB2ADMIN 1909.62 25.0124 RM_COA /--------------------------/ ----------------------- | 8334.51 0.00193158 0 n/a n/a n/a MSJOIN TBSCAN Table: ( 8) (19) DB2ADMIN 1853.78 25.1191 RM_CUSTOMER /---------/ -------------- | 8334.51 * 0.133333 n/a | n/a TBSCAN 180 TEMP ( 9) n/a (20) 1569.12 TBSCAN 25.1045 | (13) | 8334.51 283.134 0.133333 n/a | n/a SORT 180 NLJOIN (10) n/a (21) 1568.62 SORT 25.0969 | (14) /-----/ --- 8334.51 283.123 0.04 3.33333 n/a | n/a n/a IXSCAN 180 FETCH FILTER (11) n/a (22) (24) 1561.37 HSJOIN 25.0133 0.0828313 | (15) / | 5.61768e+06 283.062 1 3 10 Index: /---/ -- n/a n/a n/a DB2ADMIN 1000 180 IXSCAN Table: HSJOIN FTP_RES_ACCT_REAL_INDEX n/a n/a (23) DB2ADMIN (25) FETCH TBSCAN 0.00965094 BD_CORP 0.0817803 (16) (18) | /-/ - 257.873 25.0856 3 10 10 / | Index: n/a n/a 1000 1000 180 DB2ADMIN HSJOIN IXSCAN n/a n/a n/a I_BD_CORP_1 (26) (29) IXSCAN Table: Table: 0.0616182 0.0167536 (17) DB2ADMIN DB2ADMIN / | 75.4251 BD_CURRTYPE RM_CURRENCY_MAPPING 70 10 70 | n/a n/a Index: 0 IXSCAN IXSCAN DB2ADMIN Index: (27) (28) IDX1312130715550 DB2ADMIN 0.0378886 0.0167536 I_BD_CURRTYPE_1 | | 70 70 Index: Index: DB2ADMIN DB2ADMIN IDX1312130715520 IDX1312130715550 |
建立所有后cost为 2160.2,减少两个数量级。执行效率有很大的提高。
执行时间如下:
* SQL Statement Number 1:
ALTER TABLE ftp_res_acct_real ADD PARTITION 20131202 starting from '20131202' INCLUSIVE ending at '20131202' INCLUSIVE;
* Elapsed Time is: 0.440311 seconds
* Summary Table:
Type Number Repetitions Total Time (s) Min Time (s) Max Time (s) Arithmetic Mean Geometric Mean Row(s) Fetched Row(s) Output --------- ----------- ----------- -------------- -------------- -------------- --------------- -------------- -------------- ------------- Statement 1 1 0.440311 0.440311 0.440311 0.440311 0.440311 0 0
* Total Entries: 1 * Total Time: 0.440311 seconds * Minimum Time: 0.440311 seconds * Maximum Time: 0.440311 seconds * Arithmetic Mean Time: 0.440311 seconds * Geometric Mean Time: 0.440311 seconds |
总结:
(1)通过创建索引提高第一个SQL语句的查询效率,减少了对后续SQL的影响,降低锁等待的概率。尽可能避免运行时间长的事务,持有锁的时间越长,与其他应用程序发生锁争抢的可能性就越大。
(2)detach 掉多余的表分区(本案例中月表正常为30个表分区,实际为700个左右),过多的表分区会影响统计信息的准确性。
(3)可以通过create new_table,insert into new_table,attach new_table from ftp_res_acct_REAL 的方法 替换先add partition再 insert into ftp_res_acct_REAL,减少对ftp_res_acct_REAL的影响,回头会将此测试结果不上。
如果觉得我的文章对您有用,请点赞。您的支持将鼓励我继续创作!
赞6
添加新评论5 条评论
2014-02-23 20:38
2014-02-13 18:40
2014-02-13 15:04
2014-01-18 11:23
2014-01-09 08:58