这一次,俺不打算再用案例了,而是描述一下如何通过我们可以使用的db2dart工具来dump磁盘的裸数据,进行data recovery。
通过db2dart,每一个用户都可以比较容易地了解到数据页,索引页,EMP页的格式(当然,对于LOB页的格式没有很好的方法去了解)。这样的话,当某些情况下数据页被损坏无法查询,db2dart /DDEL也只能dump那些好的页,用户则可以通过手工的方法查找出损坏页里面的部分数据。
在这里,俺假设读者对DB2已经有了基本的了解,对tablespace/container的概念,结构有所了解,能够熟练使用db2dart工具,对操作系统的文件系统,磁盘结构有所了解。
首先,让我们来看一看db2dart的几个特别的用法。
相信很多人都用过/DB /TS /T /DDEL /LHWM等参数。但是这里我们强调的是另外几个,包括/DD /DP /DI /DEMP。
篇幅所限,我们这里只讨论/DD。其余的同理……
/DD是Dump Data,也就是把一个数据页给dump出来。
Dump的时候有两种格式化的方法,一种是 /v y,也就是翻译成人们可以读懂的冬冬。另一种是 /v h,也就是直接把16进制数据给弄出来。通过细心地比较两者,用户可以大概猜出如何把16进制的裸数据与人们可以读懂的数据对应起来。
给大家一个例子:
(taoewang@bugdbug) /home/taoewang $ db2 "describe table tab2"
Column Type Type
name schema name Length Scale Nulls
------------------------------ --------- ------------------ -------- ----- ------
I SYSIBM INTEGER 4 0 Yes
J SYSIBM INTEGER 4 0 Yes
C SYSIBM CHARACTER 50 0 Yes
3 record(s) selected.
(taoewang@bugdbug) /home/taoewang $ db2 "select count(*) from tab2"
1
-----------
100000
1 record(s) selected.
(taoewang@bugdbug) /home/taoewang $ db2 "select TABLEID,TBSPACEID from syscat.tables where TABNAME='TAB2'"
TABLEID TBSPACEID
------- ---------
5 2
1 record(s) selected.
(taoewang@bugdbug) /home/taoewang/test $ db2 "select * from TAB2 fetch first 1 rows only"
I J C
----------- ----------- --------------------------------------------------
1 1 this is 1
1 record(s) selected.
我们有一个表,叫做tab2。其中有3个列,I,J,C,分别是int int char(50),都是可以被设置为NULL的。表里面存有100000行数据。表的tid是5,pid是2。
第一行的数据是 (1,1,"this is 1")
大家知道,表的头和一些控制信息是存储在page 0里面的,这样的话,我们可以用这个命令得到page 0的ascii的dump
db2dart ex5 /DD /TSI 2 /OI 5 /v y /PS 0 /NP 1 /RPTF tsi2oi5vyps0np1.out
然后用下面的命令得到raw dump
db2dart ex5 /DD /TSI 2 /OI 5 /v h /PS 0 /NP 1 /RPTF tsi2oi5vhps0np1.out
首先我们来看ascii dump,也就是/v y的输出:
______________________________________________________________________________
_______ DART _______
D a t a b a s e A n a l y s i s a n d R e p o r t i n g T o o l
IBM DB2 6000
______________________________________________________________________________
DART (V9.1) Report:
2009-11-13-14.02.19.721355
Database Name: EX5
Report name: tsi2oi5vyps0np1.out
Database Subdirectory: /home/taoewang/ex5/taoewang/NODE0000/SQL00001
Operational Mode: Database Inspection Only (INSPECT)
______________________________________________________________________________
------------------------------------------------------------------------------
Action option: DD
Table-object-ID: 5; Tablespace-ID: 2; First-page: 0; Number-pages: 1; Verbose: y
FYI: An active connection to the database has been detected.
False errors may be reported.
Deactivate all connections and re-run to verify.
Warning: The database state is not consistent.
Warning: Reorg rows MAY be due to the inconsistent state of the database.
Connecting to Buffer Pool Services...
Table object report phase start.
Dump format is verbose.
______________________________________
Page 0 of object 5 from table space 2.
BPS Page Header:
Page Data Offset = 48
Page Data Length = 4048
Page LSN = 0000 0330 33A5
Object Page Number = 0
Pool Page Number = 256
Object ID = 5
Object Type = Data Object
Data Page Header:
Slot Count = 44
Total Free Space = 36
Total Reserve Space = 0
Youngest Reserve Space = n/a
Youngest TID = n/a
Free Space Offset = 123
Maximum Record Size = 69
Data Records:
Slot 0:
Offset Location = 3996 (xF9C)
Record Length = 32 (x20)
Record Type = Data Object Header Control Record
Page count = 1787
Object Creation LSN = 0000 0106 AE3E
Object State = x0000
UDI Since Runstats = 99958
DFH flag bits = x00000000
Slot 1:
Offset Location = 2992 (xBB0)
Record Length = 1004 (x3EC)
Record Type = Free Space Control Record
Free space entries:
0: 36 (x0024), 52 (x0034), 52 (x0034), 52 (x0034)
4: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
8: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
12: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
16: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
20: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
24: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
28: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
32: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
36: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
40: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
44: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
48: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
….
476: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
480: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
484: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
488: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
492: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
496: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
Slot 2:
Offset Location = 2916 (xB64)
Record Length = 76 (x4C)
Record Type = Table Directory Record
TDIR version = 2
MetaIndex Root Page = 321
Index Type = 2
Max Insert Search = 0
Dictionary RID -- Page 0 Slot 0
Mapping Directory RID -- Page 0 Slot 0
Current table row format version = 0
Flags = x0000000106000200
bit representation = 00000000 00000000 00000000 00000001
00000110 00000000 00000010 00000000
Check pending info:
Constraint status = x00
Constraint RID page = 0
last BID page = x00000000
Slot 3:
Offset Location = 2884 (xB44)
Record Length = 32 (x20)
Record Type = Table Description Record
Number of Columns = 3
Column 1:
Type is Long Integer
Length = 4
Allows NULLs
Prohibits Default
Fixed offset: 0
Column 2:
Type is Long Integer
Length = 4
Allows NULLs
Prohibits Default
Fixed offset: 5
Column 3:
Type is Fixed Length Character String
Length = 50
Allows NULLs
Prohibits Default
Fixed offset: 10
Slot 4:
Offset Location = 2815 (xAFF)
Record Length = 69 (x45)
Record Type = Table Data Record (FIXEDVAR)
Record Flags = 0
Fixed part length value = 61
Column 1:
Fixed offset: 0
Type is Long Integer
Value = 1
Column 2:
Fixed offset: 5
Type is Long Integer
Value = 1
Column 3:
Fixed offset: 10
Type is Fixed Length Character String
74686973 20697320 31202020 20202020 this is 1
20202020 20202020 20202020 20202020
20202020 20202020 20202020 20202020
2020
Slot 5:
Offset Location = 2746 (xABA)
Record Length = 69 (x45)
Record Type = Table Data Record (FIXEDVAR)
Record Flags = 0
Fixed part length value = 61
Column 1:
Fixed offset: 0
Type is Long Integer
Value = 2
Column 2:
Fixed offset: 5
Type is Long Integer
Value = 2
Column 3:
Fixed offset: 10
Type is Fixed Length Character String
74686973 20697320 32202020 20202020 this is 2
20202020 20202020 20202020 20202020
20202020 20202020 20202020 20202020
2020
Slot 6:
Offset Location = 2677 (xA75)
Record Length = 69 (x45)
Record Type = Table Data Record (FIXEDVAR)
Record Flags = 0
Fixed part length value = 61
Column 1:
Fixed offset: 0
Type is Long Integer
Value = 3
Column 2:
Fixed offset: 5
Type is Long Integer
Value = 3
Column 3:
Fixed offset: 10
Type is Fixed Length Character String
74686973 20697320 33202020 20202020 this is 3
20202020 20202020 20202020 20202020
20202020 20202020 20202020 20202020
2020
….
上面的信息描述了TAB2表的第一个page。
Table-object-ID: 5; Tablespace-ID: 2; First-page: 0; Number-pages: 1; Verbose: y
然后下面有一条
FYI: An active connection to the database has been detected.
False errors may be reported.
Deactivate all connections and re-run to verify.
这个是说明,当我们在做db2dart的时候,数据库还存在连接。
一般来说,db2dart需要数据库offline,因为只有在offline的时候,我们才能够保证磁盘上的数据是真实的。否则有可能一个数据页在内存中被修改了但是还没有flush到磁盘上。不过在这个测试里面,我们不会对数据进行任何操作,因此磁盘上的数据不会被改变。
然后下面的部分就是BPS Page Header。这一部分描述了页的头,具体的数据结构我不想多说,但是用户可以从db2dart的输出里面看到很多咚咚了。
譬如在这个例子里面,我们有
BPS Page Header:
Page Data Offset = 48
Page Data Length = 4048
Page LSN = 0000 0330 33A5
Object Page Number = 0
Pool Page Number = 256
Object ID = 5
Object Type = Data Object
Data Page Header:
Slot Count = 44
Total Free Space = 36
Total Reserve Space = 0
Youngest Reserve Space = n/a
Youngest TID = n/a
Free Space Offset = 123
Maximum Record Size = 69
别忘了对比/v h的输出
Page 0 of object 5 from table space 2.
0000 *00300FD0 00000048 04000000 033033A5* *.0.....H.....03.*
0010 *00000100 0005000A 93214E97 00000000* *..........N.....*
0020 *00000002 00000020 00000000 00000000* *................*
0030 *002C0024 00000000 0045007B 00000000* *.........E......*
0040 *00000000 0F9C0BB0 0B640B44 0AFF0ABA* *.........d.D....*
0050 *0A750A30 09EB09A6 0961091C 08D70892* *.u.0.....a......*
0060 *084D0808 07C3077E 073906F4 06AF066A* *.M.......9.....j*
0070 *062505E0 059B0556 051104CC 04870442* *.......V.......B*
0080 *03FD03B8 0373032E 02E902A4 025F021A* *.....s..........*
0090 *01D50190 014B0106 00C1007C 00000000* *.....K..........*
00A0 *00000000 00000000 00000000 00000000* *................*
00B0 *00000000 00000000 00000000 00000000* *................*
00C0 *10000045 0100003D 00000028 00000000* *...E............*
00D0 *28007468 69732069 73203430 20202020* *..this.is.40....*
00E0 *20202020 20202020 20202020 20202020* *................*
00F0 *20202020 20202020 20202020 20202020* *................*
0100 *20202020 00100000 45010000 3D000000* *........E.......*
0110 *27000000 00270074 68697320 69732033* *.......this.is.3*
0120 *39202020 20202020 20202020 20202020* *9...............*
0130 *20202020 20202020 20202020 20202020* *................*
0140 *20202020 20202020 20001000 00450100* *.............E..*
0150 *003D0000 00260000 00002600 74686973* *............this*
0160 *20697320 33382020 20202020 20202020* *.is.38..........*
第一个Page Data Offset = 48,我们翻译成16进制就是0x30,那么对应到raw dump,我们可以猜它就是最开始的两个字节。
依此类推,在这里不再多说。
下面我们要讨论的是slot。
大家可能听说过RID,也很困惑RID是什么咚咚(只知道是指向某一个record的,但是不知道具体是怎么指的)。
一般说来,RID在v8里面是4字节,在v9里面的large tablespace(默认)是6字节。
4字节的版本是3字节page id加上1字节的slot。6字节版本是4字节page id加上2字节的slot。(也就是说在4字节版本中,一个表在一个分区里面最多可以有2^24页。在6字节版本里面,则是2^32页。乘以每个页的大小,就是表在一个分区中最大的尺寸)。
而slot则是表明在一个page里面,一行记录的起始offset。
大家可能要问,怎么能用1个字节来标明32K页中的offset呢?
在每一个数据页中,在page header之下会有一段空间来存储每一行所在的起始位置。每一个offset相当于一个指针,指向这个page里面所对应行的位置。然后这个 offset 在这个slot list里面相对的位置就是其slot number。
譬如说
char* stringList[100];
stringList[0]=”hello world”;
stringList[1]=”mystring”;
stringList[2]=”mystring1”;
实际上,上面这种大家经常用到的数据结构就是类似slot的结构。其中0,1,2代表这个slot number,每一个pointer都指向内存中的一个offset,对应这一个字符串。
让我们回到刚才的数据:
Slot 0:
Offset Location = 3996 (xF9C)
Record Length = 32 (x20)
Record Type = Data Object Header Control Record
Page count = 1787
Object Creation LSN = 0000 0106 AE3E
Object State = x0000
UDI Since Runstats = 99958
DFH flag bits = x00000000
Slot 1:
Offset Location = 2992 (xBB0)
Record Length = 1004 (x3EC)
Record Type = Free Space Control Record
Free space entries:
0: 36 (x0024), 52 (x0034), 52 (x0034), 52 (x0034)
4: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
8: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
12: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
16: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
20: 52 (x0034), 52 (x0034), 52 (x0034), 52 (x0034)
我们可以看到 slot0的offset location是xF9C。如果我们在/v h的输出里面找这两个字节,我们看到
0040 *00000000 0F9C0BB0 0B640B44 0AFF0ABA* *.........d.D....*
0050 *0A750A30 09EB09A6 0961091C 08D70892* *.u.0.....a......*
0060 *084D0808 07C3077E 073906F4 06AF066A* *.M.......9.....j*
0070 *062505E0 059B0556 051104CC 04870442* *.......V.......B*
0080 *03FD03B8 0373032E 02E902A4 025F021A* *.....s..........*
0090 *01D50190 014B0106 00C1007C 00000000* *.....K..........*
00A0 *00000000 00000000 00000000 00000000* *................*
0F9C就是第一个record所对应的offset,而0BB0就是第二个record的offset……以此类推。
不过这些offset不是page里面真实的offset,而是相对于slot[0]所对应的offset。
在上面的例子里,0F9C对应这个0x44,那么真实的offset也就是F9C+44=FE0
0FE0 *01000020 000006FB 00000000 0106AE3E* *................*
0FF0 *00018676 00000000 00000000 00000000* *...v............*
依此类推,比如第一个data record:
Slot 4:
Offset Location = 2815 (xAFF)
Record Length = 69 (x45)
Record Type = Table Data Record (FIXEDVAR)
Record Flags = 0
Fixed part length value = 61
Column 1:
Fixed offset: 0
Type is Long Integer
Value = 1
Column 2:
Fixed offset: 5
Type is Long Integer
Value = 1
Column 3:
Fixed offset: 10
Type is Fixed Length Character String
74686973 20697320 31202020 20202020 this is 1
20202020 20202020 20202020 20202020
20202020 20202020 20202020 20202020
2020
其offset为xAFF,在slot里面是x4C,但是AFF依然要加x44,也就是slot[0]的位置,则真实offset就是B43,定长x45,也就是
0B40 *20200010 00004501 00003D00 00000100* *......E.........*
0B50 *00000001 00746869 73206973 20312020* *.....this.is.1..*
0B60 *20202020 20202020 20202020 20202020* *................*
0B70 *20202020 20202020 20202020 20202020* *................*
0B80 *20202020 20202000 00000020 00000003* *................*
0B90 *00010004 00010000 00010004 00010005* *................*
0BA0 *01000032 0001000A 1000004C 00000201* *...2.......L....*
0BB0 *00000141 00000001 06000200 00000000* *...A............*
0BC0 *00000000 00000000 00000000 00000000* *................*
0BD0 *00000000 00000000 00000000 00000000* *................*
其余同理……
那怎么把这么一大段数据翻译成一个一个列呢?
在一个表的 page 0 slot 3里面,存放的是Table Descriptor Record,也就是这个表里面列的信息
Slot 3:
Offset Location = 2884 (xB44)
Record Length = 32 (x20)
Record Type = Table Description Record
Number of Columns = 3
Column 1:
Type is Long Integer
Length = 4
Allows NULLs
Prohibits Default
Fixed offset: 0
Column 2:
Type is Long Integer
Length = 4
Allows NULLs
Prohibits Default
Fixed offset: 5
Column 3:
Type is Fixed Length Character String
Length = 50
Allows NULLs
Prohibits Default
Fixed offset: 10
我们可以看到,这个表有3个column,前两个是Long Integer,4字节长度,允许NULL,第三个是定长char(50),允许NULL。
好,对应真正的数据
0B40 *20200010 00004501 00003D00 00000100* *......E.........*
0B50 *00000001 00746869 73206973 20312020* *.....this.is.1..*
0B60 *20202020 20202020 20202020 20202020* *................*
0B70 *20202020 20202020 20202020 20202020* *................*
0B80 *20202020 20202000 00000020 00000003* *................*
我们看到了什么?我们通过对比SELECT出来的第一行,知道前两个column是1,第三个是this is 1
那么相对应的,蓝色部分代表第一个column,最后一个字节00代表不是NULL,而黄色部分就是第二个column。剩下的就是一串50字节的字符串,最后的00是代表非NULL的字节。 (如果一个column是不允许NULL的,那么就会在数据里面每一行节省下一字节,不用标明是否为NULL咯)。
那么蓝色之前的红色代表什么?
看看
Offset Location = 2815 (xAFF)
Record Length = 69 (x45)
Record Type = Table Data Record (FIXEDVAR)
Record Flags = 0
Fixed part length value = 61
对比
10 00004501 00003D
应该能够明白点什么吧 :)
不多说了,这里抛砖引玉,具体用法大家具体摸索
:) :p
收起