Scenario:
An Oracle database needs to be restored to a different host.
Error messages during the restore:
channel t2: starting datafile backup set restore
channel t2: specifying datafile(s) to restore from backup set
channel t2: restoring datafile 00269 to /oldb2/tdp13online/fareslob_12.dbf
channel t2: restoring datafile 00296 to /oldb1/tdp13online/resv_small_109.dbf
channel t2: restoring datafile 00357 to /oldb2/tdp13online/resv_index_108.dbf
channel t2: restoring datafile 00643 to /oldb1/tdp13online/resv_small_265.dbf
channel t2: restoring datafile 00908 to /oldb3/tdp13online/resv_index_328.dbf
channel t2: restoring datafile 01024 to /oldb4/tdp13online/resv_index_389.dbf
channel t2: restoring datafile 01411 to /oldb5/tdp13online/resv_small_732.dbf
channel t2: restoring datafile 01413 to /oldb5/tdp13online/resv_small_734.dbf
channel t2: restoring datafile 01416 to /oldb5/tdp13online/resv_small_737.dbf
channel t2: restoring datafile 01418 to /oldb5/tdp13online/resv_small_739.dbf
channel t2: reading from backup piece dbfI_925617532_60891_1
channel t1: ORA-19870: error while restoring backup piece dbfI_925620833_60893_1
ORA-19501: read error on file "dbfI_925620833_60893_1", block number 1 (block size=512)
ORA-27190: skgfrd: sbtread2 returned error
ORA-19511: Error received from media manager layer, error text:
ANS1314E (RC14) File data currently unavailable on server
channel t2: starting datafile backup set restore
channel t2: specifying datafile(s) to restore from backup set
channel t2: restoring datafile 00021 to /oldb2/portal/mssffpidx03.dbf
channel t2: restoring datafile 00037 to /oldb2/orasys/gta.dbf
channel t2: restoring datafile 00065 to /oldb1/tdp13online/resv_small_3.dbf
channel t2: restoring datafile 00084 to /oldb2/tdp13online/resv_small_6.dbf
channel t2: restoring datafile 00537 to /oldb2/tdp13online/fareslob_30.dbf
channel t2: restoring datafile 00692 to /oldb1/tdp13online/resv_small_289.dbf
channel t2: restoring datafile 01067 to /oldb4/tdp13online/resv_small_352.dbf
channel t2: restoring datafile 01528 to /oldb4/tdp13online/resv_index_555.dbf
channel t2: restoring datafile 01529 to /oldb4/tdp13online/resv_index_556.dbf
channel t2: reading from backup piece dbfI_924859629_116027_1
channel t1: ORA-19870: error while restoring backup piece dbfI_924858703_116026_1
ORA-19507: failed to retrieve sequential file, handle="dbfI_924858703_116026_1", parms=""
ORA-27029: skgfrtrv: sbtrestore returned error
ORA-19511: Error received from media manager layer, error text:
ANU2614E Invalid sequence of function calls to Data Protection for Oracle
Listing all ORA-19501 read errors recorded in the restore log:
# grep "ORA-19501" restore_20161026.log
ORA-19501: read error on file "dbfI_925620833_60893_1", block number 1 (block size=512)
ORA-19501: read error on file "dbfI_925617532_60891_1", block number 263941633 (block si
ORA-19501: read error on file "dbfI_925623984_60895_1", block number 1 (block size=512)
ORA-19501: read error on file "dbfI_925627205_60897_1", block number 1 (block size=512)
ORA-19501: read error on file "dbfI_925624880_60896_1", block number 1 (block size=512)
ORA-19501: read error on file "dbfI_925629857_60899_1", block number 1 (block size=512)
ORA-19501: read error on file "dbfI_925627871_60898_1", block number 1 (block size=512)
ORA-19501: read error on file "dbfI_925630752_60900_1", block number 1 (block size=512)
ORA-19501: read error on file "dbfI_925621388_60894_1", block number 9862657 (block size
ORA-19501: read error on file "dbfI_925633018_60901_1", block number 1 (block size=512)
ORA-19501: read error on file "dbfI_925633573_60902_1", block number 1 (block size=512)
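The grep output above repeats the same pattern for every failed backup piece, so the distinct piece names can be pulled out for the checks that follow. A minimal sketch, assuming the log lines have the format shown above; the function name failed_pieces is made up for illustration:

```shell
# Print each distinct backup piece named in the ORA-19501 lines of an RMAN log.
# $1 is the path to the restore log.
failed_pieces() {
    grep 'ORA-19501' "$1" \
        | sed -n 's/.*read error on file "\([^"]*\)".*/\1/p' \
        | sort -u
}
```

Usage: failed_pieces restore_20161026.log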
Diagnosis:
1. Check whether the backup pieces that failed during the restore exist in the TSM server database
tsm: TSM>select contents.volume_name from contents,backups where contents.object_id=backups.object_id and backups.ll_name='dbfI_925542002_59492_1'
VOLUME_NAME: A99516
tsm: TSM>select contents.volume_name from contents,backups where contents.object_id=backups.object_id and backups.ll_name='dbfI_925620833_60893_1'
VOLUME_NAME: A99536
tsm: TSM>select contents.volume_name from contents,backups where contents.object_id=backups.object_id and backups.ll_name='dbfI_925617532_60891_1'
VOLUME_NAME: A99534
VOLUME_NAME: A99536
tsm: TSM>select contents.volume_name from contents,backups where contents.object_id=backups.object_id and backups.ll_name='dbfI_925623984_60895_1'
VOLUME_NAME: A99536
tsm: TSM>select contents.volume_name from contents,backups where contents.object_id=backups.object_id and backups.ll_name='dbfI_925627205_60897_1'
VOLUME_NAME: A99536
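The lookups above use the same query with only the backup piece name changed, so the query text can be generated from a list of piece names. A minimal sketch; the function name volume_queries is hypothetical, and it only prints the SQL so that it can be pasted at the tsm: prompt:

```shell
# Read backup piece names on stdin, one per line, and print the volume
# lookup query used above for each of them.
volume_queries() {
    while IFS= read -r piece; do
        [ -n "$piece" ] || continue   # skip blank lines
        printf "select contents.volume_name from contents,backups where contents.object_id=backups.object_id and backups.ll_name='%s'\n" "$piece"
    done
}
```

Usage: printf '%s\n' dbfI_925620833_60893_1 dbfI_925617532_60891_1 | volume_queries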
2. List the contents stored on one of the tapes:
tsm: TSM>q content A99536
Node Name       Type Filespace  FSID Client's Name for File
                     Name
--------------- ---- ---------- ---- -----------------------------------
node1           Bkup /ebtdp13o-    1 //dbfI_925617532_60891_1
                     nline_fs
node1           Bkup /ebtdp13o-    1 //dbfI_925620833_60893_1
                     nline_fs
node1           Bkup /ebtdp13o-    1 //dbfI_925623984_60895_1
                     nline_fs
node1           Bkup /ebtdp13o-    1 //dbfI_925627205_60897_1
                     nline_fs
node1           Bkup /ebtdp13o-    1 //dbfI_925629857_60899_1
The tape does contain the backup pieces needed for the restore.
3. Check the tape status
tsm: TSM>q volume A99536 f=d
Volume Name: A99536
Storage Pool Name: ONLINEPOOL
Device Class Name: LTO3
Estimated Capacity: 399.9 G
Scaled Capacity Applied:
Pct Util: 100.0
Volume Status: Full
Access: Unavailable
Pct. Reclaimable Space: 0.0
Scratch Volume?: Yes
In Error State?: No
Number of Writable Sides: 1
Number of Times Mounted: 5
Write Pass Number: 1
Approx. Date Last Written: 10/19/2016 08:17:38
Approx. Date Last Read: 10/19/2016 04:54:29
Date Became Pending:
Number of Write Errors: 0
Number of Read Errors: 0
Volume Location:
Volume is MVS Lanfree Capable : No
Last Update by (administrator):
Last Update Date/Time: 10/19/2016 04:53:51
Begin Reclaim Period:
End Reclaim Period:
Drive Encryption Key Manager: None
Logical Block Protected: No
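The volume's Access is Unavailable. This is why the reads failed, and it matches the ANS1314E "File data currently unavailable on server" message in the restore log.
4. Check whether other tapes are also Unavailable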
tsm: TSM>q vol access=unav
Volume Name              Storage     Device     Estimated   Pct Volume
                         Pool Name   Class Name  Capacity  Util Status
------------------------ ----------- ---------- --------- ----- --------
A99456                   MOBILE_POOL LTO3         762.9 G   1.9 Filling
A99464                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99469                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99536                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99537                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99554                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99555                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99556                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99557                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99558                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99564                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99567                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99568                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99569                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99571                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99572                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99573                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99576                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99577                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99578                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99579                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99581                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
A99583                   ONLINEPOOL  LTO3         399.9 G 100.0 Full
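A number of tapes in the TSM library are in the Unavailable state, which is why the restore hit read errors.
5. Manually set the access mode of these tapes back to normal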
update vol A99456 access=readwrite
update vol A99464 access=readwrite
update vol A99469 access=readwrite
update vol A99536 access=readwrite
update vol A99537 access=readwrite
update vol A99554 access=readwrite
update vol A99555 access=readwrite
update vol A99556 access=readwrite
update vol A99557 access=readwrite
update vol A99558 access=readwrite
update vol A99564 access=readwrite
update vol A99567 access=readwrite
update vol A99568 access=readwrite
update vol A99569 access=readwrite
update vol A99571 access=readwrite
update vol A99572 access=readwrite
update vol A99573 access=readwrite
update vol A99576 access=readwrite
update vol A99577 access=readwrite
update vol A99578 access=readwrite
update vol A99579 access=readwrite
update vol A99581 access=readwrite
update vol A99583 access=readwrite
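The update commands above follow directly from the volume list in step 4, so they can be generated instead of typed by hand. A minimal sketch, assuming the q vol access=unav output has been saved as text in the column layout shown in step 4; the function name gen_updates is made up for illustration. It only prints the commands, so the list can be reviewed before anything is run on the TSM server:

```shell
# Read "q vol access=unav" output on stdin and print one update command
# per volume. Data rows are recognised by having 7 fields with a numeric
# capacity in the 4th field; header and separator lines do not match.
gen_updates() {
    awk 'NF == 7 && $4 ~ /^[0-9.]+$/ { print "update vol " $1 " access=readwrite" }'
}
```

Usage: gen_updates < unav_volumes.txt (the file name is only an example).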
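After the update, checking the tapes again shows that their status is back to normal.
Running the restore again: OK!
Summary: the abnormal tape status may be caused by a compatibility problem between TSM and the virtual tape library. Going forward, monitor the volume access status so that this kind of problem is avoided as far as possible.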
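One way to do the monitoring mentioned in the summary is to run the step 4 query on a schedule and raise an alert when it returns any volume. A minimal sketch under the same output-format assumption as step 4; the function name check_unavailable is hypothetical, and how the query output is produced (for example by the dsmadmc administrative client with site-specific credentials) is left to the environment:

```shell
# Read "q vol access=unav" output on stdin, list the affected volumes and
# return non-zero if there are any, so a cron job or monitor can alert on it.
check_unavailable() {
    vols=$(awk 'NF == 7 && $4 ~ /^[0-9.]+$/ { print $1 }')
    if [ -n "$vols" ]; then
        echo "WARNING: volumes with access=unavailable:"
        echo "$vols"
        return 1
    fi
    echo "OK: no unavailable volumes"
}
```

The function takes the query output on its standard input and its exit status can drive the alert.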