This problem is urgent. Below are a description of the environment and my own analysis. Any help would be greatly appreciated, thanks!
------------------------------------------------------------------------------------------------------
Environment:
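1. TSM Server: 6.2.7 (RHEL 6.5)
2. TDPO: 5.5 (RHEL 7)
3. Oracle is a RAC environment, version 11.2.0.4.0; the database size is 5 TB
4. Both the TSM Server and the SAN Agent are set to COMMTIMEOUT 60000 and IDLETIMEOUT 6000
5. The TDPO node is also set to maxnummp=3 txngroupmax=1000
6. sysctl parameters on the TDPO client system: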
fs.suid_dumpable = 1
fs.aio-max-nr = 1048576
fs.file-max = 6815744
kernel.shmall = 33554432
kernel.shmmax = 137438953472
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 1048576
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
vm.min_free_kbytes = 524288
7. Backup script:
run {
  allocate channel t1 type 'sbt_tape' parms 'ENV=(tdpo_optfile=/usr/tivoli/tsm/client/oracle/bin64/tdpo.opt)';
  set limit channel t1 kbytes 1900000;
  allocate channel t2 type 'sbt_tape' parms 'ENV=(tdpo_optfile=/usr/tivoli/tsm/client/oracle/bin64/tdpo.opt)';
  set limit channel t2 kbytes 1900000;
  backup incremental level 0 skip inaccessible format='db_%d_%u_%s_%T' (database include current controlfile);
  sql 'alter system archive log current';
  backup format '%t%s%d.dbf' archivelog all delete input;
  release channel t1;
  release channel t2;
}
8. Part of the backup succeeded: roughly 600 GB of files were backed up.
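For scale, the timeout values from item 4 can be converted to hours. This is only a small Python sketch, assuming the units IBM documents for these TSM server options (COMMTIMEOUT in seconds, IDLETIMEOUT in minutes); the variable names are illustrative and not part of any TSM configuration.

```python
# Assumption: COMMTIMEOUT is expressed in seconds and IDLETIMEOUT in minutes.
COMMTIMEOUT_SECONDS = 60000   # value set on the TSM Server and SAN Agent
IDLETIMEOUT_MINUTES = 6000    # value set on the TSM Server and SAN Agent

commtimeout_hours = COMMTIMEOUT_SECONDS / 3600
idletimeout_hours = IDLETIMEOUT_MINUTES / 60

print(f"COMMTIMEOUT: {commtimeout_hours:.1f} h")
print(f"IDLETIMEOUT: {idletimeout_hours:.1f} h")
```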
--------------------------------------------------------------------------------------------------------------
Problem description:
RMAN errors:
RMAN-03009: failure of backup command on t2 channel at 09/01/2016 03:46:49
ORA-19513: failed to identify sequential file
ORA-27206: requested file not found in media management catalog
ORA-19502: write error on file "db_QRCODE_99rem2rr_297_20160901", block number 95488865 (block size=8192)
ORA-27030: skgfwrt: sbtwrite2 returned error
ORA-19511: Error received from media manager layer, error text:
ANS1235E (RC-72) An unknown system error has occurred from which TSM cannot recover.
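For reference, the block number in the ORA-19502 line can be turned into an approximate byte offset. This is a minimal Python sketch, assuming the block number counts 8192-byte blocks from the start of the backup piece; that is my reading of the message, not something the log confirms.

```python
# Values copied from the ORA-19502 line above.
BLOCK_NUMBER = 95488865
BLOCK_SIZE = 8192  # bytes

# Assumption: the offset is simply block number times block size.
offset_bytes = BLOCK_NUMBER * BLOCK_SIZE
offset_gib = offset_bytes / 2**30

print(f"{offset_bytes} bytes = {offset_gib:.1f} GiB")
```

Under that assumption, the write error occurred roughly 728 GiB into the file.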
TSM errors:
ANR0538I A resource waiter has been aborted.
ANR0490I Canceling session 187 for node QRDB1_DB (TDPO LinuxAMD64) . (SESSION: 143)
ANR0524W Transaction failed for session 187 for node QRDB1_DB (TDPO LinuxAMD64) - data transfer interrupted.
ANR0483W Session 187 for node QRDB1_DB (TDPO LinuxAMD64) terminated - forced by administrator.
TDPO errors:
cuConfirm: Received rc: -72 trying to receive ConfirmResp verb
ANS1235E An unknown system error has occurred from which TSM cannot recover.
ANS1235E An unknown system error has occurred from which TSM cannot recover.
sessSendVerb: Error sending Verb, rc: -71
ANS4994S TDP for Oracle: (86948): =>(qrdb1_db) ANU2602E The object /adsmorc//db_QRCODE_99rem2rr_297_20160901 was not found on the TSM Server TDPO LinuxAMD64 ANU0599
Q stgpool:
tsm: TSMSERVER>q stgpool
Session established with server TSMSERVER: Linux/x86_64
  Server Version 6, Release 2, Level 7.0
  Server date/time: 09/02/2016 11:48:05  Last access: 09/02/2016 10:45:22

Storage      Device      Estimated   Pct    Pct    High  Low   Next Stora-
Pool Name    Class Name  Capacity    Util   Migr   Mig   Mig   ge Pool
                                                   Pct   Pct
-----------  ----------  ----------  -----  -----  ----  ---   -----------
ARCHIVEPOOL  DISK        0.0 M       0.0    0.0    90    70
BACKUPPOOL   DISK        0.0 M       0.0    0.0    90    70
DB_STGP      LTO_DEV1    91,553 G    1.3    13.3   90    70
FILE_STGP    LTO_DEV0    73,242 G    0.0    33.3   90    70
SPACEMGPOOL  DISK        0.0 M       0.0    0.0    90    70
-----------------------------------------------------------------------------------------------------------------
My own findings so far:
1. The hosts entries in the environment are correct;
2. ./sbttest test runs normally, with the following output:
The sbt function pointers are loaded from libobk.so library.
-- sbtinit succeeded
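3. The tape library is a TS3100 with two drives. The hardware is healthy, and TSM reported no hardware-related problems during the backup;
4. The stgpool has enough space, and the paths are normal.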