Warning: Oracle database block corruption caused by AIX operating system patches

Summary:
Excerpted from:
ALERT: Database Corruption ORA-600 ORA-7445 errors after applying AIX SP patches - AIX 6.1.9.8 or AIX 7.1.3.8 or AIX 7.1.4.3 or AIX 7.2.0.3 or AIX 7.2.1.0, 01 (Doc ID 2237498.1)
APPLIES TO:
Oracle Database - Enterprise Edition - Version 11.2.0.3 to 12.2.0.1 [Release 11.2 to 12.2]
IBM AIX on POWER Systems (64-bit)
A problem has been discovered in the latest SP patches for IBM AIX 6.1 and 7.1 (SP 08 and SP 03) on systems running Oracle 11.2.0.3, 11.2.0.4, 12.1, or 12.2. It causes ORA-600 errors and possible database corruption.

The problem appears after either of these upgrades, or when running on one of the oslevels listed below in this note:
upgrade from AIX 6.1.9.7 to SP08
upgrade from AIX 7.1.4.2 to SP03

This is only known to impact Oracle 11.2.0.3.x, 11.2.0.4.x, 12.1.0.2, or 12.2.0.1 on AIX platforms. It has been observed on various Oracle PSU versions.
The symptoms observed so far are memory-related ORA-600 failures; examples are listed below.
Additionally, redo log corruption has been observed in at least two cases.

DESCRIPTION
Database corruption and/or ORA-600 and ORA-7445 errors occur after applying IBM AIX SP patches, specifically after updating from AIX 6.1.9.7 to SP08 or from AIX 7.1.4.2 to SP03. The earlier service packs (SP 07 and SP 02) are not impacted.
OCCURRENCE
The only changes were upgrades to the latest IBM SP patches.
upgraded from AIX 6.1.9.7 to SP08  --> SP08 has the problem.
upgraded from AIX 7.1.4.2 to SP03  --> SP03 has the problem.
To check whether a host is at an AIX patch level exposed to this risk, run the following command:
# oslevel -s
If any of the following are listed, exposure to this problem exists:
6100-09-08
7100-03-08
7100-04-03
7200-00-03
7200-01-00
7200-01-01
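
The comparison against this list can be scripted. The following is a minimal Python sketch, not part of the original note. It assumes that `oslevel -s` prints a dash-separated string whose first three fields are the level shown above, possibly followed by a further field (for example `7100-04-03-1642`); that trailing field is an assumption about the output format. The function name `is_exposed` is hypothetical.

```python
# Affected levels, copied from the list in this note.
AFFECTED_LEVELS = {
    "6100-09-08",
    "7100-03-08",
    "7100-04-03",
    "7200-00-03",
    "7200-01-00",
    "7200-01-01",
}


def is_exposed(oslevel_output: str) -> bool:
    """Return True if the text printed by `oslevel -s` names an affected level.

    Only the first three dash-separated fields are compared, so any
    extra trailing field (assumed, not described in the note) is ignored.
    """
    fields = oslevel_output.strip().split("-")
    level = "-".join(fields[:3])
    return level in AFFECTED_LEVELS
```

On an AIX host, the argument would be the text printed by `oslevel -s`; a result of `True` means the level is one of those listed above.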


SYMPTOMS
The following ORA-600 errors have been observed. Note that not all of these errors need to occur, and not every customer has seen all of them.
=========================================================================================
ORA-00600: internal error code, arguments: [kkoipt:invalid aptyp], [0], [0], [], [], [], [], [], [], [], [], []
Optimizer - Maps the structures from memory
=========================================================================================
ORA-00600: internal error code, arguments: [kghssgai2], [1], [32], [], [], [], [], [], [], [], [], []
--looks to be pga related allocations
Generic memory Heap manager -we can't have both a heap and an allocation function passed in to us
=========================================================================================
ORA-00600: internal error code, arguments: [qkkAssignKey:1], [], [], [], [], [], [], [], [], [], [], []
qkkAssignKey - copy keys from source to destination key
=========================================================================================
ORA-00600: internal error code, arguments: [kclgclks_3], [454], [2431642561], [], [], [], [], [], [], [], [], []
kclgclks - CR Server request
=========================================================================================
ORA-00600: internal error code, arguments: [kkqvmRmViewFromLst1], [], [], [], [], [], [], [], [], [], [], []
View Merging - list management
=========================================================================================
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_1], [0x082024000], [rpi role space], [], [], [], [], [], [], [], [], []
shared heap manager Stack segment underflow, failure to follow stack discipline.
assert no previous chunk in this segment
=========================================================================================
ORA-00600: internal error code, arguments: [qerghFetch.y], [], [], [], [], [], [], [], [], [], [], []
Implements hash aggregation for query source
=========================================================================================
ORA-00600: internal error code, arguments: [qeshQBNextLoad.1], [], [], [], [], [], [], [], [], [], [], []
Hash Table Infrastructure -get Next buffer during Load
=========================================================================================
ORA-00600: internal error code, arguments: [qkshtQBGet:1], [], [], [], [], [], [], [], [], [], [], []
gets memory pointer for a query block.
Make sure the query block pointer is not NULL
=========================================================================================
ORA-00600: internal error code, arguments: [qeshIHBuildOnPartition block missed], [], [], [], [], [], [], [], [], [], [], []
Hash Table Infrastructure
update the partition at the end.
=========================================================================================
ORA-00600: internal error code, arguments: [kghssgfr2], [1]
=========================================================================================
ORA-07445: exception encountered: core dump [PC:0x0] [SIGILL] [ADDR:0x0] [PC:0x0] [Illegal opcode]
=========================================================================================
ORA-00600 [kkogbro: no kkoaptyp]
=========================================================================================
ORA-00600: internal error code, arguments: [kewrose_1], [600]
========================================================================================
ORA-00600: internal error code, arguments: [1868], [0x000000000], [], [], [], [], [], [], [], [], [], []
Core dumps are also possible.
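
To see whether any of the ORA-00600 signatures listed above appear in an alert log, the log text can be scanned for their first arguments. The following is a minimal Python sketch under stated assumptions: the log has already been read into a string, each error appears on one line in one of the two shapes shown above, and the names `KNOWN_ORA600_ARGS` and `find_known_ora600` are hypothetical. A match is only a hint, since these errors can have other causes.

```python
import re

# First arguments of the ORA-00600 errors listed in this note.
KNOWN_ORA600_ARGS = {
    "kkoipt:invalid aptyp",
    "kghssgai2",
    "qkkAssignKey:1",
    "kclgclks_3",
    "kkqvmRmViewFromLst1",
    "kghstack_underflow_internal_1",
    "qerghFetch.y",
    "qeshQBNextLoad.1",
    "qkshtQBGet:1",
    "qeshIHBuildOnPartition block missed",
    "kghssgfr2",
    "kkogbro: no kkoaptyp",
    "kewrose_1",
    "1868",
}

# Captures the first bracketed argument following ORA-600 / ORA-00600
# on the same line.
ORA600_PATTERN = re.compile(r"ORA-0*600\b[^\[\n]*\[([^\]]*)\]")


def find_known_ora600(log_text: str) -> list[str]:
    """Return the listed first arguments found in log_text, in order of first appearance."""
    hits: list[str] = []
    for match in ORA600_PATTERN.finditer(log_text):
        arg = match.group(1).strip()
        if arg in KNOWN_ORA600_ARGS and arg not in hits:
            hits.append(arg)
    return hits
```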
---------------
Redo log corruption with checksum error has also been observed.
Two known examples below:
example 1:
Alert.log messages:
ORA-00368: checksum error in redo log block
ORA-00353: log corruption near block 73804 change 8112409541614 time 12/07/2016 07:12:25
ORA-00334: archived log: '/dev/rredo13'
ORA-07445: exception encountered: core dump [pkrdi()+780] [SIGSEGV] [ADDR:0x0] [PC:0x10367B26C] [Invalid permissions for mapped

---------------
There have also been transient database block corruptions and control file block corruptions with checksum errors, where a reread finds valid data.
example 2 (transient database block corruption with checksum error):
Corrupt block relative dba: 0x5a066b2f (file 360, block 420655)
Bad check value found during buffer read
Data in bad block:
type: 6 format: 2 rdba: 0x5a066b2f
last change scn: 0x00cc.6a826294 seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x62940601
check value in block header: 0x9e7d
computed block checksum: 0x0           ---> 0x0 means that checksum is good when printing the error message (transient problem)
Reading datafile 'Datafile name' for corruption at rdba: 0x5a066b2f (file 360, block 420655)
Reread (file 360, block 420655) found valid data
Hex dump of (file 360, block 420655) in trace file ....
Repaired corruption at (file 360, block 420655)
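
The message in example 2 prints the same location twice: as `rdba: 0x5a066b2f` and as `(file 360, block 420655)`. The following minimal Python sketch shows how the two relate. It assumes the common layout in which the upper 10 bits of the 32-bit rdba hold the relative file number and the lower 22 bits hold the block number; that layout is not stated in this note, and the function name `decode_rdba` is hypothetical.

```python
def decode_rdba(rdba: int) -> tuple[int, int]:
    """Split a 32-bit relative data block address into (file, block).

    Assumed layout: upper 10 bits = relative file number,
    lower 22 bits = block number.
    """
    file_no = rdba >> 22          # upper 10 bits
    block_no = rdba & 0x3FFFFF    # lower 22 bits
    return file_no, block_no


print(decode_rdba(0x5A066B2F))  # prints (360, 420655)
```

The decoded pair agrees with the file and block numbers printed in the message.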

example 3 (transient control file corruption with checksum error):
Hex dump of (file 0, block 1) in trace file ...
Corrupt block relative dba: 0x00000001 (file 0, block 1)
Bad check value found during control file header read
Data in bad block:
type: 21 format: 2 rdba: 0x00000001
last change scn: 0x0000.00000000 seq: 0x1 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x00001501
check value in block header: 0xca35
computed block checksum: 0x0                 ---> 0x0 means that checksum is good when printing the error message (transient problem)
Errors in file ..:
ORA-00202: control file: '/oracle/dbs/control_01.ctl'
Errors in file ...
ORA-00227: corrupt block detected in control file: (block 1, # blocks 1)
ORA-00202: control file: '/oracle/dbs/control_01.ctl'


WORKAROUND
The current solution is to revert back to the previous SP patches on the AIX host.

PATCHES
A fix is now available from IBM.
It can be downloaded for the releases above from:
ftp://aix.software.ibm.com/aix/ifixes/

Affected AIX Level    Fixed In      iFix / APAR (ftp://aix.software.ibm.com/aix/ifixes/)
6100-09-08            6100-09-09    IV93840
7100-03-08            7100-03-09    IV93884
7100-04-03            7100-04-04    IV93845
7200-00-03            7200-00-04    IV93883
7200-01-01            7200-01-02    IV93885
The fix will be included in the next AIX Service Packs to be released.
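
The table above can be turned into a simple lookup from a reported level to its fixed level and APAR. This is a minimal Python sketch whose rows are copied from the table; the names `FIXES` and `lookup_fix` are hypothetical, and the input is assumed to be the text printed by `oslevel -s`, compared on its first three dash-separated fields.

```python
from typing import Optional, Tuple

# Affected level -> (fixed-in level, APAR), copied from the table in this note.
FIXES = {
    "6100-09-08": ("6100-09-09", "IV93840"),
    "7100-03-08": ("7100-03-09", "IV93884"),
    "7100-04-03": ("7100-04-04", "IV93845"),
    "7200-00-03": ("7200-00-04", "IV93883"),
    "7200-01-01": ("7200-01-02", "IV93885"),
}


def lookup_fix(oslevel_output: str) -> Optional[Tuple[str, str]]:
    """Return (fixed-in level, APAR) for a level in the table, else None."""
    level = "-".join(oslevel_output.strip().split("-")[:3])
    return FIXES.get(level)
```

A result of `None` only means the level has no row in the table. For example, 7200-01-00 is listed as exposed earlier in this note but has no row here, so the lookup returns `None` for it.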
IBM HIPER APAR
Abstract: PROBLEMS CAN OCCUR WITH THREAD_CPUTIME AND THREAD_CPUTIME_FAST
This APAR corrects an issue with system call thread_cputime_self with floating point registers which is exposed by Oracle Database 11gR2.
PROBLEM SUMMARY:
The thread_cputime or thread_cputime_fast interfaces can cause invalid data in the FP/VMX/VSX registers if the thread page faults in this function.
For more information see the following from IBM:
http://www-01.ibm.com/support/do ... X71HIPER170303-1247


HISTORY
24-FEB-2017 Article has been created
28-FEB-2017 Added the transient control file corruption example with ORA-227 and added the next patch levels that are also causing this issue: 7100-03-08, 7200-00-03, 7200-01-00, 7200-01-01
02-MAR-2017 Added the AIX fix detail APAR -IV93763 and ftp site at IBM for patch downloads as well as fix for each AIX version
14-MAR-2017 Clarified some detail as to what service packs are problems.  Added link to IBM problem description.
15-MAR-2017 Expanded scope of the database versions at risk (12.1 and 12.2 are also at risk)
2017-05-04
Contributor

jiaxu2000, Systems Engineer, Central Hospital Affiliated to Shenyang Medical College