haizdl · 2023-03-10 15:07
Technical Manager, Dalian

How a DBA Can Build a Custom Automated Inspection Tool


## I. The Basic Steps Toward Automation

1. Manual inspection

In the initial stage, most DBAs keep a collection of frequently used SQL statements. During daily inspections they run these statements against each database, copy the results off the screen, and then review them line by line. This approach only works for a small number of databases; once the count grows, the workload becomes enormous.

2. Script-based inspection

In the intermediate stage, DBAs with scripting skills consolidate their frequently used SQL into fixed SQL scripts, then use an OS-level scripting language (e.g., KSH or BASH) to invoke the SQL scripts from a shell script, writing the results to log files that are reviewed and analyzed afterwards. This approach scales to more databases, but it requires frequently switching between database servers, which invites mistakes, especially when the operating system platforms are not uniform.

3. Automated inspection

In the final stage, DBAs with Python, Expect, or similar skills build on the earlier artifacts: a Python script automatically invokes the shell scripts on each system, completing the inspection of all databases in one batch, and transfers every inspection log to a central location. A text-filtering tool (e.g., AWK) or VBA then filters the log contents, and the filtered output is analyzed. This approach lets a central management server operate on all database servers in batch and eliminates the problems caused by heterogeneous operating system platforms.
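
As a sketch of the final filtering step, the snippet below scans a collected inspection log for tablespaces whose free percentage has fallen below a threshold. The line layout matched by the pattern is an assumption for illustration; adapt it to whatever your SQL script actually spools.

```python
import re

# Matches a tablespace report line: name in the first column, free
# percentage in the last.  This layout is an assumption -- adjust the
# pattern to your actual spool format.
LINE_RE = re.compile(r'^(?P<ts>\S+)\s+.*\s(?P<free_pct>\d+\.\d+)\s*$')

def low_free_tablespaces(log_lines, threshold=10.0):
    """Return (tablespace, free_pct) pairs whose free space is below threshold."""
    hits = []
    for line in log_lines:
        m = LINE_RE.match(line)
        if m:
            pct = float(m.group('free_pct'))
            if pct < threshold:
                hits.append((m.group('ts'), pct))
    return hits
```

A driver script can run this over every collected log file and only surface the databases that actually need attention.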

## II. Database Diagnostic Information to Collect

1. Basic information

The basic information varies with the database environment, but some essentials are always required: database name, instance name, unique identifier, system version, address information, database creation time, database startup time, database version, patch level, character set, instance status, archiving mode, and so on. The following scripts (for Oracle) are provided for reference:

SELECT name AS db_name,
       dbid,
       created,
       platform_name,
       db_unique_name,
       log_mode,
       guard_status,
       force_logging,
       flashback_on
  FROM v$database;

SELECT instance_name,
       inst_id,
       host_name,
       TO_CHAR(startup_time, 'YYYY/MM/DD HH24:MI:SS') AS startup_time,
       status,
       SYSDATE - STARTUP_TIME AS Running_Time
  FROM gv$instance
 ORDER BY inst_id;

2. Storage information

Storage information covers both logical and physical storage, across three dimensions: tablespaces, data files, and disks. For tablespaces, collect the name, type, extent management mode, initial extent size, segment space management mode, total size, free size, and free percentage. For data files, collect the owning tablespace, whether autoextend is enabled, the current size, and the maximum size. For disks, collect the disk group number, name, redundancy type, total size, free size, hot-used size, space required for mirroring, space usable for mirrored files, and the number of offline disks. The following scripts (for Oracle) are provided for reference:

SELECT c.tablespace_name,
       c.contents,
       c.extent_management ext_mgmt,
       c.allocation_type alloc_type,
       c.initial_extent / 1024 ext_kb,
       c.segment_space_management SSM,
       nvl(a.total_gb, 0) total_gb,
       decode(c.contents, 'TEMPORARY', nvl(a.total_gb, 0) - c.block_size * b.used_blocks / 1073741824, nvl(b.free_gb, 0)) free_gb,
       100 *  decode(c.contents, 'TEMPORARY', nvl(a.total_gb, 0) - c.block_size * b.used_blocks / 1073741824, nvl(b.free_gb, 0)) / nvl(a.total_gb, 1) free_pct
  FROM (SELECT tablespace_name, nvl(sum(bytes) / 1073741824, 0) total_gb
          FROM dba_data_files
         GROUP BY tablespace_name
         UNION all
        SELECT tablespace_name, nvl(sum(bytes) / 1073741824, 0) total_gb
          FROM dba_temp_files
         GROUP BY tablespace_name) a,
       (SELECT tablespace_name,
               nvl(sum(bytes) / 1073741824, 0) free_gb,
               0 used_blocks
          FROM dba_free_space
         GROUP BY tablespace_name
         UNION all
        SELECT tablespace_name,
               0 free_gb,
               nvl(sum(used_blocks), 0) used_blocks
          FROM gv$sort_segment
         GROUP BY tablespace_name) b,
       dba_tablespaces c
 WHERE c.tablespace_name = b.tablespace_name(+)
   AND c.tablespace_name = a.tablespace_name(+)
 ORDER BY c.contents, free_pct, c.tablespace_name;
 
 SELECT file_id,
       file_name,
       tablespace_name,
       autoextensible,
       bytes / 1073741824 as current_gb,
       maxbytes / 1073741824 as max_gb
  FROM dba_data_files
 UNION
SELECT file_id,
       file_name,
       tablespace_name as ts_name,
       autoextensible,
       bytes / 1073741824 as cur_gb,
       maxbytes / 1073741824 as max_gb
  FROM dba_temp_files
 ORDER BY tablespace_name, file_id, file_name;
 
 SELECT group_number,
       name,
       type,
       total_mb,
       free_mb,
       hot_used_mb,
       required_mirror_free_mb,
       usable_file_mb,
       offline_disks 
  FROM v$asm_diskgroup_stat;

3. Log information

Log information consists of statistics on the online redo logs and the archived/fast-recovery logs: for online logs, the group number, thread number, sequence number, file path, size, status, archiving status, and number of members per group; for archived logs, the generation date, generated size, and file count; plus the daily volume and frequency of log switches. The following scripts (for Oracle) are provided for reference:

SELECT t2.member,
       t1.group#,
       t1.thread#,
       t1.sequence#,
       t1.bytes / 1024 / 1024 AS SIZE_MB,
       t1.status,
       t1.archived,
       t1.members
  FROM v$log t1, v$logfile t2
 WHERE t1.group# = t2.group#
 ORDER BY thread#, group#;
 
 SELECT lpad(to_char(first_time, 'yyyymmdd'), 12) "DATE",
       trunc(sum(blocks * block_size) / 1024 / 1024) "SIZE(MB)",
       count(*)
  FROM v$archived_log
 WHERE first_time > sysdate - 6
   AND creator = 'ARCH'
 GROUP BY lpad(to_char(first_time, 'yyyymmdd'), 12)
 ORDER BY 1;
 
 SELECT SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH:MI:SS'),1,5) Day,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'00',1,0)) H00,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'01',1,0)) H01, 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'02',1,0)) H02,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'03',1,0)) H03,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'04',1,0)) H04,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'05',1,0)) H05,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'06',1,0)) H06,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'07',1,0)) H07,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'08',1,0)) H08,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'09',1,0)) H09,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'10',1,0)) H10,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'11',1,0)) H11, 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'12',1,0)) H12,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'13',1,0)) H13, 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'14',1,0)) H14,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'15',1,0)) H15, 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'16',1,0)) H16, 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'17',1,0)) H17, 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'18',1,0)) H18, 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'19',1,0)) H19, 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'20',1,0)) H20, 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'21',1,0)) H21,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'22',1,0)) H22 , 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'23',1,0)) H23, 
       COUNT(*) TOTAL 
  FROM v$log_history  a  
 WHERE first_time>=to_char(sysdate-7)
 GROUP BY SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH:MI:SS'),1,5) 
 ORDER BY SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH:MI:SS'),1,5) DESC;
 
 SELECT thread#, 
       sequence#, 
       to_char(first_time, 'MM/DD/RR HH24:MI:SS')
  FROM v$log_history
 WHERE thread# = 1
   AND first_time > sysdate - 1
 ORDER BY first_time DESC;

SELECT thread#, 
       sequence#, 
       to_char(first_time, 'MM/DD/RR HH24:MI:SS')
  FROM v$log_history
 WHERE thread# = 2
   AND first_time > sysdate - 1
 ORDER BY first_time DESC;

4. Backup information

Backup information mainly covers the execution of database backup jobs: the backup session identifier, backup contents, start time, end time, and elapsed time. The following script (for Oracle) is provided for reference:

SELECT session_key,
       start_time,
       end_time,
       status,
       time_taken_display tt
  FROM v$rman_backup_job_details
 WHERE start_time > sysdate -1
 ORDER BY session_key;

5. Performance information

Database performance diagnosis has two basic goals: first, check whether the database's performance-related resource usage is normal; second, capture abnormal sessions, wait events, SQL statements, and so on. Accordingly, collect the database cache usage, abnormal sessions and the resource limits of related processes, the events with the longest CPU/wait time over the last 24 hours, the most executed and most parsed SQL statements, and so on. The following scripts (for Oracle) are provided for reference:

SELECT free_space,
       avg_free_size,
       used_space,
       avg_used_size,
       request_failures,
       last_failure_size 
  FROM v$shared_pool_reserved;
  
  SELECT *
  FROM (SELECT t.sid,
               t.serial#,
               trunc(sysdate - logon_time) AS online_time,
               t.PROGRAM,
               t.status,
               t.LOGON_TIME,
               t.sql_id,
               t.prev_sql_id,
               t.event
          FROM gv$session t
         WHERE t.type <> 'BACKGROUND' AND program is not null
         ORDER BY logon_time)
 WHERE rownum <= 30;
 
 SELECT * 
  FROM gv$resource_limit  
 WHERE trim(limit_value) != 'UNLIMITED';
 
 SELECT t.*, s.sid, s.serial#, s.machine, s.program, s.osuser
  FROM (SELECT c.USERNAME, a.event, to_char(a.cnt) as seconds, a.sql_id, dbms_lob.substr(b.sql_fulltext, 100, 1) sqltext
          FROM (SELECT rownum rn, t.*
                  FROM (SELECT decode(s.session_state, 'WAITING', s.event, 'Cpu + Wait For Cpu') Event, s.sql_id, s.user_id, count(*) cnt
                          FROM v$active_session_history s
                         WHERE sample_time > sysdate - 1
                         GROUP BY s.user_id, decode(s.session_state, 'WAITING', s.event, 'Cpu + Wait For Cpu'), s.sql_id
                         ORDER BY cnt DESC) t
                 WHERE rownum < 20) a, v$sqlarea b, dba_users c
         WHERE a.sql_id = b.sql_id AND a.user_id = c.user_id
         ORDER BY cnt DESC) t, v$session s
 WHERE t.sql_id = s.sql_id(+);
 
 SELECT *
  FROM (SELECT sql_id, sql_text, s.executions, s.last_load_time, s.first_load_time, s.disk_reads, s.buffer_gets, s.parse_calls
          FROM v$sql s
      ORDER BY s.executions DESC)
 WHERE rownum <= 10;
 
 SELECT *
  FROM (SELECT sql_id, sql_text, s.EXECUTIONS, s.LAST_LOAD_TIME, s.FIRST_LOAD_TIME, s.DISK_READS, s.BUFFER_GETS, s.PARSE_CALLS
          FROM v$sql s
      ORDER BY s.PARSE_CALLS DESC)
 WHERE rownum <= 10;

6. Disaster-recovery information

Disaster-recovery information applies when the database has a DR configuration, such as Oracle Data Guard. In that case, check whether primary/standby synchronization is healthy, mainly by analyzing archived log shipping. Collect static information such as the archive destination name, status, current database mode, and destination path, plus the highest applied and archived log sequence per node and the maximum and minimum SCNs. The following scripts (for Oracle) are provided for reference:

SELECT dest_name,
       status,
       database_mode,
       destination  
  FROM v$archive_dest_status 
 WHERE dest_id in ('1','2');
 
 SELECT m.thread#, 
       m.sequence#, 
       first_change#, 
       next_change#
  FROM v$log_history m,
       (SELECT thread#, max(sequence#) as sequence#
          FROM v$log_history
      GROUP BY thread#) t
 WHERE m.thread# = t.thread#
   AND m.sequence# = t.sequence#;
   
   SELECT UNIQUE thread# AS thread, 
       MAX(sequence#) OVER (PARTITION BY thread#) AS last 
  FROM v$archived_log;

## III. Integrating the Automation Scripts

1. Turn the base SQL statements into usable SQL scripts

When SQL statements are run manually, the results appear on screen. When they run in the background from a script, the results must be formatted and written to a result file, either plain text or HTML. With plain text, the output needs formatting for readability; to upgrade to HTML output, the SQL script must emit HTML markup. For example, the following shows a query after formatting directives have been added:

PROMPT <p><h1 class="awr">XX Company Database Inspection Report</h1>
PROMPT <p><a class="awr">Database basic information</a>
COLUMN log_mode FOR a12 
COLUMN guard_status FOR a10
COLUMN force_logging FOR a15
COLUMN flashback_on  FOR a15
COLUMN db_unique_name FOR a10
COLUMN platform_name FOR a20
SELECT name AS db_name,
       dbid,
       created,
       platform_name,
       db_unique_name,
       log_mode,
       guard_status,
       force_logging,
       flashback_on
  FROM v$database;

2. Call the SQL scripts from a shell script

Having the shell script merely call the SQL script would be rather thin. The shell script should therefore do two things: collect OS-level diagnostic information, and define the initialization parameters needed to execute the SQL script, then invoke it. Collecting OS-level diagnostics simply means calling operating system commands from the shell, e.g., uptime, vmstat, free, df, sar, iostat. To invoke the SQL script, the shell must prepare the relevant initialization information: locations (script path, output file path, database installation directory, etc.), timing (execution date and time), and connection essentials (user, database name, identifier, etc.). The following fragment of the invocation is provided for reference:

function _dbinfo(){
cd $PWDDIRECTORY
su - oracle -c "cd $LOGDIR/db;export ORACLE_SID=$1;sqlplus -s /nolog <<EOF
conn / as sysdba;
@$LOGDIR/../dailyhealthycheck_withadg.sql
exit;
EOF"
}

3. Centralized automated invocation with Python

This step has two key goals: achieving centralization and automating the interaction. Centralization is accomplished with a loop; automated interaction is exactly what scripting languages like Python (with pexpect) excel at, so what remains is defining and verifying the initialization parameters at call time. The following fragment of the invocation is provided for reference:

import pexpect

def ssh_cmd(ip, user, passwd, cmd):
    ret = -1
    ssh = pexpect.spawn('ssh %s@%s "%s"' % (user, ip, cmd), timeout=120)
    try:
        i = ssh.expect(['password:', 'continue connecting (yes/no)?'], timeout=5)
        if i == 0:
            ssh.sendline(passwd)
        elif i == 1:
            # First connection: accept the host key, then send the password.
            ssh.sendline('yes')
            ssh.expect('password: ')
            ssh.sendline(passwd)
        # The command was already passed on the ssh command line above,
        # so just read its output until the remote side closes.
        ret = ssh.read()
        print(ret)
    except pexpect.EOF:
        print("EOF")
        ssh.close()
        ret = -1
    except pexpect.TIMEOUT:
        print("TIMEOUT")
        ssh.close()
        ret = -2
    return ret
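
The centralized loop mentioned above can be sketched as follows. The runner parameter is kept injectable (e.g., pass in the ssh_cmd function) so the batch logic can be exercised without a real SSH session; the host-list shape and error codes mirror ssh_cmd's -1/-2 returns and are otherwise assumptions.

```python
def run_inspection(hosts, runner, user, passwd, cmd):
    """Run one inspection command per host; collect output and failures separately.

    hosts  -- iterable of IP address strings
    runner -- callable(ip, user, passwd, cmd) returning output, or -1/-2
              on error, e.g. the ssh_cmd function above
    """
    results, failed = {}, []
    for ip in hosts:
        ret = runner(ip, user, passwd, cmd)
        if ret in (-1, -2):          # EOF / TIMEOUT codes from ssh_cmd
            failed.append(ip)
        else:
            results[ip] = ret
    return results, failed
```

In production the host list would come from a configuration file, and `failed` would feed an alert or a retry pass.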

4. From scripts to a tool or product

Most DBAs design script tools to ease their own daily work and rarely consider maintainability, robustness, standardization, or future extensibility. To turn your hard-won scripts into a genuine automated operations tool, three more aspects must be addressed:

(1) Follow developer conventions and push the scripts toward standardization and modularity: standardized comments, function-based modular design, and consistent variable naming.
(2) For every execution step, provide a pre-execution confirmation, post-execution feedback, and exception handling during execution.
(3) Parameterize all inputs as variables, and isolate all constants from the script itself in external files.
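
Point (3) can be sketched with a small configuration loader: constants live in an external file, so the script never needs editing when paths or policies change. The section and key names below are hypothetical examples, not part of the original scripts.

```python
import configparser

def load_settings(path):
    """Load script constants from an external .ini file instead of hard-coding them."""
    cfg = configparser.ConfigParser()
    with open(path) as f:        # fail loudly if the config file is missing
        cfg.read_file(f)
    return {
        'log_dir': cfg.get('paths', 'log_dir'),
        'oracle_home': cfg.get('paths', 'oracle_home'),
        'retention_days': cfg.getint('policy', 'retention_days'),
    }
```

The same idea applies to shell scripts: source a single `settings.env` file at the top rather than scattering literals through the body.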

## IV. Reference Scripts

-- -----------------------------------------------------------------------------------
-- File Name    : dailyhealthycheck_withadg_pdb.sql
-- Author       : haizdl@126.com
-- Description  : Daily Healthy Checking for Oracle Rac Database.
-- Requirements : Access to the V$ views.
-- Call Syntax  : by script "healthyCheck_forLinux.sh" or "healthyCheck1_forAIX.sh"
-- Last Modified: 23/05/2017
-- -----------------------------------------------------------------------------------
SET MARKUP HTML ON SPOOL ON ENTMAP OFF PREFORMAT OFF
SET TERM OFF
SET HEADING ON
SET VERIFY OFF
SET FEEDBACK OFF
SET LINE 55555
SET PAGES 999999
SET LONG 999999999
SET LONGCHUNKSIZE 999999

-- -----------------------------------------------------------------------------------
-- SECTION: Inspection script initialization
-- -----------------------------------------------------------------------------------
COLUMN dbid new_value spool_dbid
COLUMN inst_num new_value spool_inst_num
SELECT dbid 
  FROM v$database 
 WHERE rownum = 1;

SELECT instance_number AS inst_num 
  FROM v$instance 
 WHERE rownum = 1;

COLUMN spoolfile_name new_value spoolfile
SELECT 'spool_'||(SELECT instance_name FROM v$instance WHERE rownum=1)||'_'||TO_CHAR(SYSDATE,'yy-mm-dd_hh24.mi')||'_daily' AS spoolfile_name 
  FROM dual;
spool &&spoolfile..html


PROMPT <p><h1 class="awr">XXX Database Daily Inspection Report</h1>
-- -----------------------------------------------------------------------------------
-- SECTION: Database basic information
-- -----------------------------------------------------------------------------------
PROMPT <p><h3 class="awr">Database basic information summary</h3>

/*
Database ID, database name, creation date, platform name, unique name, archiving mode, and so on.
*/
PROMPT <p><a class="awr">Database status</a>

COLUMN log_mode FOR a12 
COLUMN guard_status FOR a10
COLUMN force_logging FOR a15
COLUMN flashback_on  FOR a15
COLUMN db_unique_name FOR a10
COLUMN platform_name FOR a20
SELECT name AS db_name,
       dbid,
       created,
       platform_name,
       db_unique_name,
       log_mode,
       guard_status,
       force_logging,
       flashback_on
  FROM v$database;
  
/*
Instance number, instance name, host name, startup time, status, uptime, and so on.
*/
PROMPT <p><a class="awr">Instance basic information</a>

CLEAR COLUMNS
SET LINE 200
COLUMN host_name FOR A50
SELECT instance_name,
       inst_id,
       host_name,
       TO_CHAR(startup_time, 'YYYY/MM/DD HH24:MI:SS') AS startup_time,
       status,
       SYSDATE - STARTUP_TIME AS Running_Time
  FROM gv$instance
 ORDER BY inst_id;

-- -----------------------------------------------------------------------------------
-- SECTION: Online redo log information
-- -----------------------------------------------------------------------------------
PROMPT <h3 class="awr">Online redo log information</h3>

/*
Group number, thread number, sequence number, file path, size, status, archived status, and member count of each online log group.
*/
PROMPT <p><a class="awr">Online redo logs</a>

CLEAR COLUMNS
SET LINE 200
SET PAGES 1000

COL status FOR a30
COL member FOR a45
SELECT t2.member,
       t1.group#,
       t1.thread#,
       t1.sequence#,
       t1.bytes / 1024 / 1024 AS SIZE_MB,
       t1.status,
       t1.archived,
       t1.members
  FROM v$log t1, v$logfile t2
 WHERE t1.group# = t2.group#
 ORDER BY thread#, group#;

/*
(Shows both the day-to-day trend over the past week and the hourly distribution within each day, so anomalies are easy to spot.)
*/
PROMPT <p><a class="awr">Log switches per day over the last 7 days</a>

SELECT SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH:MI:SS'),1,5) Day,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'00',1,0)) H00,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'01',1,0)) H01, 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'02',1,0)) H02,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'03',1,0)) H03,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'04',1,0)) H04,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'05',1,0)) H05,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'06',1,0)) H06,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'07',1,0)) H07,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'08',1,0)) H08,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'09',1,0)) H09,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'10',1,0)) H10,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'11',1,0)) H11, 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'12',1,0)) H12,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'13',1,0)) H13, 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'14',1,0)) H14,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'15',1,0)) H15, 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'16',1,0)) H16, 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'17',1,0)) H17, 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'18',1,0)) H18, 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'19',1,0)) H19, 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'20',1,0)) H20, 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'21',1,0)) H21,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'22',1,0)) H22 , 
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'),10,2),'23',1,0)) H23, 
       COUNT(*) TOTAL 
  FROM v$log_history  a  
 WHERE first_time>=to_char(sysdate-7)
 GROUP BY SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH:MI:SS'),1,5) 
 ORDER BY SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH:MI:SS'),1,5) DESC;
 
/*
Log switch frequency analysis (watch whether the gaps between first_time values in adjacent rows are very short; very short gaps mean switching is too frequent).
*/
PROMPT <p><a class="awr">Log switch frequency analysis</a>

SET LINE 200
SET PAGES 1000
SELECT thread#, 
       sequence#, 
       to_char(first_time, 'MM/DD/RR HH24:MI:SS')
  FROM v$log_history
 WHERE thread# = 1
   AND first_time > sysdate - 1
 ORDER BY first_time DESC;

SELECT thread#, 
       sequence#, 
       to_char(first_time, 'MM/DD/RR HH24:MI:SS')
  FROM v$log_history
 WHERE thread# = 2
   AND first_time > sysdate - 1
 ORDER BY first_time DESC;

-- -----------------------------------------------------------------------------------
-- SECTION: Archived log information
-- -----------------------------------------------------------------------------------
PROMPT <h3 class="awr">Archived log information</h3>

/*
Archived log generation date, generated size, and file count.
*/
PROMPT <p><a class="awr">Archived log generation</a>

SELECT lpad(to_char(first_time, 'yyyymmdd'), 12) "DATE",
       trunc(sum(blocks * block_size) / 1024 / 1024) "SIZE(MB)",
       count(*)
  FROM v$archived_log
 WHERE first_time > sysdate - 6
   AND creator = 'ARCH'
 GROUP BY lpad(to_char(first_time, 'yyyymmdd'), 12)
 ORDER BY 1;
 
 
-- -----------------------------------------------------------------------------------
-- SECTION: Space usage
-- -----------------------------------------------------------------------------------
PROMPT <h3 class="awr">Space information</h3>

/*
Tablespace name, type, extent management mode, initial extent size, segment space management mode, total size, free size, and free percentage.
*/
PROMPT <p><a class="awr">Tablespace information</a>

CLEAR COLUMNS
COLUMN tablespace_name FOR a20 
COLUMN contents FOR a9 
COLUMN ext_mgmt FOR a12 
COLUMN alloc_type FOR a9 
COLUMN ext_kb FOR 9999999 
COLUMN ssm FOR a10 
COLUMN total_gb FOR 99999999.99 
COLUMN free_gb FOR 99999999.99
COLUMN free_pct FOR 999.99 
SET PAGES 100
SELECT c.tablespace_name,
       c.contents,
       c.extent_management ext_mgmt,
       c.allocation_type alloc_type,
       c.initial_extent / 1024 ext_kb,
       c.segment_space_management SSM,
       nvl(a.total_gb, 0) total_gb,
       decode(c.contents, 'TEMPORARY', nvl(a.total_gb, 0) - c.block_size * b.used_blocks / 1073741824, nvl(b.free_gb, 0)) free_gb,
       100 *  decode(c.contents, 'TEMPORARY', nvl(a.total_gb, 0) - c.block_size * b.used_blocks / 1073741824, nvl(b.free_gb, 0)) / nvl(a.total_gb, 1) free_pct
  FROM (SELECT tablespace_name, nvl(sum(bytes) / 1073741824, 0) total_gb
          FROM dba_data_files
         GROUP BY tablespace_name
         UNION all
        SELECT tablespace_name, nvl(sum(bytes) / 1073741824, 0) total_gb
          FROM dba_temp_files
         GROUP BY tablespace_name) a,
       (SELECT tablespace_name,
               nvl(sum(bytes) / 1073741824, 0) free_gb,
               0 used_blocks
          FROM dba_free_space
         GROUP BY tablespace_name
         UNION all
        SELECT tablespace_name,
               0 free_gb,
               nvl(sum(used_blocks), 0) used_blocks
          FROM gv$sort_segment
         GROUP BY tablespace_name) b,
       dba_tablespaces c
 WHERE c.tablespace_name = b.tablespace_name(+)
   AND c.tablespace_name = a.tablespace_name(+)
 ORDER BY c.contents, free_pct, c.tablespace_name;


PROMPT <p><a class="awr">Tablespaces and data files</a>
/*
Owning tablespace of each file, autoextend flag, current size, and maximum size.
*/
SET LINE 200
SET PAGES 100

COLUMN file_name FOR A65
COLUMN tablespace_name FOR A30
SELECT file_id,
       file_name,
       tablespace_name,
       autoextensible,
       bytes / 1073741824 as current_gb,
       maxbytes / 1073741824 as max_gb
  FROM dba_data_files
 UNION
SELECT file_id,
       file_name,
       tablespace_name as ts_name,
       autoextensible,
       bytes / 1073741824 as cur_gb,
       maxbytes / 1073741824 as max_gb
  FROM dba_temp_files
 ORDER BY tablespace_name, file_id, file_name;

-- -----------------------------------------------------------------------------------
-- SECTION: ASM storage information
-- -----------------------------------------------------------------------------------
PROMPT <h3 class="awr">ASM information</h3>

/*
Disk group number, name, redundancy type, total size, free size, hot-used size, space required for mirroring, space usable for mirrored files, and number of offline disks.
*/
PROMPT <p><a class="awr">ASM disk group information</a>

CLEAR COLUMNS
COLUMN group_number FOR 9999999999
COLUMN name FOR A20
COLUMN type FOR A20
COLUMN total_mb FOR 9999999999
COLUMN free_mb FOR 9999999999
COLUMN hot_used_mb FOR 9999999999
COLUMN required_mirror_free_mb FOR 9999999999
COLUMN usable_file_mb FOR 9999999999
COLUMN offline_disks FOR 9999999999
SELECT group_number,
       name,
       type,
       total_mb,
       free_mb,
       hot_used_mb,
       required_mirror_free_mb,
       usable_file_mb,
       offline_disks 
  FROM v$asm_diskgroup_stat;

/*
Disks belonging to each disk group.
*/
PROMPT <p><a class="awr">ASM disk group and disk details</a>

CLEAR COLUMNS
COLUMN g_number FOR 9999
COLUMN d_number FOR 9999
COLUMN g_name FOR A8
COLUMN d_name FOR A15
COLUMN g_total_mb FOR 99999999
COLUMN d_total_mb FOR 99999999
COLUMN path FOR A30
COLUMN h_status FOR A8
COLUMN g_free_mb FOR 99999999
COLUMN d_free_mb FOR 99999999
SELECT g.group_number g_number,
       g.name g_name,
       g.total_mb g_total_mb,
       g.free_mb g_free_mb,
       g.state g_state,
       d.disk_number d_number,
       d.name d_name,
       d.path,
       d.total_mb d_total_mb,
       d.free_mb d_free_mb,
       d.header_status h_status,
       d.mode_status m_status
  FROM v$asm_diskgroup_stat g, v$asm_disk_stat d
 WHERE g.group_number = d.group_number
 ORDER BY g.group_number, d.disk_number;

-- -----------------------------------------------------------------------------------
-- SECTION: Flash recovery area information
-- -----------------------------------------------------------------------------------
PROMPT <h3 class="awr">Flashback information</h3>

/*
File type, percentage of space used, reclaimable percentage, and number of files.
*/
PROMPT <p><a class="awr">Flash recovery area usage</a>

CLEAR COLUMNS
SET LINE 100
SELECT * 
  FROM v$flash_recovery_area_usage; 

 
-- -----------------------------------------------------------------------------------
-- SECTION: Backup information
-- -----------------------------------------------------------------------------------
PROMPT <h3 class="awr">Database backup information</h3>

/*
Backup session identifier, backup contents, start time, end time, and elapsed time.
*/
PROMPT <p><a class="awr">Backup job details</a>

CLEAR COLUMNS
SET PAGES 1000
COLUMN time_taken_display FOR a10
COLUMN start_time FOR a20
COLUMN end_time FOR a20
COLUMN status FOR a15
COLUMN tt  FOR a10
SELECT session_key,
       start_time,
       end_time,
       status,
       time_taken_display tt
  FROM v$rman_backup_job_details
 WHERE start_time > sysdate -1
 ORDER BY session_key;

-- -----------------------------------------------------------------------------------
-- SECTION: Session information
-- -----------------------------------------------------------------------------------
PROMPT <p><h3 class="awr">Session status</h3>

/*
Identify abnormally long-connected sessions.
*/
PROMPT <p><a class="awr">Longest-connected sessions</a>

SET LINE 500
CLEAR COLUMNS
COLUMN spid for a15
COLUMN sql_id for a20
COLUMN program for a40
COLUMN event for a35
SELECT *
  FROM (SELECT t.sid,
               t.serial#,
               trunc(sysdate - logon_time) AS online_time,
               t.PROGRAM,
               t.status,
               t.LOGON_TIME,
               t.sql_id,
               t.prev_sql_id,
               t.event
          FROM gv$session t
         WHERE t.type <> 'BACKGROUND' AND program is not null
         ORDER BY logon_time)
 WHERE rownum <= 30;

PROMPT <p><a class="awr">Process resource limits</a>
/*
Instance hosting the resource, resource name, current utilization, maximum utilization since startup, and initial allocation.
*/
CLEAR COLUMNS
SET PAGES 100
SET LINE 200
COLUMN resource_name FOR a30
COLUMN current_utilization FOR 9999999
COLUMN max_utilization FOR 9999999
COLUMN initial_allocation FOR a18
COLUMN limit_value FOR a15
SELECT * 
  FROM gv$resource_limit  
 WHERE trim(limit_value) != 'UNLIMITED';

-- -----------------------------------------------------------------------------------
-- SECTION: Wait events
-- -----------------------------------------------------------------------------------
prompt <a class="awr">Top wait (and CPU) events over the last 24 hours</a>
CLEAR COLUMNS
SET PAGES 100
SET LINE 800
COLUMN user_id for a45
COLUMN SQL_ID for a30
COLUMN Event for a45
COLUMN sql_id for a45
COLUMN user_id for a45
COLUMN cnt for a45
COLUMN machine for a45
COLUMN program for a45
SELECT t.*, s.sid, s.serial#, s.machine, s.program, s.osuser
  FROM (SELECT c.USERNAME, a.event, to_char(a.cnt) as seconds, a.sql_id, dbms_lob.substr(b.sql_fulltext, 100, 1) sqltext
          FROM (SELECT rownum rn, t.*
                  FROM (SELECT decode(s.session_state, 'WAITING', s.event, 'Cpu + Wait For Cpu') Event, s.sql_id, s.user_id, count(*) cnt
                          FROM v$active_session_history s
                         WHERE sample_time > sysdate - 1
                         GROUP BY s.user_id, decode(s.session_state, 'WAITING', s.event, 'Cpu + Wait For Cpu'), s.sql_id
                         ORDER BY cnt DESC) t
                 WHERE rownum < 20) a, v$sqlarea b, dba_users c
         WHERE a.sql_id = b.sql_id AND a.user_id = c.user_id
         ORDER BY cnt DESC) t, v$session s
 WHERE t.sql_id = s.sql_id(+);
 
-- -----------------------------------------------------------------------------------
-- SECTION: Memory information
-- -----------------------------------------------------------------------------------
PROMPT <h3 class="awr">ORA-04031 errors</h3>
PROMPT <p><a class="awr">ORA-04031 check (on error, check whether the shared pool reserved area reported allocation failures)</a>
SELECT free_space,
       avg_free_size,
       used_space,
       avg_used_size,
       request_failures,
       last_failure_size 
  FROM v$shared_pool_reserved;

-- -----------------------------------------------------------------------------------
-- SECTION: SQL information
-- -----------------------------------------------------------------------------------
PROMPT <h3 class="awr">Current SQL</h3>
PROMPT <p><a class="awr">Most-executed SQL</a>
SET LINE 300
SELECT *
  FROM (SELECT sql_id, sql_text, s.executions, s.last_load_time, s.first_load_time, s.disk_reads, s.buffer_gets, s.parse_calls
          FROM v$sql s
      ORDER BY s.executions DESC)
 WHERE rownum <= 10;

PROMPT <p><a class="awr">Most-parsed SQL</a>
SET LINE 300
SELECT *
  FROM (SELECT sql_id, sql_text, s.EXECUTIONS, s.LAST_LOAD_TIME, s.FIRST_LOAD_TIME, s.DISK_READS, s.BUFFER_GETS, s.PARSE_CALLS
          FROM v$sql s
      ORDER BY s.PARSE_CALLS DESC)
 WHERE rownum <= 10;

 
-- -----------------------------------------------------------------------------------
-- SECTION: ADG
-- -----------------------------------------------------------------------------------
PROMPT <h3 class="awr">ADG information</h3>

/*
Archive destination name, status, current database mode, and destination path.
*/
PROMPT <p><a class="awr">Archive destination status</a>

column dest_name FOR a20
column status FOR a10
column database_mode FOR a20
column destination FOR a20
SELECT dest_name,
       status,
       database_mode,
       destination  
  FROM v$archive_dest_status 
 WHERE dest_id in ('1','2');

/*
Highest applied log sequence per thread, with its first and next change SCNs.
*/
PROMPT <p><a class="awr">Highest applied log sequence</a>

CLEAR COLUMNS
SET LINE 150
SELECT m.thread#, 
       m.sequence#, 
       first_change#, 
       next_change#
  FROM v$log_history m,
       (SELECT thread#, max(sequence#) as sequence#
          FROM v$log_history
      GROUP BY thread#) t
 WHERE m.thread# = t.thread#
   AND m.sequence# = t.sequence#;
   
/*
Highest archived log sequence per thread.
*/
PROMPT <p><a class="awr">Highest archived log sequence</a>

SET LINE 100
SELECT UNIQUE thread# AS thread, 
       MAX(sequence#) OVER (PARTITION BY thread#) AS last 
  FROM v$archived_log;
/*
PROMPT <p><a class="awr">Archive gap check</a>
Starting and ending sequence numbers of logs missing on the standby.
clear columns
set line 200
SELECT THREAD#,LOW_SEQUENCE#,HIGH_SEQUENCE# FROM V$ARCHIVE_GAP;
*/

PROMPT <br><a class="awr" href="#top">Back to Top</a>
PROMPT <p>End of Report</p>
PROMPT <p></body></html>
SPOOL OFF
exit;

#!/bin/bash
#----------------------------------------------------------------------------------------------------+
#                      HAIZDL@126.COM                                                                |
#----------------------------------------------------------------------------------------------------|
#      COPYRIGHT (C) 1998-2015 HAIZDL ALL RIGHTS RESERVED.                                         |
#----------------------------------------------------------------------------------------------------|
# DATABASE : ORACLE                                                                                  |
# FILE     : healthyCheck1_forLinux.sh                                                                          |
# CLASS    : DATABASE ADMINISTRATION                                                                 |
# PURPOSE  : COLLECTING LOGS FOR ORACLE DATABASE DAILY CHECKING.                                     |
# NOTE     : AS WITH ANY CODE, ENSURE TO TEST THIS SCRIPT IN A DEVELOPMENT                           |
#            ENVIRONMENT BEFORE ATTEMPTING TO RUN IT IN PRODUCTION.                                  |
#----------------------------------------------------------------------------------------------------+

#----------------------------------------------
#CHECK JOB OPERATOR USER INFO.
#----------------------------------------------
function _userVerify(){
    USER=`whoami`
    if [ "$USER" != "root" ]; then
        echo "PLS USE ROOT USER FOR THIS JOB."
        echo "USAGE: $0"
        exit 1
    fi
}

#----------------------------------------------
#BASIC VARIABLE.
#----------------------------------------------
HOST=`hostname`
PWDDIRECTORY=/tmp/ora
DATESTRING=`date '+%Y%m%d'`
LOGDIR=$PWDDIRECTORY/$DATESTRING$HOST
ASMSID=+ASM1

#LOG DIRECTORY PARAMETER
OINSTALLDIRECTORY=/home/u01/app
RDBMSDIRECTORY=$OINSTALLDIRECTORY/oracle/diag/rdbms
GRIDLOGDIRECTORY=$OINSTALLDIRECTORY/11.2.0.4/grid/log
GRIDDIRECTORY=$OINSTALLDIRECTORY/grid

ALERTOFCLUSTER=$GRIDLOGDIRECTORY/$HOST/alert$HOST.log
ALERTOFLISTENER=$GRIDDIRECTORY/diag/tnslsnr/$HOST/listener/trace/listener.log
ALERTOFLISTENERSCAN=$GRIDLOGDIRECTORY/diag/tnslsnr/$HOST/listener_scan1/trace/listener_scan1.log
ALERTOFASM=$GRIDDIRECTORY/diag/asm/+asm/$ASMSID/trace/alert_$ASMSID.log

OPATCHORACLEDIR=$OINSTALLDIRECTORY/oracle/product/11.2.0.4/dbhome_1/OPatch
OPATCHGRIDDIR=$OINSTALLDIRECTORY/11.2.0.4/grid/OPatch

#DIAGNOSTIC FILES
LOGOFOSINFO=$LOGDIR/os/info_os_`date '+%y%m%d%H%M'`
LOGOFCLUSTER=$LOGDIR/cluster/info_cluster_`date '+%y%m%d%H%M'`
DBFILE=$LOGDIR/dbs.tmp

#----------------------------------------------
#CREATE LOG INFO DIRECTORY.
#----------------------------------------------
function _initialization(){
    if [ -d $LOGDIR ]; then
            echo "INIT: LOG DIRECTORY EXISTS, DELETING."
            rm -rf $LOGDIR
    fi
    
    echo "INIT: LOG DIRECTORY CREATING."
    mkdir -p $LOGDIR
    mkdir $LOGDIR/os
    mkdir $LOGDIR/cluster
    mkdir $LOGDIR/db
    chmod -R 777 $LOGDIR
    
    if [ $? -ne 0 ]; then
            echo "INIT: LOG DIRECTORY CREATING ERROR!"
            exit 1
    fi
}

#----------------------------------------------------------------------------------------------------|
#      1. CLUSTER LOG.                                                                               |
#----------------------------------------------------------------------------------------------------|
function _clusterlog(){

    cd $PWDDIRECTORY
    if [ -e $ALERTOFCLUSTER ]; then
            ALERTOFCLUSTERNUM=$(($(awk '{print NR}' $ALERTOFCLUSTER |tail -n1) - $(cat -n $ALERTOFCLUSTER| grep "`date -d last-day +%Y-%m-%d|cut -c 1-10`" | awk 'BEGIN {min=99999999; max=-1;} {if(min>$1) min=$1; if(max<$1) max=$1;} END {printf("%d", min);}') +1))
            if [ $ALERTOFCLUSTERNUM -gt 0 ]; then
                tail -$ALERTOFCLUSTERNUM $ALERTOFCLUSTER >>$LOGDIR/cluster/alert${HOST}_`date '+%y%m%d'`.log
                if [ $? -ne 0 ]; then
                    echo "   JOB-1-1: CLUSTER ALERT LOG COPY ERROR!"
                    exit 1
                fi
            fi
    fi
      
    if [ -e $ALERTOFLISTENER ]; then
            ALERTOFLISTENERNUM=$(($(awk '{print NR}' $ALERTOFLISTENER |tail -n1) - $(cat -n $ALERTOFLISTENER| grep "`date -d last-day "+%d-%^b-%Y"`" | awk 'BEGIN {min=9999999; max=-1;} {if(min>$1) min=$1; if(max<$1) max=$1;} END {printf("%d", min);}') +1))
            if [ $ALERTOFLISTENERNUM -gt 0 ]; then
                tail -$ALERTOFLISTENERNUM $ALERTOFLISTENER >>$LOGDIR/cluster/alert_listener_${HOST}_`date '+%y%m%d'`.log
                if [ $? -ne 0 ]; then
                    echo "   JOB-1-2: CLUSTER ALERT LOG COPY ERROR!"
                    exit 1
                fi
            fi
    fi
    
    if [ -e $ALERTOFLISTENERSCAN ]; then
            ALERTOFLISTENERSCANNUM=$(($(awk '{print NR}' $ALERTOFLISTENERSCAN |tail -n1) - $(cat -n $ALERTOFLISTENERSCAN| grep "`date -d last-day "+%d-%^b-%Y"`" | awk 'BEGIN {min=9999999; max=-1;} {if(min>$1) min=$1; if(max<$1) max=$1;} END {printf("%d", min);}') +1))
            if [ $ALERTOFLISTENERSCANNUM -gt 0 ]; then
                tail -$ALERTOFLISTENERSCANNUM $ALERTOFLISTENERSCAN >>$LOGDIR/cluster/alert_listener_scan1_`date '+%y%m%d'`.log
                if [ $? -ne 0 ]; then
                    echo "   JOB-1-3: CLUSTER ALERT LOG COPY ERROR!"
                    exit 1
                fi
            fi
    fi
    
    if [ -e $ALERTOFASM ]; then
            ALERTOFASMNUM=$(($(awk '{print NR}' $ALERTOFASM |tail -n1) - $(cat -n $ALERTOFASM| grep "`date -d last-day|cut -c 1-8``date -d last-day +%d`" | awk 'BEGIN {min=9999999; max=-1;} {if(min>$1) min=$1; if(max<$1) max=$1;} END {printf("%d", min);}') +1))
            if [ $ALERTOFASMNUM -gt 0 ]; then
                tail -$ALERTOFASMNUM $ALERTOFASM >>$LOGDIR/cluster/alert_${ASMSID}_`date '+%y%m%d'`.log
                if [ $? -ne 0 ]; then
                    echo "   JOB-1-4: CLUSTER ALERT LOG COPY ERROR!"
                    exit 1
                fi
            fi
    fi
    
    echo "                                                               " >>$LOGOFCLUSTER
    echo "                                                               " >>$LOGOFCLUSTER
}
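#----------------------------------------------
# NOTE: The line-offset arithmetic above is hard to audit. An alternative,
# easier-to-read helper (a sketch, assuming GNU awk; the function name is
# illustrative): print every line of file $2 starting from the first line
# that contains string $1, e.g.
#   _tail_since "`date -d last-day +%Y-%m-%d`" $ALERTOFCLUSTER
#----------------------------------------------
function _tail_since(){
    awk -v pat="$1" 'found {print} !found && index($0, pat) {found=1; print}' "$2"
}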

#----------------------------------------------------------------------------------------------------|
#      2. OS INFO.                                                                                   |
#----------------------------------------------------------------------------------------------------|
function _osinfo(){

    echo "---------------------------------------------------------------" >>$LOGOFOSINFO
    echo "- UPTIME:"                                                       >>$LOGOFOSINFO
    echo "---------------------------------------------------------------" >>$LOGOFOSINFO
    uptime                                                                 >>$LOGOFOSINFO
    echo "                                                               " >>$LOGOFOSINFO
    
    echo "---------------------------------------------------------------" >>$LOGOFOSINFO
    echo "- VMSTAT:"                                                       >>$LOGOFOSINFO
    echo "---------------------------------------------------------------" >>$LOGOFOSINFO
    vmstat 1 5                                                             >>$LOGOFOSINFO
    echo "                                                               " >>$LOGOFOSINFO 
    
    echo "---------------------------------------------------------------" >>$LOGOFOSINFO
    echo "- POWERMT:"                                                      >>$LOGOFOSINFO
    echo "---------------------------------------------------------------" >>$LOGOFOSINFO
    powermt display dev=all                                                >>$LOGOFOSINFO
    echo "                                                               " >>$LOGOFOSINFO 
    
    echo "---------------------------------------------------------------" >>$LOGOFOSINFO
    echo "- FREE:"                                                         >>$LOGOFOSINFO
    echo "---------------------------------------------------------------" >>$LOGOFOSINFO
    free -m                                                                >>$LOGOFOSINFO
    echo "                                                               " >>$LOGOFOSINFO 
    
    echo "---------------------------------------------------------------" >>$LOGOFOSINFO
    echo "- DF:"                                                           >>$LOGOFOSINFO
    echo "---------------------------------------------------------------" >>$LOGOFOSINFO
    df -h                                                                  >>$LOGOFOSINFO
    echo "                                                               " >>$LOGOFOSINFO
    
    echo "---------------------------------------------------------------" >>$LOGOFOSINFO
    echo "- SAR-PAGE:"                                                     >>$LOGOFOSINFO
    echo "---------------------------------------------------------------" >>$LOGOFOSINFO
    sar -B 1 5                                                             >>$LOGOFOSINFO
    echo "                                                               " >>$LOGOFOSINFO 
    
    echo "---------------------------------------------------------------" >>$LOGOFOSINFO
    echo "- SAR-CPU:"                                                      >>$LOGOFOSINFO
    echo "---------------------------------------------------------------" >>$LOGOFOSINFO
    sar -u 1 5                                                             >>$LOGOFOSINFO
    echo "                                                               " >>$LOGOFOSINFO 

    echo "---------------------------------------------------------------" >>$LOGOFOSINFO
    echo "- IOSTAT:"                                                       >>$LOGOFOSINFO
    echo "---------------------------------------------------------------" >>$LOGOFOSINFO
    iostat -mx 1 5 | grep power                                            >>$LOGOFOSINFO
    echo "                                                               " >>$LOGOFOSINFO 
    
    echo "---------------------------------------------------------------" >>$LOGOFOSINFO
    echo "- SAR-NETWORK:"                                                  >>$LOGOFOSINFO
    echo "---------------------------------------------------------------" >>$LOGOFOSINFO
    sar -n DEV 1 5 | grep bond                                             >>$LOGOFOSINFO
    echo "                                                               " >>$LOGOFOSINFO 

    if [ $? -ne 0 ]; then
        echo "   JOB-2-1: OS INFO ERROR!"
        exit 1
    fi

    LOGOFSYS=/var/log/messages
    if [ -e $LOGOFSYS ]; then
            cp $LOGOFSYS $LOGDIR/os
            if [ $? -ne 0 ]; then
                echo "   JOB-2-2: OS INFO ERROR!"
                exit 1
            fi
    fi
}

#----------------------------------------------------------------------------------------------------|
#      3. CLUSTER INFO.                                                                              |
#----------------------------------------------------------------------------------------------------|
function _clusterinfo(){

    echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
    echo "- CLUSTER RESOURCE STATUS:"                                     >>$LOGOFCLUSTER
    echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
    echo "                                                               " >>$LOGOFCLUSTER
    su - grid -c "crsctl status res -t "                                   >>$LOGOFCLUSTER
    echo "                                                               " >>$LOGOFCLUSTER

    echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
    echo "- CLUSTER SERVICE STATUS:"                                      >>$LOGOFCLUSTER
    echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
    echo "                                                               " >>$LOGOFCLUSTER
    su - grid -c "crsctl check cluster -all"                               >>$LOGOFCLUSTER
    echo "                                                               " >>$LOGOFCLUSTER

    echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
    echo "- CLUSTER LISTENER STATUS:"                                     >>$LOGOFCLUSTER
    echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
    echo "                                                               " >>$LOGOFCLUSTER
    echo "#1 LOCAL LISTENER                                              ">>$LOGOFCLUSTER
    su - grid -c "lsnrctl status"                                          >>$LOGOFCLUSTER
    echo "#2 SCAN LISTENER                                               ">>$LOGOFCLUSTER
    su - grid -c "lsnrctl status LISTENER_SCAN1"                           >>$LOGOFCLUSTER
    echo "                                                               " >>$LOGOFCLUSTER
    
    echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
    echo "- DISK GROUP INFO AND STATUS                                   ">>$LOGOFCLUSTER
    echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
    echo "- #1                                                            ">>$LOGOFCLUSTER
    su - grid -c "asmcmd lsdg"                                             >>$LOGOFCLUSTER

    echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
    echo "- OCR INFO                                                     ">>$LOGOFCLUSTER
    echo "---------------------------------------------------------------" >>$LOGOFCLUSTER
    echo "- #1 VOTING DISKS                                              ">>$LOGOFCLUSTER
    su - grid -c "crsctl query css votedisk"                               >>$LOGOFCLUSTER
    echo "- #2 CONSISTENCY CHECK                                         " >>$LOGOFCLUSTER
    su - grid -c "ocrcheck"                                                >>$LOGOFCLUSTER
    echo "                                                               " >>$LOGOFCLUSTER
}

#----------------------------------------------------------------------------------------------------|
#      4. DB INFO.                                                                                   |
#----------------------------------------------------------------------------------------------------|
function _dbinfo(){
cd $PWDDIRECTORY
su - oracle -c "cd $LOGDIR/db;export ORACLE_SID=$1;sqlplus -s /nolog <<EOF
conn / as sysdba;
@$LOGDIR/../dailyhealthycheck_withadg.sql
exit;
EOF"
}

#----------------------------------------------------------------------------------------------------|
#      5. DB LOG.                                                                                    |
#----------------------------------------------------------------------------------------------------|
function _dblog(){
    cd $PWDDIRECTORY
    DBNAME2=$1
    DBNAME2=${DBNAME2%?}
    ALERTOFDB=$RDBMSDIRECTORY/$DBNAME2/$1/trace/alert_$1.log
    if [ -e $ALERTOFDB ]; then
        ALERTOFDBNUM=$(($(awk '{print NR}' $ALERTOFDB |tail -n1) - $(cat -n $ALERTOFDB| grep "`date -d last-day|cut -c 1-8``date -d last-day +%d`" | awk 'BEGIN {min=9999999; max=-1;} {if(min>$1) min=$1; if(max<$1) max=$1;} END {printf("%d", min);}') +1))
        if [ $ALERTOFDBNUM -gt 0 ]; then
            tail -$ALERTOFDBNUM $ALERTOFDB >>$LOGDIR/db/alert_$1_`date '+%y%m%d_%H:%M:%S'`.log

            if [ $? -ne 0 ]; then
                echo "   JOB-5: DB LOG ERROR!"
                exit 1
            fi
        fi
    fi

}

#----------------------------------------------------------------------------------------------------|
#      6. RMAN INFO.                                                                                 |
#----------------------------------------------------------------------------------------------------|
function _rmaninfo(){
cd $PWDDIRECTORY
LOGOFRMAN=$LOGDIR/rman_$1-`date '+%y%m%d_%H:%M:%S'`.log
su - oracle -c "export ORACLE_SID=$1;rman target / <<EOF >$LOGOFRMAN
show all;
list backup of database summary;
exit;
EOF"

    if [ $? -ne 0 ]; then
        echo "JOB-6: RMAN INFO ERROR!"
        exit 1
    fi
}

#----------------------------------------------------------------------------------------------------|
#      7. AWR.                                                                                       |
#----------------------------------------------------------------------------------------------------|
function _awrinfo(){
cd $PWDDIRECTORY
su - oracle -c "cd $LOGDIR/db;mkdir ./$1AWR;cd $1AWR;export ORACLE_SID=$1;sqlplus -silent /nolog <<EOF
conn /as sysdba
@$LOGDIR/../makeawr.sql;
exit;
EOF"

sed -i "s/PL\\/SQL procedure successfully completed.//g" $LOGDIR/db/$1AWR/awrsql.sql

su - oracle -c "cd $LOGDIR/db/$1AWR;export ORACLE_SID=$1;sqlplus -silent /nolog <<EOF
conn /as sysdba
set echo off 
@awrsql.sql;
exit;
EOF"

    if [ $? -ne 0 ]; then
        echo "JOB-7: AWR ERROR!"
        exit 1
    fi
}

#----------------------------------------------------------------------------------------------------|
#      8. TAR.                                                                                       |
#----------------------------------------------------------------------------------------------------|
function _logtar(){
cd $PWDDIRECTORY

ls $LOGDIR/db/*.html > $LOGDIR/html.tmp 
cat $LOGDIR/html.tmp | while read FNAME
do
    sed -i 's/GBK/UTF8/g' $FNAME
done
rm -rf $LOGDIR/html.tmp

rm -f *$HOST.tar.gz 
    if [ -d $DATESTRING$HOST ]; then
        tar -czvf $DATESTRING-$HOST.tar.gz ./$DATESTRING$HOST
        rm -rf ./$DATESTRING$HOST
    else
        echo "   JOB-8: TAR ERROR."
    fi
}

#----------------------------------------------------------------------------------------------------|
#      MAIN().                                                                                       |
#----------------------------------------------------------------------------------------------------|
echo "========================================================="
echo "= JOB-NAME: $0"
echo "========================================================="
_userVerify
_initialization

echo "----------------------------------------------------"
echo "- JOB-1: CLUSTER LOG."
echo "----------------------------------------------------"
_clusterlog

echo "----------------------------------------------------"
echo "- JOB-2: OS INFO."
echo "----------------------------------------------------"
_osinfo

echo "----------------------------------------------------"
echo "- JOB-3: CLUSTER INFO."
echo "----------------------------------------------------"
_clusterinfo

echo "----------------------------------------------------"
echo "- JOB-(4-7): DB JOBs."
echo "----------------------------------------------------"
ps -ef | grep ora_pmon | grep -v grep | awk '{print $8}' | awk -F "_" '{print $3}' >$DBFILE
if [ -e $DBFILE ]; then
    cat $DBFILE | while read DBSID
       do
            echo "- # JOB-4: DB INFO."
            _dbinfo $DBSID
            echo "- # JOB-5: DB LOG."
            _dblog $DBSID
            echo "- # JOB-6: RMAN INFO."
            _rmaninfo $DBSID
            echo "- # JOB-7: AWR."
            _awrinfo $DBSID
       done
else
    echo "   JOB-4-7: NO DBSID FILES ERROR."
fi
rm -rf $DBFILE
   
echo "----------------------------------------------------"
echo "- JOB-8: LOG TAR."
echo "----------------------------------------------------"
_logtar
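
Once the script has been validated in a test environment, it can be scheduled with cron and the resulting tarball pulled back to a central server for analysis. A minimal sketch — the script path, schedule, central host name, and target directory are all assumptions:

```shell
# Root crontab entry: run the daily check at 06:30 (script path is an assumption).
30 6 * * * /tmp/ora/dailycheck.sh >> /tmp/ora/dailycheck_cron.log 2>&1

# Afterwards, collect the tarball to the central analysis server
# (host and directory are illustrative):
scp /tmp/ora/*-`hostname`.tar.gz dba@central-host:/data/checklogs/
```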
