Tuesday, January 15, 2013

Update your stale statistics

Object statistics are very important for your data warehouse. Tables and indexes must have updated statistics in order for the query optimizer to choose the best execution plan for each query. So you must continously gather objects statistics (especially after ETL jobs have modified significantly the data).

The easiest way to do this is to utilize Oracle's automatic statistics gathering task. This can be easily configured to run at a specific maintenance window (that you will define, unless you want to use the default) from Enterprise Manager (Administration --> Automated Maintenance Tasks --> Optimizer Statistics Gathering).

Note that you should use an appropriate degree of parallelism so that the stats gathering process runs fast enough to fit in the maintenance window. You may have to experiment a bit with this, in order to find what is appropriate for your case.

But how can you monitor if the automated task has succesfully managed to gather statistics in the defined maintenance window?


With the following script we can monitor the executions of the Oracle automated task:


nkarag@DWHPRD> host cat stats_check.sql
set linesize 300
col JOB_DURATION format a13
col JOB_ERROR format 999
col JOB_INFO format a25 wrap

select JOB_NAME, JOB_STATUS, JOB_START_TIME, JOB_DURATION, JOB_ERROR, JOB_INFO

from DBA_AUTOTASK_JOB_HISTORY
where client_name='auto optimizer stats collection'
order by window_start_time desc;


For example:

nkarag@DWHPRD> @stats_check


JOB_NAME JOB_STATUS JOB_START_TIME JOB_DURATION JOB_ERROR JOB_INFO
ORA$AT_OS_OPT_SY_1685 STOPPED 14/1/2013 4:00:01.769209 PM +02:00 +00 07:59:59.000000 0   REASON="Stop job called because associated window was closed"
ORA$AT_OS_OPT_SY_1665 SUCCEEDED 13/1/2013 4:00:08.762606 PM +02:00 +00 04:40:15.000000 0
ORA$AT_OS_OPT_SY_1607 SUCCEEDED 12/1/2013 4:00:02.320778 PM +02:00 +00 04:49:59.000000 0

In the previous example, we can see 3 executions of the task: two succesfull ones and one that was stopped. The stopped means that the 8-hour window (check the duration column) available for statistics gathering was not enough.

So, we know that some objects will have stale statistics. In such a case we can manually collect statistics for the stale statistics objects. But how can we know which are the objects with the stale statistics?

With the following script we can spool out a set of dbms_stats.gather_table_stats commands just for the objects with stale statistics. The scipt prompts us for a parallel degree for the statistics collection process. Here it is:



nkarag@DWHPRD> host cat stats_gath_gen.sql
rem -----------------------------------------------------------------------------------------
rem     stats_gath_gen.sql      Spools into file "gather_stats.spool", gather table stats commands rem     for all tables with stale stats.
rem -----------------------------------------------------------------------------------------
rem set serveroutput on

DECLARE
ObjList dbms_stats.ObjectTab;
l_cmd   varchar2(4000);
BEGIN
execute immediate 'drop table nkarag.stale_stats';
execute immediate 'create table nkarag.stale_stats(gendate date, cmd varchar2(4000))';
dbms_stats.gather_database_stats(objlist=>ObjList, options=>'LIST STALE');
FOR i in ObjList.FIRST..ObjList.LAST
LOOP
        if (ObjList(i).ObjType = 'TABLE') then
                l_cmd := 'exec dbms_stats.gather_table_stats(ownname=>'''||ObjList(i).ownname ||    
                ''', tabname=>''' || ObjList(i).ObjName || ''')';
                insert into nkarag.stale_stats values(sysdate, l_cmd);
    end if;
--dbms_output.put_line('exec dbms_stats.gather_table_stats(ownname=>'''||ObjList(i).ownname || ''', tabname=>''' || ObjList(i).ObjName || ''')');
END LOOP;
commit;
END;
/


accept degree prompt "Enter value for degree:"
set heading off
set feedback off
set verify off
set termout off
spool gather_stats.spool

select distinct replace(cmd, ')', ',degree => &degree )') from nkarag.stale_stats order by 1
/

spool off
set heading on
set feedback on
set verify on
set termout on


Of course, you can modify the script to fit your own needs. Here is a small extract from the spoolled file:


exec dbms_stats.gather_table_stats(ownname=>'REPORT_DW', tabname=>'CR1550_AGG',degree => 64 )
exec dbms_stats.gather_table_stats(ownname=>'REPORT_DW', tabname=>'CR1551_AGG',degree => 64 )
exec dbms_stats.gather_table_stats(ownname=>'REPORT_DW', tabname=>'CR888_OTE_BDP_AGG',degree => 64 )
...


Enjoy!

Cheers,
OL

No comments:

Post a Comment