Friday, January 11, 2008

Oracle DBMS_STATS ???

What the hell is that? I had never been interested in it. I was wrong.

When I created tables or indexes, I used to think that was all there was to it.

"Of Course not". When I have used EM to help creating. I'll know Oracle optimize statistics on tables or indexes after create.
Why?

Optimizer statistics are a collection of data that describe more details about the database and the objects in the database. These statistics are used by the query optimizer to choose the best execution plan for each SQL statement.

The optimizer statistics are stored in the data dictionary.

Anyway, I should understand the Cost Based Optimizer (CBO) and database statistics.

Whenever a valid SQL statement is processed, Oracle has to decide how to retrieve the necessary data. This decision can be made using one of two methods:

- Rule Based Optimizer (RBO) - This method is used if the server has no internal statistics relating to the objects referenced by the statement. This method is no longer favoured by Oracle and will be desupported in future releases.

- Cost Based Optimizer (CBO) - This method is used if internal statistics are present. The CBO checks several possible execution plans and selects the one with the lowest cost, where cost relates to system resources.
If new objects are created, or the amount of data in the database changes, the statistics will no longer represent the real state of the database, so the CBO decision process may be seriously impaired.
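
As a quick check, the data dictionary shows whether statistics exist and when they were last gathered. A minimal sketch, using the TEST schema from the examples below:
SQL> SELECT table_name, num_rows, last_analyzed FROM dba_tables WHERE owner = 'TEST';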

There are several mechanisms for gathering statistics:
Analyze Statement
The ANALYZE statement can be used to gather statistics for a specific table, index or cluster. The statistics can be computed exactly, or estimated based on a specific number of rows, or a percentage of rows:
SQL> ANALYZE TABLE emp COMPUTE STATISTICS;
SQL> ANALYZE INDEX emp_pk COMPUTE STATISTICS;
SQL> ANALYZE TABLE emp ESTIMATE STATISTICS SAMPLE 1000 ROWS;
SQL> ANALYZE TABLE emp ESTIMATE STATISTICS SAMPLE 10 PERCENT;
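
The gathered figures end up in the dictionary views, so they can be checked afterwards. A short sketch for the same emp table and emp_pk index:
SQL> SELECT num_rows, blocks, empty_blocks, avg_row_len FROM user_tables WHERE table_name = 'EMP';
SQL> SELECT blevel, leaf_blocks, distinct_keys, clustering_factor FROM user_indexes WHERE index_name = 'EMP_PK';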

DBMS_UTILITY
The DBMS_UTILITY package can be used to gather statistics for a whole schema or database. Both methods follow the same format as the analyze statement:
SQL> EXEC DBMS_UTILITY.analyze_schema('TEST','COMPUTE');
SQL> EXEC DBMS_UTILITY.analyze_schema('TEST','ESTIMATE', estimate_rows => 1000);
SQL> EXEC DBMS_UTILITY.analyze_schema('TEST','ESTIMATE', estimate_percent => 10);
SQL> EXEC DBMS_UTILITY.analyze_database('COMPUTE');
SQL> EXEC DBMS_UTILITY.analyze_database('ESTIMATE', estimate_rows => 1000);
SQL> EXEC DBMS_UTILITY.analyze_database('ESTIMATE', estimate_percent => 10);

DBMS_STATS
The DBMS_STATS package was introduced in Oracle 8i and is Oracle's preferred method of gathering object statistics. Oracle lists a number of benefits to using it, including parallel execution, long-term storage of statistics and transfer of statistics between servers. Once again, it follows a similar format to the other methods:
SQL> EXEC DBMS_STATS.gather_database_stats;
SQL> EXEC DBMS_STATS.gather_database_stats(estimate_percent => 10);
SQL> EXEC DBMS_STATS.gather_schema_stats('TEST');
SQL> EXEC DBMS_STATS.gather_schema_stats('SCOTT', estimate_percent => 10);
SQL> EXEC DBMS_STATS.gather_table_stats('TEST', 'EMP');
SQL> EXEC DBMS_STATS.gather_table_stats('TEST', 'EMP', estimate_percent => 10);
SQL> EXEC DBMS_STATS.gather_index_stats('TEST', 'EMP_PK');
SQL> EXEC DBMS_STATS.gather_index_stats('TEST', 'EMP_PK', estimate_percent => 10);
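
DBMS_STATS also takes a few other commonly used parameters, such as cascade (gather the table's index statistics in the same call) and method_opt (control histogram collection). A hedged sketch, reusing the TEST.EMP table from the examples above:
BEGIN
  DBMS_STATS.gather_table_stats(
    ownname          => 'TEST',
    tabname          => 'EMP',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,  -- let Oracle choose the sample size
    method_opt       => 'FOR ALL COLUMNS SIZE AUTO',  -- histograms where Oracle thinks they help
    cascade          => TRUE);                        -- gather index stats as well
END;
/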

This package also gives you the ability to delete statistics:
SQL> EXEC DBMS_STATS.delete_database_stats;
SQL> EXEC DBMS_STATS.delete_schema_stats('TEST');
SQL> EXEC DBMS_STATS.delete_table_stats('TEST', 'EMP');
SQL> EXEC DBMS_STATS.delete_index_stats('TEST', 'EMP_PK');

Scheduling Stats
Scheduling the gathering of statistics using DBMS_JOB is the easiest way to make sure they are always up to date:
SET SERVEROUTPUT ON
DECLARE
  l_job NUMBER;
BEGIN
  DBMS_JOB.submit(l_job,
                  'BEGIN DBMS_STATS.gather_schema_stats(''TEST''); END;',
                  SYSDATE,
                  'SYSDATE + 1');
  COMMIT;
  DBMS_OUTPUT.put_line('Job: ' || l_job);
END;
/

The above code sets up a job to gather statistics for the TEST schema at the current time every day. You can list the current jobs on the server using the DBA_JOBS and DBA_JOBS_RUNNING views.

Existing jobs can be removed using:
EXEC DBMS_JOB.remove(X);
COMMIT;

Where 'X' is the number of the job to be removed.
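
To find the job number, DBA_JOBS can be queried, for example:
SQL> SELECT job, what, next_date, interval FROM dba_jobs;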

Transferring Stats

It is possible to transfer statistics between servers allowing consistent execution plans between servers with varying amounts of data. First the statistics must be collected into a statistics table. In the following examples the statistics for the APPSCHEMA user are collected into a new table, STATS_TABLE, which is owned by DBASCHEMA:
SQL> EXEC DBMS_STATS.create_stat_table('DBASCHEMA','STATS_TABLE');
SQL> EXEC DBMS_STATS.export_schema_stats('APPSCHEMA','STATS_TABLE',NULL,'DBASCHEMA');

This table can then be transferred to another server using your preferred method (Export/Import, SQL*Plus COPY, etc.) and the stats imported into the data dictionary as follows:
SQL> EXEC DBMS_STATS.import_schema_stats('APPSCHEMA','STATS_TABLE',NULL,'DBASCHEMA');
SQL> EXEC DBMS_STATS.drop_stat_table('DBASCHEMA','STATS_TABLE');
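
The third argument (NULL above) is the statid, a label that lets several sets of statistics live in the same statistics table. A sketch using a made-up identifier 'PROD_STATS':
SQL> EXEC DBMS_STATS.export_schema_stats('APPSCHEMA','STATS_TABLE','PROD_STATS','DBASCHEMA');
SQL> EXEC DBMS_STATS.import_schema_stats('APPSCHEMA','STATS_TABLE','PROD_STATS','DBASCHEMA');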

So which mechanism should we use to analyze?
We should "use DBMS_STATS and ignore DBMS_UTILITY".


If you are on 10g:
With Oracle Database 10g, there are some new options available for the DBMS_STATS package subprograms, in particular for the granularity and degree parameters.

granularity
This parameter is used in subprograms such as gather_table_stats and gather_schema_stats. It indicates the granularity of the statistics that you want to collect, particularly for partitioned tables. As an example, you can gather only the global statistics on a partitioned table, or you can gather global and partition-level statistics. Two notable values are AUTO and GLOBAL AND PARTITION (see the sketch after this list).
- When the AUTO option is specified, the procedure determines the granularity based on the partitioning type. Oracle collects global, partition-level, and sub-partition level statistics if the sub-partitioning method is LIST. For other partitioned tables, only the global and partition level statistics are generated.
- When the GLOBAL AND PARTITION option is specified, Oracle gathers the global and partition level statistics. No sub-partition level statistics are gathered even if it is a composite partitioned object.
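
A minimal sketch of the granularity parameter, assuming a partitioned table called SALES in the TEST schema (the table name is made up for this example):
BEGIN
  -- Gather global and partition-level statistics only; no sub-partition statistics.
  DBMS_STATS.gather_table_stats(
    ownname     => 'TEST',
    tabname     => 'SALES',
    granularity => 'GLOBAL AND PARTITION');
END;
/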

degree
With this parameter, you are able to specify the degree of parallelism. In general, the degree parameter allows you to parallelize the statistics gathering process. The degree parameter can take the value of auto_degree.

When you specify auto_degree, Oracle will determine the degree of parallelism automatically. It will be either 1 (serial execution) or default_degree (the system default value based on the number of CPUs and initialization parameters), according to the size of the object. Take care if Hyper-Threading is used, as you will have less computational power than Oracle assumes.
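
A sketch of letting Oracle choose the degree of parallelism with the AUTO_DEGREE constant:
BEGIN
  -- Oracle decides between serial and parallel gathering based on the object size.
  DBMS_STATS.gather_schema_stats(
    ownname => 'TEST',
    degree  => DBMS_STATS.AUTO_DEGREE);
END;
/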

10g DML Table Monitoring Changes

With Oracle Database 10g, the statistics_level initialization parameter functions as a global option for the table monitoring mechanism. This mechanism overrides the table level MONITORING clause. In other words, the [NO] MONITORING clauses are now obsolete. The statistics_level parameter was available in 9i.

If the statistics_level parameter is set to BASIC, the monitoring feature is disabled. When it is set to TYPICAL (which is the default setting) or ALL, then the global table monitoring is enabled.
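
To check the setting and see what the monitoring mechanism has recorded, something like the following can be used (a sketch; the flush call pushes the in-memory DML counters into DBA_TAB_MODIFICATIONS, which is otherwise refreshed periodically):
SQL> SHOW PARAMETER statistics_level
SQL> EXEC DBMS_STATS.flush_database_monitoring_info;
SQL> SELECT table_owner, table_name, inserts, updates, deletes FROM dba_tab_modifications WHERE table_owner = 'TEST';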

Note:
Do not use the COMPUTE and ESTIMATE clauses of the ANALYZE statement to collect optimizer statistics. These clauses are supported solely for backward compatibility and may be removed in a future release. The DBMS_STATS package collects a broader, more accurate set of statistics, and gathers statistics more efficiently.

You may continue to use the ANALYZE statement for other purposes not related to optimizer statistics collection (examples follow the list):
- To use the VALIDATE or LIST CHAINED ROWS clauses
- To collect information on free list blocks
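
For example (a sketch; LIST CHAINED ROWS expects the CHAINED_ROWS table, normally created by the utlchain.sql script):
SQL> ANALYZE TABLE emp VALIDATE STRUCTURE CASCADE;
SQL> ANALYZE TABLE emp LIST CHAINED ROWS INTO chained_rows;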

...

2 comments:

Anonymous said...

Hi,
Sir, it really helped me a lot in understanding statistics. Good work.

Surachart Opun said...

DBMS_STATS and ANALYZE

If you use DBMS_STATS, it will not collect EMPTY_BLOCKS.

EMPTY_BLOCKS: among the allocated blocks, the blocks that were never used.

For that, you still need to use ANALYZE.