【性能】解讀10053文件
About Oracle Case Studies
Oracle Case Studies are intended as learning tools and for sharing information or knowledge related to a complex event, process, procedure, or to a series of related events. Each case study is written based upon the experience that the writer/s encountered.
Each Case Study contains a skill level rating. The rating provides an indication of what skill level the reader should have as it relates to the information in the case study.
Ratings are:
- Expert: significant experience with the subject matter.
- Intermediate: some experience with the subject matter.
- Beginner: little experience with the subject matter.
Case Study Abstract
This case study uses a tracefile from an actual service request to illustrate a methodology for analyzing 10053 trace files. Please note that event 10053 trace files were designed to assist Oracle developers and support personnel to diagnose optimizer problems and are subject to change with every new patchset or release. Our intent with this case study is not to provide a comprehensive reference to the 10053 trace file, but rather to show how this trace file is used by Oracle engineers. Along the way, we will also gain insight into how the Cost Based Optimizer (CBO) arrives at the cost for a query and, ultimately, how it arrives at the execution plan. It is also important to point out that the algorithms used by the CBO to estimate the cost of a query may change from version to version as well.
Our task here is to analyze the bad plan and determine how the CBO computed the costs that led to the cheaper, but flawed plan. We will compare both 10053 traces at some points, but mainly focus on how the costs were computed for the bad plan. The good plan is basically a shorter version of the bad plan (one in which no indexes were considered).
The reason for examining a 10053 is typically to understand why the CBO made its decisions. The 10053 will help answer the question "Why was my index not used?" or the related, "Why did the CBO choose an Full Table Scan (FTS)?". The 10053 is typically not the best place to start looking for tuning opportunities - the execution plan and TKProf have better information in this regard; the 10053 is used for deeper cause determination.
Case History
The situation is that an unhinted SQL statement (SELECT involving a 3-way join) will take over 9 hours to execute versus a statement with a "NO_INDEX" hint that will complete in less than 4 minutes. The tables are partitioned and parallel query is being used. In addition, the customer has set "OPTIMIZER_INDEX_CACHING" to 70 (for unknown reasons, but we can bet it is probably because they weren't getting plans they liked). This parameter has the effect of decreasing the cost of single-block index I/O by 70 percent.
10053 traces were obtained for both the unhinted ("bad") plan and the hinted ("good") plan. The main difference between these two plans is that the bad plan uses all nested loop joins, with the inner-most join having an INDEX FULL SCAN (not index fast full scan) operation as the inner rowsource. The good plan uses a hash join with an INDEX FAST FULL SCAN (IFF) operation as the inner rowsource
The original trace file for the bad plan is
here
. Please note that this case study is very sensitive to changes that may occur to the 10053 trace file in the future. Generally, later versions of the trace file should be easier to read and require less effort in the "Pre-Analysis Work" section.
Pre-Analysis Work
Before jumping to the analysis of the trace files, we must make some observations and derive some factors that the CBO will use to compute its costs. Sometimes, the value of some of these parameters and factors will provide a good initial indication about why a particular plan was chosen over competing plans.
To collect the event 10053 trace file, the following syntax was used in SQLPlus:
SQL> connect / as sysdba
SQL> oradebug setmypid
SQL> oradebug unlimit
SQL> oradebug event 10053 trace name context forever, level 1
SQL> ...enter your query here...
SQL> oradebug event 10053 trace name context off
SQL> oradebug tracefile_name
/chia/web/admin/PTAV3/udump/ptav3_ora_15365.trc |
The output of "oradebug tracefile_name" will point to the 10053 trace file.
A) Verify the query being traced
This is an important step because we want to be sure we are examining the trace for the associated query. Find the SQL within the section of the trace file called "
QUERY
" and make sure it is the correct one. In version 10g, the QUERY section is at the END of the trace when no bind variables are present in the query, otherwise, it will be at the beginning of the trace. Be careful to note which trace is associated with the QUERY section being considered. Sometimes one can easily mistake a QUERY at the end of a trace as belonging to a trace that follows (likely to happen in 10g with no binds in the SQL).
B)
Parameters
of interest:
OPTIMIZER_FEATURES_ENABLE = 9.2.0
_OPTIMIZER_PERCENT_PARALLEL = 101
OPTIMIZER_INDEX_CACHING = 70
This parameter will affect the cost of an index access operation by multiplying the original cost of using the index by (100 - optimizer_index_caching)/100. So, in this case, it will reduce index read costs using this factor: (100 - 70)/100 = 0.3 or about 1/3. This means that index costs will be multiplied by 0.3 making the cost 1/3 of what they would otherwise be. Please note that index I/O costs are computed from the values of "BLEVEL", "LEAF_BLOCKS", and "CLUF" (clustering factor); this parameter only affects the portion of the costs contributed by BLEVEL and LEAF_BLOCKS. CLUF affects the cost of accessing the table and is not discounted by OPTIMIZER_INDEX_CACHING.
OPTIMIZER_INDEX_COST_ADJ = 99
This parameter will be used to scale the cost of index access operation by the fraction:
optmizer_index_cost_adj / 100. In this case, it will be 99/100 or 0.99. This parameter will affect all index costs, even those used in joins.
OPTIMIZER_DYNAMIC_SAMPLING = 1
This parameter controls how aggressively the CBO relies on dynamic sampling to give it information about the cardinalities and selectivities needed to cost access paths. At a setting of 1, it will basically only sample if statistics are missing from tables in the query.
_OPTIMIZER_COST_MODEL = CHOOSE
If set to CHOOSE, the CBO will use the new CPU model if system statistics were collected. If set to I/O, it will use the old cost model that ignores CPU cost.
DB_FILE_MULTIBLOCK_READ_COUNT = 64
This parameter may control the costs of performing full table or index scans. Higher values of this parameter will cause the CBO to cost full table or index scans cheaper. The value is scaled down by the CBO using either a fixed formula (when OPTIMIZER_COST_MODEL = io) or scaling factors computed from actual system statistics that were gathered.
_CPU_TO_IO = 0 (default)
This parameter is used to scale CPU cycles to an I/O cost for the purpose of calculating an overall cost using CPU and I/O cost components. If set to 0, the default value, the CBO will use either an internal fixed value, or a value derived from data collected with the system statistics (which includes CPU speed, single block I/O time, multiblock I/O time, and average number of blocks read during multiblock I/O). In order to verify costs when the CBO is taking CPU costs into account, it is important to determine what the CBO is using for this value.
C) Calculate the CPU to I/O ratio used by the CBO
To determine the value for _CPU_TO_IO being used by the CBO, one must find an entry in the 10053 that shows each of the values: CPU cost, I/O cost, combined cost. We can compute the factor used internally by the CBO by knowing these three values and then applying them to the formula that relates them.
The value of _CPU_TO_IO stays constant throughout the 10053 and is the same for any calculation, so any entry that shows all three components is usable for finding the value. The important thing to remember is to pick an entry with large values for CPU and I/O costs; the higher the values, the more accurate will be the derived factor.
- Find an index fast full (IFF) scan or something where CPU, IO and total are shown.
Access path: index (iff)
Index: PK_CIPBF_IX
TABLE: CERT_INSURED_PLAN_BENEFIT_FACT
RSC_CPU: 2865229980 RSC_IO: 52693
IX_SEL: 0.0000e+00 TB_SEL: 1.0000e+00
Access path: iff Resc: 55630 Resp: 27815 |
Combined Resc Cost = (RSC_CPU cost / _cpu_to_io) + RSC_IO Cost
Solve for _cpu_to_io:
_cpu_to_io = RSC_CPU Cost / (Combined Cost - RSC_IO Cost)
= 2865229980 / (55630 - 52693)
=
975563.49 CPU cycles per IO
D) Calculate the multiblock read divisor
When the CBO estimates the cost for a full table scan or an index fast full scan, it will divide the total number of blocks in a table or index by a divisor that is an estimate of how many blocks will be read for each physical read from the disk (called here the
multiblock read divisor or MBDivisor
). In the past, the parameter "DB_FILE_MULTIBLOCK_READ_COUNT was used as a basis for estimating the value of the MBDivisor (its value was reduced by a formula to compensate for real-world limitations). In version 9.2 and later, the value of the MBDivisor is estimated differently if system statistics are collected.
Before analyzing the 10053, it is useful to derive the divisor value that the CBO is using so we can quickly see if the value is realistic. Very low values will cause the CBO to cost FTS and IFF more expensive than index access paths; high values will tend to cost FTS / IFF much cheaper.
To derive the divisor, find a single table access path entry and obtain the resource cost ("Resc") for a table scan ("tsc"). Then find the total number of blocks in the table. Calculate the factor as follows:
tsc cost = Num Blocks / MBDivisor
Solve for the MBdivisor: MBdivisor = Num Blocks / tsc cost
For example:
From the "Base Statistical Information" section:
Table stats Table: CERTIFICATE Alias: A12
PARTITION [95] CDN: 3164453 NBLKS: 125272 AVG_ROW_LEN: 617
TOTAL :: CDN: 3164453 NBLKS: 125272 AVG_ROW_LEN: 617 |
From the "Single Table Access Path" section:
SINGLE TABLE ACCESS PATH
...
TABLE: CERTIFICATE ORIG CDN: 3164453 ROUNDED CDN: 2089 CMPTD CDN: 2089
Access path: tsc Resc: 116982 Resp: 29052 |
Mdivisor = Nblks / tsc
Mdivisor = 125272 / 116982 = 1.07
Note: This factor seems very low. It will be very helpful to see what the system stats are and how this value was calculated. The low value will make FTS and IFF scans expensive relative to index scans. Its possible that these values are realistic, but it is also possible the system stats were obtained during a time that was not representative of actual load. Considering how expensive FTS will be, it is interesting that the customer felt they had to bring down index costs further using the index_cost_adjustment parameter. Its easy to see that this database will prefer almost any kind of index access over FTS in many cases.
It seems unusual that the customer set OPTIMIZER_INDEX_COST_ADJ so high when the multiblock divisor was already low and probably discouraged the CBO to choose non-index paths anyway. It would be helpful to know the historical reasons why the customer set that value.
E) Scan the section called "
BASE STATISTICAL INFORMATION
" and "
SINGLE TABLE ACCESS PATH
"
sections looking for missing or insufficient statistics
Typical problems include:
- Missing statistics on tables and indexes
If stats have not been gathered for an object, you will see messages like this:
"(NOT ANALYZED)".
Unfortunately, for indexes there is no message explicitly stating that they were not analyzed. Instead, you will have to read the stats for the index and look for the default stats. The default stats for LEAF_BLOCKS is 25 and for CLUSTERING_FACTOR is 800.
For partitioned objects, determine whether global only statistics or partition level statistics have been gathered.
To detect
global only
(no partition level statistics gathered), look for UNANALYZED partitions. Example (not from the current trace):
Table stats Table: SALES Alias: SALES
(Using composite stats)
(making adjustments for partition skews)
ORIGINAL VALUES:: CDN: 919315 NBLKS: 1768 AVG_ROW_LEN: 29
PARTITIONS::
PRUNED: 5
ANALYZED: 0 UNANALYZED: 5
TOTAL :: CDN: 919315 NBLKS: 94 AVG_ROW_LEN: 29 |
Unfortunately, there is no way to tell if the composite stats were obtained by a global sampling or aggregating individual partitions. Global sampling is preferred, so if in doubt, check if DBA_TABLES.GLOBAL_STATS equals YES to confirm that global statistics were gathered.
Look for statements like "No Histogram" for particular columns. Example (not from the current trace):
SINGLE TABLE ACCESS PATH
COLUMN: TIME_ID(DATE) Col#: 3 Table: SALES Alias: SALES
Size: 8 NDV: 1187 Nulls: 0 Density: 8.4246e-04 Min: 2450815 Max: 2452275
No Histogram: #BKT: 1
(1 uncompressed buckets and 2 endpoint values) |
Analysis
Summary
The procedure used to analyze a 10053 is basically to work from the bottom of the trace file up to the areas of interest. An area is of interest when it pertains to a part of the plan that appears to be flawed.
In summary, here are the steps to analyze a 10053 trace file:
1. Start with the end
Verify the SQL you are interested in is the one belonging to the trace you are about to analyze. If an EXPLAIN PLAN is in the trace, indent it to make it readable.
2. Look at the final cost of the chosen plan
3. Find the join order that produced the final cost
4. Find the part of the join order where the costing is of interest
5. Find the particular join type that produced the cost for the join order
6. Examine the costs for the join type found in step 5.
a. determine which access paths were used
b. examine other access paths that were rejected (applies only to nested loop joins where more than one access path to the inner rowsource is costed)
Detailed Analysis
1. Start with the end, the Explain plan
Its important to always obtain an accurate execution plan output along with a 10053. Some information is simply missing in the 10053 and we'll need to use the plan to draw a conclusion about how the costing was performed. Other times we'll want to use the plan to navigate the 10053 or sanity check our own analysis.
Make sure to indent the plan steps according to
parent ID
to make the hierarchy of the plan readable.
Please note the explain plan output embedded in the trace file differs between versions and is often not present. In 10g, you can obtain a nicely formatted plan by turning on the 10053 trace and then executing an EXPLAIN PLAN FOR.... command. However, the presence of bind values may affect the actual plan that is produced and the EXPLAIN PLAN command will not catch it due to a different code path used to perform the bind peeking.
2. Look at final cost of the plan:
Final:
CST: 20762 CDN: 1 RSC: 83447 RSP: 20762 BYTES: 173
IO-RSC: 20658 IO-RSP: 82626 CPU-RSC: 101017010 CPU-RSP: 801120184
PLAN
Cost of plan: 20762
Operation...........Object name.....Options.........Id...Pid..
SELECT STATEMENT 0
SORT GROUP BY 1
TABLE ACCESS CERTIFICATE BY LOCAL INDEX R 2 1
NESTED LOOPS 3 2
NESTED LOOPS 4 3
TABLE ACCESS PREMIUM_PLAN_COD FULL 5 4
INDEX PK_CIPBF_IX FULL SCAN 6 4
INDEX XPKCERTIFICATE RANGE SCAN 7 3 |
The better plan (in a another trace file) which used the NO_INDEX hint, looked like this:
Cost of plan: 58201
Operation...........Object name.....Options.........Id...Pid..
SELECT STATEMENT 0
SORT GROUP BY 1
TABLE ACCESS CERTIFICATE BY LOCAL INDEX R 2 1
NESTED LOOPS 3 2
HASH JOIN 4 3 <== hash join instead of NL
TABLE ACCESS PREMIUM_PLAN_COD FULL 5 4 <== full table scan
instead of index
full scan
TABLE ACCESS CERT_INSURED_PLA FULL 6 4
INDEX XPKCERTIFICATE RANGE SCAN 7 3 |
3. Find the Join Order that produced the final cost
Use the cost (20762) as a key to find the join order that evaluated to this cost. We find this:
Join result: cost: 20762 cdn: 1 rcz: 173
Best so far: TABLE#: 0 CST: 15810 CDN: 1 BYTES: 122
Best so far: TABLE#: 2 CST: 20266 CDN: 1981 BYTES: 277340 Best so far: TABLE#: 1 CST:20762 CDN: 1 BYTES: 173 |
This belongs to join order number 2.... scroll up to the beginning of the join order section to see this:
Join order[2]
: PREMIUM_PLAN_CODE [A13] CERT_INSURED_PLAN_BENEFIT_FACT [A11] CERTIFICATE [A12]
Finding the join order of the chosen execution plan in 9206 and 10g is easier since we have following trace in 10053 :
JOIN ORDER: 2
CST: ... CDN: ... RSC: ... RSP: ... BYTES: ... |
In this case, the chosen "JOIN ORDER" is 2.
4. Find the part of the join order that differs between the "good" plan and the "bad" plan.
In this case, they differ in the cost of the second table in the join order:
"Good" Plan:
Join result: cost: 58201 cdn: 1 rcz: 173
Best so far: TABLE#: 0 CST: 15810 CDN: 1 BYTES: 122
Best so far: TABLE#: 2 CST: 57706 CDN: 1981 BYTES: 277340
Best so far: TABLE#: 1 CST: 58201 CDN: 1 BYTES: 173 |
"Bad Plan"
Join result: cost: 20762 cdn: 1 rcz: 173
Best so far: TABLE#: 0 CST: 15810 CDN: 1 BYTES: 122
Best so far: TABLE#: 2 CST: 20266 CDN: 1981 BYTES: 277340 <== this cost is different (20266 vs 57706)
Best so far: TABLE#: 1 CST: 20762 CDN: 1 BYTES: 173 |
5. Search for the join within the join order section that produced the cost identified in step 4.
You will find this line as you search UP the file for the cost of 20266:
Join result: cost: 20266 cdn: 1981 rcz: 140 |
Keep searching to find which join produced this (lowest) cost:
Best NL cost: 20266 resp: 20266 |
We knew from the plan that it was going to be an NL join, so this confirms it too. Search within the NL join section for the calculation that lead to this cost:
NL Join
Outer table: cost: 15810 cdn: 1 rcz: 122 resp: 15809
Inner table: CERT_INSURED_PLAN_BENEFIT_FACT
...
OPTIMIZER PERCENT INDEX CACHING = 70
Access path: index (no sta/stp keys)
Index: PK_CIPBF_IX
TABLE: CERT_INSURED_PLAN_BENEFIT_FACT
RSC_CPU: 116668803 RSC_IO: 17885
IX_SEL: 1.0000e+00 TB_SEL: 2.1281e-04
Join: resc: 81471 resp: 20266 |
At this point we see that the best cost was found for the "resp" cost. This is the "response time" cost; i.e., the cost for a plan that uses PX to obtain the answer in the shortest possible time. The "resc" cost is the "resource cost". This is the resource consumption cost when executing the query serially. The full execution plan output would show whether PX was being used or not.
We need to figure out how this cost was calculated. To do this, we need to see how the CBO costed the join. Here are the basic formulae with the actual values substituted:
Basic NL join cost formula: COST(outer) + [ CARD(outer) * COST(inner) ] |
Note: In the formulae below, "RESC(outer)" refers to the resource cost of accessing the inner table. "RESP(outer)" refers to the response cost (using PX) of the outer table.
resc = RESC(outer) + [CARD(outer) * RESC(inner)]
= 63646 + [ 1 * (rsc_cpu / cpu_factor + rsc_io) * index_cost_adj ]
= 63646 + [ 1 * ( 116668803 / 975563.49 + 17885) * 0.99 ]
= 63646 + 17824.5
= 81470.5 ~ 81471: OK
Resp = RESP(outer) + (CARD(outer) * RESC(inner) )
= 15809 + [1 * (rsc_cpu / cpu_factor + rsc_io)/(deg of join parallelism * parallel scaling factor) *index_cost_adj ]
= 15809 + [1 * (116668803 / 975563.49 + 17885)/(4 * 0.9) * 0.99 ]
= 15809 + 18004.59 / 3.6 * 0.99 = 20602.17 vs. 20266, close...but not exact...costing has fudge factors? |
This join permutation uses a parallel operation for the outer table and parallel slaves for the inner table, but each slave uses a full index access path for the inner table. Since its an NL join, its probably using "broadcast" PX row distribution to do this. Without the execution plan, its tough to know if this is what the CBO chose to do.
(deg of join parallelism * parallel scaling factor)= 4 * 0.9 = 3.6 |
So, instead of dividing the cost by 4, it divides the cost by 3.6.
The degree of parallelism for the outer table's full table scan is seen
here
. The join's degree of parallelism is set to the highest degree of parallelism for any particular table. In this case it is set to the degree of parallelism for the PREMIUM_PLAN_CODE table. A full execution plan output would be very helpful to verify this.
Other Costs in the NL Join for the table "CERT_INSURED_PLAN_BENEFIT_FACT"
Using an FTS:
NL Join
Outer table: cost: 15810 cdn: 1 rcz: 122 resp: 15809 (Note: serial tsc cost = 63646)
Inner table: CERT_INSURED_PLAN_BENEFIT_FACT
Access path: tsc Resc: 167233 (NOTE: parallel tsc cost = 41309)
Join: Resc: 230879 Resp: 57618 <== cost = 15809 + 41309 = 57118 |
Resc = Resc(outer) + [Card(outer) * Resc(inner) ]
=
63646
+ [1 *
167233
] (Note: Resc(Inner) is close to the value in single table access path, but not exact)
= 230879 (exact)
Resp = Resp(outer) + [Card(outer) * Resp(inner) ]
=
15809
+ [ 1 *
41309
]
= 57118 vs. 57618 (close, not exact)
Using an Index Fast Full Scan:
NL Join
Outer table: cost: 15810 cdn: 1 rcz: 122 resp: 15809 (Note: serial tsc cost = 63646)
Inner table: CERT_INSURED_PLAN_BENEFIT_FACT
...
Access path: index (iff)
Index: PK_CIPBF_IX
TABLE: CERT_INSURED_PLAN_BENEFIT_FACT
RSC_CPU: 2865229980 RSC_IO: 52693
IX_SEL: 0.0000e+00 TB_SEL: 1.0000e+00
Inner table: CERT_INSURED_PLAN_BENEFIT_FACT
Access path: iff Resc: 55630
Join: Resc: 119276 Resp: 29717 |
Resc = Resc(outer) + [ Card(outer) * Resc(Inner) ]
=
63646
+ [ 1 * 55630 ]
=
119276
(exact)
Resp = Resp(outer) + [Card(outer) * Resp(inner) / (degree of join parallelism) ]
=
15809
+ [ 1 *
27815
/
2
]
= 29716.5 =
29717
(exact)
Costs for other join types (SMJ and HJ):
SM Join
Outer table:
resc: 63646 cdn: 1 rcz: 122 deg: 4 resp: 15809
Inner table: CERT_INSURED_PLAN_BENEFIT_FACT
resc: 55630 cdn: 22878070 rcz: 18 deg: 2 resp: 27815
using join:1 distribution:2 #groups:1
SORT resource Sort statistics
Sort width: 598 Area size: 1048576 Max Area size: 104857600 Degree: 1
Blocks to Sort: 1 Row size: 145 Rows: 1
Initial runs: 1 Merge passes: 0 IO Cost / pass: 0
Total IO sort cost: 0
Total CPU sort cost: 975652
Total Temp space used: 0
SORT response Sort statistics
Sort width: 598 Area size: 20971520 Max Area size: 104857600 Degree: 4
Blocks to Sort: 1 Row size: 145 Rows: 1
Initial runs: 1 Merge passes: 0 IO Cost / pass: 0
Total IO sort cost: 0
Total CPU sort cost: 1084058
SORT resource Sort statistics
Sort width: 598 Area size: 1048576 Max Area size: 104857600 Degree: 1
Blocks to Sort: 84029 Row size: 30 Rows: 22878070
Initial runs: 7 Merge passes: 1 IO Cost / pass: 149862
Total IO sort cost: 233891
Total CPU sort cost: 27269777197
Total Temp space used: 1288102000
SORT response Sort statistics
Sort width: 598 Area size: 20971520 Max Area size: 104857600 Degree: 2
Blocks to Sort: 42015 Row size: 30 Rows: 11439035
Initial runs: 4 Merge passes: 1 IO Cost / pass: 74932
Total IO sort cost: 129941
Total CPU sort cost: 14577786635
Merge join Cost: 381119 Resp: 188508 |
Resc cost = Resc(outer) + Resc(inner) + Sort_Cost(outer) + Sort_Cost(inner
= Resc(outer) + Resc(inner) + [ (CPU_Cost(outer) + IO_Cost(outer) ) + (CPU_Cost(inner) + IO_Cost(inner) ]
=
63646
+
55630
+ (
975652
/
975563.49
+
0
) + (
27269777197
/
975563.49
+
233891
)
=
381120.8 ~
381119
(very close)
Resp cost = Resp(outer) + Resp(inner) + Par_Sort_Cost(outer) + Par_Sort_Cost(inner)
= Resp(outer) + Resp(inner) + [ (Par_CPU_Cost(outer) + Par_IO_Cost(outer) ) + ( (Par_CPU_Cost(inner) + Par_CPU_Cost(inner)) ]
=
15809
+
27815
+ [ (
1084058
/ 975563.49 +
0
) + (
14577786635
/ 975563.49 +
129941
) ]
=
188509.1
~
188508
(very close)
HA Join
Outer table:
resc: 63646 cdn: 1 rcz: 122 deg: 4 resp: 15809
Inner table: CERT_INSURED_PLAN_BENEFIT_FACT
resc: 55630 cdn: 22878070 rcz: 18 deg: 2 resp: 27815
using join:8 distribution:2 #groups:1
Hash join one ptn Resc: 587 Deg: 4
hash_area: 5120 (max=25600) buildfrag: 1 probefrag: 20946 ppasses: 1
buildfrag: 1 probefrag: 20946 passes: 1
Hash join Resc: 121625 Resp: 44212 |
Resc cost = Resc(outer) + Resc(inner) + HJ_Cost_Ser
=
63646
+
55630
+ (HJ_Resc_Cost * HJ_Dop)
= 63646 + 55630 + (
587
*
4
)
= 121624 ~
121625
(very close)
Resp cost = Resp(outer) + Resp(inner) + HJ_Cost_Par
= 15809 + 27815 + 587
= 44211 ~
44212
(very close)
Note: The RESC and RESP costs used for the SMJ and HA are from the single table access cost section for each table. I.e.,
RESC(inner) is the IFF resc cost and RESP(inner) is the IFF resp cost.
Conclusion
The end result here is that with the unhinted ("bad") plan, the CBO chooses to do a full index scan with an NL join. This was influenced by:
- the high cost of FTS (due to the very low value for the "multiblock read divisor")
- the low cost of index access due to the OPTIMIZER_INDEX_CACHING parameter. This parameter had the effect of significantly lowering index access costs, beyond what was reasonable for the system.
Since we know the better plan performed full table scans and had much better performance, we can see the "multiblock read divisor" appears to be inaccurate in this case. A more accurate divisor would have made the cost of FTS/IFF much cheaper and a plan using a hash join would have been more attractive. An even better performing plan might be a nested loops plan that uses an IFF for the inner rowsource (similar to the hash join plan, but avoids the hashing costs). The IFF was not chosen for the nested loop inner rowsource in this case because its cost calculation did not receive the cost reduction benefit of the OPTIMIZER_INDEX_CACHING factor (only applies to indexes read with single block read operations). Furthermore, the relative cost of IFF was also raised by the low multiblock read divisor.
The next steps to investigate for this service request include:
- Find out why the multiblock read divisor is so low (check system stats in aux_stats$)
- Gather system stats again during a realistic interval into a named "stattab" for comparison to the exisiting values in aux_stats$
- Find out why the customer set OPTIMIZER_INDEX_CACHING so high
Various articles of interest can be found on Oracle Metalink. Access the articles using the following steps:
- Go to
MetaLink and login
- Click on the "Knowledge" tab
- Click on "Database" under the "Support Categories" menu
- Click on "SQL Tuning and Application Access - Including CBO" under "Performance and Scalability"
- View the variety of documents by clicking the links in the "Browse Product" menu
For additional information, we suggest looking at the following (please note this document is from sources outside of Oracle Corporation):
A Look Under the Hood of CBO: Event 10053
,
Wolfgang Breitling, Centrex Consulting Corporation
Cost-Based Oracle, Fundamentals
, Jonathan Lewis, Apress, 2006