Mauro Pagano's Blog



Introducing Pathfinder, is there a better plan for my SQL?

Pathfinder is a new free tool that provides an easy way to execute a SQL statement under multiple optimizer environments in order to generate different execution plans, potentially discovering better plans. The tool can also be used to quickly identify workarounds for wrong result bugs as well as slow parse issues.

Pathfinder uses the same brute-force approach as SQLT XPLORE, executing the SQL once for every single CBO parameter and fix_control present in the database, with no installation required. This makes Pathfinder easy to run in any environment, including a production one (assuming you understand *WELL* what the tool does, how it works and what it means for your database).

Each test adds approximately 1 second of overhead on top of the time the SQL takes to complete, and the number of tests is pretty high: around 1100 in 11.2.0.4 and around 1500 in 12.1.0.2. I therefore suggest using Pathfinder on SQLs that take at most a few seconds to run (or just be ready to leave Pathfinder running for a loooong time).
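A back-of-the-envelope estimate of a full run follows directly from those two numbers. This little sketch is purely illustrative (it is not part of the tool); it just multiplies the per-test cost by the test count:

```python
# Rough Pathfinder runtime estimate: every test pays the SQL's own runtime
# plus approximately 1 second of overhead (illustrative only).
def estimated_runtime_s(sql_runtime_s, n_tests):
    return n_tests * (sql_runtime_s + 1.0)

# A 2-second SQL against the ~1100 tests of 11.2.0.4 -> 3300s, about 55 minutes.
total = estimated_runtime_s(2, 1100)
```

This is why a SQL that runs in minutes quickly turns a Pathfinder run into days.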

The tool executes the SQL statement present in the provided file script.sql; just modify the script and replace the seeded SQL with the one you want to execute. In the same script you can also add ALTER SESSION commands that will be executed before the desired SQL; this is helpful if you want to influence the analysis by providing a different starting point.

To execute the tool, download it from the Download section on the right side of this page (or from here; the tool will also be released as a standalone script in the same zip file as SQLd360) and follow these steps:

  1. Unzip pathfinder-master.zip, navigate to the root pathfinder directory, and connect as SYS to the database.
    $ unzip pathfinder-master.zip
    $ cd pathfinder-master
    $ sqlplus / as sysdba
  2. Open file script.sql and add your SQL there. Make sure to add the mandatory comment /* ^^pathfinder_testid */. The file name must be script.sql; if you wish to change the name then just ping me.
  3. Execute pathfinder.sql and provide the connect string to connect as the user that is supposed to run the script.
  4. Unzip output file pathfinder_<dbname>_<date>.zip into a directory on your PC and review the results starting from file 00001_pathfinder_<dbname>_<date>_index.html

SQL> @pathfinder

Parameter 1:
Full connect string of the database to run the SQL into
If the database is remote or a PDB then you must include
the TNS alias i.e. scott/tiger@orcl

Enter value for 1: mpagano/mpagano@orcl
mpagano/mpagano@orcl
Building Pathfinder driver scripts
Connected.

1) "pathfinder_{ 20151026_1906 (00001)" 19:06:40 BASELINE

2) "pathfinder_{ 20151026_1906 (00002)" 19:06:42 "_add_stale_mv_to_dependency_list" = FALSE

.....
File pathfinder_orcl_20151023_1256.zip created.

For each test Pathfinder will show the setting considered as well as some basic statistics like Plan Hash Value, Elapsed Time, CPU Time, Buffer Gets and Rows processed. Two links are also present: one points to the details of the execution plan generated, the other to the details of V$SQL.

The main page will look something like this:

[Screenshot: Pathfinder main report page]

Pathfinder also considers the effect of Cardinality Feedback, executing the SQL multiple times until the plan stops changing (or CFB gives up, after the fifth parse), for all those settings that lead to a first execution plan different from the baseline.
This is why for some settings you will see a Test# with an additional digit, the “reparse” number:

[Screenshot: test list showing the additional “reparse” digit in Test#]

For example, for Test# 117 above the CBO generated a different plan (PHV 1837274416) than the baseline (PHV 1838229974) and Cardinality Feedback kicked in 3 times, generating a different plan each time, until the 3rd parse when the CBO ended up with the same plan as the first execution (and no different plan was generated afterwards).
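The reparse loop just described can be sketched in a few lines. This is a toy model, not Pathfinder's actual code: given the plan observed at each parse, it keeps re-executing while the plan keeps changing, stopping at the fifth parse at most:

```python
# Toy sketch of Pathfinder's Cardinality Feedback reparse loop: re-execute
# the SQL while the plan keeps changing, up to a maximum of 5 parses.
def observed_plans(plans_per_parse, max_parses=5):
    seen = [plans_per_parse[0]]
    for p in plans_per_parse[1:max_parses]:
        if p == seen[-1]:      # plan stopped changing -> stop re-executing
            break
        seen.append(p)
    return seen
```

With plan hash values [10, 20, 30, 30, 30] the loop records three parses and then stops, which mirrors the extra “reparse” digits shown in the report.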

This is the first release of the tool so I expect it to be far from perfect, but I’ve already used it several times with success. Hopefully with time (and your feedback :D) the tool will get better and better.

I hope you enjoy it and please don’t hesitate to get in touch with me for feedback, suggestions and bugs!!!



Scalar Subquery Caching and read consistency

Every time I discuss this topic with somebody I get the “what are you talking about?” look on their face, so this is just one of those “repetita iuvant” posts to make sure we (including me first) don’t get burned by it 😀

Each SQL executed from PL/SQL can run at a different SCN and thus “see” different data depending on when the SQL is started. This is known, documented and easy to prove.

In session 1:
declare
 l_sal number;
begin
 select sal
   into l_sal
   from emp
  where empno = 7369;
 dbms_output.put_line('First salary: '||l_sal);
 dbms_lock.sleep(10);
 select sal
   into l_sal
   from emp
  where empno = 7369;
 dbms_output.put_line('Second salary: '||l_sal);
end;
/

In session 2 while the previous block is still running:
update emp set sal = 0 where empno = 7369;
commit;

Result will be
First salary: 800
Second salary: 0

The same applies when a SQL calls a PL/SQL program which in turn calls SQL.
This is documented and easy to prove as well; a nice presentation from Bryn Llewellyn (PL/SQL Product Manager, I think) is available here. I’ll skip the steps to keep the post short.

So a SQL calling PL/SQL calling SQL is a big NO unless the tables accessed by the recursive SQLs are static (read only?), or unless you are willing to get weird results like the following, where the salary for the same employee is different within the same resultset (and I have no idea who would be willing 😀 )

create or replace function get_salary(n1 number) return number is
l_sal number;
begin 
  dbms_lock.sleep(1);
  select sal
    into l_sal
    from emp
   where empno = n1;
  return l_sal;
end;
/
select empno, get_salary(empno) sal 
  from (select /*+ NO_MERGE */ * from emp where empno in (7369, 7499, 7521)) a, 
       (select /*+ NO_MERGE */ rownum from dual connect by rownum <5) b;

     EMPNO	  SAL
---------- ----------
      7369	  800
      7499	 1600
      7521	 1250
      7369	    0
      7499	    0
      7521	    0
      7369	    0
      7499	    0
      7521	    0
      7369	    0
      7499	    0
      7521	    0

Elapsed: 00:00:12.01

Scalar Subquery Caching is a caching mechanism for the result of a function call, potentially avoiding a call to the function for the same input values and so improving performance. The mechanism is explained in Bryn’s presentation, in Tim Hall’s video on YouTube and in lots of other places, so no need to add much here 😉

Would Scalar Subquery Caching make my recursive SQL read-consistent with its “parent” SQL?
To be honest I didn’t see why it should, but I thought about it up to the point of starting to make up weird hypotheses with no foundation, so better to cut it short and test!

In session 1:
select empno, (select myfunc(empno) from dual) sal 
  from (select /*+ NO_MERGE */ * from emp where empno in (7369, 7499, 7521)) a, 
       (select /*+ NO_MERGE */ rownum from dual connect by rownum <5) b;

In session 2 (you need to be quick this time):
update emp set sal = 0 where empno in (7369, 7499, 7521); 
commit;

and the result in session 1:
     EMPNO	  SAL
---------- ----------
      7369	 1000
      7499	 1000
      7521	    0
      7369	 1000
      7499	 1000
      7521	    0
      7369	 1000
      7499	 1000
      7521	    0
      7369	 1000
      7499	 1000
      7521	    0

Elapsed: 00:00:03.00

The answer is “of course NO, what were you (me) thinking?!?”.
The previous output shows how scalar subquery caching indeed allowed get_salary to be called fewer times (once per distinct value of EMPNO, 3 in this case) and the elapsed time dropped from 12 to 3 seconds, but the value cached is just whatever the first execution of the function calculated for that EMPNO.
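The behavior is easy to model outside the database. This toy sketch (not Oracle internals, just a memoization analogy) shows why caching cannot restore consistency: whatever the first call per input happened to read is served forever after, even if the underlying data changes mid-scan:

```python
# Toy model of scalar subquery caching: results are cached per input value,
# so later rows reuse whatever the FIRST call for that input happened to read.
def make_cached(fn):
    cache = {}
    def wrapper(arg):
        if arg not in cache:
            cache[arg] = fn(arg)   # first call per input wins, forever
        return cache[arg]
    return wrapper

salaries = {7369: 1000, 7499: 1000, 7521: 1000}  # "EMP" before the update
get_salary = make_cached(lambda empno: salaries[empno])

row1 = get_salary(7369)                      # reads 1000 and caches it
salaries.update({k: 0 for k in salaries})    # "session 2" updates and commits
row2 = get_salary(7369)                      # still 1000: served from cache
row3 = get_salary(7521)                      # first call for 7521: sees 0
```

This reproduces the mixed 1000/0 resultset above: cached EMPNOs keep the pre-update value while first-time EMPNOs see the post-update one.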



UNUSABLE indexes and TRUNCATE

This is one of those “I knew it works this way, why do I keep falling for it?” so hopefully blogging about it will keep my memory fresh 😀

Starting with 11gR2, when you mark an index or an index partition UNUSABLE the segment is dropped (unless you are on 12c and use ONLINE); this is the intended behavior.

SQL> create table t_unusable (n1 number, n2 number);
SQL> create index t_unusable_idx on t_unusable(n1);
SQL> insert into t_unusable values (1,2);  -- this is just to materialize the segment
SQL> select index_name, segment_created from user_indexes where index_name = 'T_UNUSABLE_IDX';

INDEX_NAME		       SEG
------------------------------ ---
T_UNUSABLE_IDX		       YES

SQL> alter index t_unusable_idx unusable;
SQL> select index_name, segment_created from user_indexes where index_name = 'T_UNUSABLE_IDX';

INDEX_NAME		       SEG
------------------------------ ---
T_UNUSABLE_IDX		       NO

SQL> select object_id, data_object_id from user_objects where object_name = 'T_UNUSABLE_IDX';

 OBJECT_ID DATA_OBJECT_ID
---------- --------------
     70989

(*)

This is a nice way to speed up large data loads since it reduces the REDO generated, plus it’s faster to rebuild an index afterwards than to maintain it during the load.
What I keep forgetting is that if you truncate the table (or the partition) the segment is recreated, hence your data load process will end up maintaining the index online 😦

SQL> truncate table t_unusable;
SQL> select index_name, segment_created from user_indexes where index_name = 'T_UNUSABLE_IDX';

INDEX_NAME		       SEG
------------------------------ ---
T_UNUSABLE_IDX		       YES

SQL> select object_id, data_object_id from user_objects where object_name = 'T_UNUSABLE_IDX';

 OBJECT_ID DATA_OBJECT_ID
---------- --------------
     70989	    70992

----- begin tree dump
leaf: 0x1024b7b 16927611 (0: nrow: 0 rrow: 0)
----- end tree dump

Conclusion (aka note to self): if you plan to use TRUNCATE + mark indexes UNUSABLE remember to TRUNCATE first and mark UNUSABLE after 😀

(*) there is no TREEDUMP here since the trace raised ORA-600 [25027]



Concurrent INSERT APPEND into the same table

The main reason I decided to have a blog was to share all the small things I learn every day while experimenting, but I have to admit I fell short of my initial intention. Hopefully I’ll do better in the coming months, starting with this super-quick post.

I always knew that concurrent (aka multiple sessions at the same time) INSERT /*+ APPEND */ into the same target table was not possible, even if the target table was partitioned. Here is an example of such a SQL:

create table test_target (n1 number, n2 number)
partition by range(n1)
(partition p1 values less than (5),
 partition p2 values less than (10));

insert /*+ append */ into test_target select mod(rownum,5), rownum from dual connect by rownum <= 10;

The reason the concurrency is not feasible is that the first session acquires a TM enqueue in X mode on the whole table (so every other session will have to wait), even if the inserted data only goes into a specific partition.
In the following extract SID 306 (my session) is holding the TM enqueue in X mode on object id 138542, which is the table, even though only partition P1 has been populated.

SQL> select * from v$lock where sid = 306;

       SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK     CON_ID
---------- -- ---------- ---------- ---------- ---------- ---------- ---------- ----------
       306 TX     131102      11102          6          0         29          0          0
       306 TM     138542          0          6          0         29          0          3
       306 AE        133          0          4          0         86          0          3

SQL> select object_id, object_name, object_type from user_objects where object_id = 138542;

 OBJECT_ID OBJECT_NAME     OBJECT_TYPE
---------- --------------- -----------------------
    138542 TEST_TARGET     TABLE

If in another session (SID 186) I try to run an INSERT /*+ APPEND */ to populate only P2, then the execution will sit and wait for SID 306 to release the enqueue on object id 138542, as shown in the first row from V$LOCK where 186 wants to grab a TM enqueue in X mode for object id 138542.

SQL> insert /*+ append */ into test_target select mod(rownum,5)+5, rownum from dual connect by rownum <= 10;

SQL> select * from v$lock where sid in (306,186) order by sid;

       SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK     CON_ID
---------- -- ---------- ---------- ---------- ---------- ---------- ---------- ----------
       186 TM     138542          0          0          6         94          0          3
       186 AE        133          0          4          0        104          0          3
       306 AE        133          0          4          0        640          0          3
       306 TM     138542          0          6          0        583          1          3
       306 TX     131102      11102          6          0        583          0          0 

Nothing new so far, it confirms what I expected.
What I didn’t know is that if “extended partition syntax” is used to specify the target partition (and if you specify the wrong one you get a correct ORA-14401) then the X lock is held on the partition and not the table (it’s held in SX on the table).
Here is the example

SID 306
insert /*+ append */ into test_target partition (p1) select mod(rownum,5), rownum from dual connect by rownum <= 10;

SID 186
insert /*+ append */ into test_target partition (p2) select mod(rownum,5)+5, rownum from dual connect by rownum <= 10;

SQL> select * from v$lock where sid in (306,186) and type = 'TM' order by sid;

       SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK     CON_ID
---------- -- ---------- ---------- ---------- ---------- ---------- ---------- ----------
       186 TM     138542          0          3          0         11          0          3
       186 TM     138544          0          6          0         11          0          3
       306 TM     138543          0          6          0         35          0          3
       306 TM     138542          0          3          0         35          0          3

SQL> select object_id, object_name, object_type from user_objects where object_id in (138542, 138543, 138544);

 OBJECT_ID OBJECT_NAME    OBJECT_TYPE
---------- -------------- -----------------------
    138544 TEST_TARGET    TABLE PARTITION
    138543 TEST_TARGET    TABLE PARTITION
    138542 TEST_TARGET    TABLE

So if you know in advance where the data is going (which is generally true, especially for range/interval partitioning), then you can have multiple concurrent loads into the same target table.
I’m not recommending it, just saying it’s possible 😀



AutoDOP in 12c, what’s new and what’s old

The New Features guide for 12c reports enhancements to AutoDOP but unfortunately there are not many details, so let’s try to dig a little to find out what (some of) those enhancements are.

The focus of this blog post is on how the Degree of Parallelism (DoP) is computed for three SQLs in 11.2.0.4 vs 12.1.0.2 (so not considering ADAPTIVE degree policy) reading a single table; joins will come in a separate post.

The first requirement to play with AutoDOP is to have IO Calibration statistics in place (UPDATE: the requirement is in 11.2.0.x; in 12c default values are used when IO Calibration stats are missing, 200MB/s being the default for MAX_PMBPS).
Since all my tests are on VMs on my personal machine, the IO response time is very unstable, so for the sake of consistent results I’ll manually set the IO Calibration stats (AFAIK this isn’t recommended; also, the INSERT requires an instance restart to take effect).
We’ll use 10 as the value for PMBPS (Max MBPS during parallel scan).

SQL> desc dba_rsrc_io_calibrate
 Name                  Null?    Type
 --------------------- -------- --------------
 START_TIME                     TIMESTAMP(6)
 END_TIME                       TIMESTAMP(6)
 MAX_IOPS                       NUMBER
 MAX_MBPS                       NUMBER
 MAX_PMBPS                      NUMBER
 LATENCY                        NUMBER
 NUM_PHYSICAL_DISKS             NUMBER

SQL> insert into dba_rsrc_io_calibrate values (systimestamp, systimestamp, 10000, 5, 10, 5, 10);
SQL> commit;

And here are the three tables, with 10k, 50k and 16M rows respectively, used in the tests.
In the first two tables each row is created to account for pretty much one block (in an 8k block_size database), so those tables will grow large in size even with just a few rows; the third table is designed to be very small.

SQL> drop table t10k purge;
SQL> drop table t50k purge;
SQL> drop table t16m purge;
SQL> create table t10k (n1 number, c1 char(2000), c2 char(2000), c3 char(2000));
SQL> create table t50k (n1 number, c1 char(2000), c2 char(2000), c3 char(2000));
SQL> create table t16m (n1 number, c1 char(1), c2 char(1), c3 char(1));
SQL> insert into t10k select rownum, 'a','b','c' from dual connect by rownum <= 10000;
SQL> insert into t50k select rownum, 'a','b','c' from dual connect by rownum <= 50000;
SQL> insert into t16m select rownum, 'a', 'b', 'c' from dual connect by rownum <= 250000; 
SQL> insert into t16m select * from t16m;  
SQL> /  
SQL> /  
SQL> /
SQL> /
SQL> /  
SQL> exec dbms_stats.gather_table_stats(user,'T10K');
SQL> exec dbms_stats.gather_table_stats(user,'T50K');
SQL> exec dbms_stats.gather_table_stats(user,'T16M');
SQL> select table_name, num_rows, blocks from user_tables where table_name like 'T%' order by 2;
TABLE_NAME                       NUM_ROWS     BLOCKS
------------------------------ ---------- ----------
T10K                                10000      10097
T50K                                50000      50897
T16M                             16000000      35472

We’ll use three SQLs to check different scenarios and how they show up in the 10053 trace.
EXPLAIN PLAN has been used just to force a hard parse each time (it won’t have any “negative” side effects here).

explain plan for select /* Q1 */ * from t10k;   
explain plan for select /* Q2 */ * from t50k;   
explain plan for select /* Q3 */ count(distinct n1) from t16m;

As we already know (nice post here), in 11.2.0.x the DoP under AutoDOP is based on the “IO cost” of the operation, and the calculation is performed for each table individually at the time the CBO calculates cardinality and best access method for the table (aka you’ll find it under the SINGLE TABLE ACCESS PATH section of the 10053 trace).

##### Q1 #####

The CBO decides to execute Q1 in serial in both versions, let’s look into the details.
From the 11.2.0.4 trace: the four kkeCostToTime lines are the input values used to perform the math, the two “Consider caching” lines are about In-Memory Parallel Execution (nothing to do with the In-Memory Option), and the last three AutoDOP lines are the most interesting ones.

SINGLE TABLE ACCESS PATH 
  Single Table Cardinality Estimation for T10K[T10K] 
  Table: T10K  Alias: T10K
    Card: Original: 10000.000000  Rounded: 10000  Computed: 10000.00  Non Adjusted: 10000.00
  Access Path: TableScan
    Cost:  2737.52  Resp: 2737.52  Degree: 0
      Cost_io: 2736.00  Cost_cpu: 74005180
      Resp_io: 2736.00  Resp_cpu: 74005180
kkeCostToTime: using io calibrate stats 
 maxmbps=5(MB/s) maxpmbps=10(MB/s) 
 block_size=8192 mb_io_count=4 mb_io_size=30247 (bytes) 
 tot_io_size=79(MB) time=7897(ms)
AutoDOP: Consider caching for T10K[T10K](obj#70600) 
cost:2737.52 blkSize:8192 objSize:10097.00 marObjSize:9592.15 bufSize:37240.00 affPercent:80 smallTab:NO affinitized:YES
AutoDOP:parallel operation is set to NO
AutoDOP:parallel operation is set to NO
AutoDOP: Table/Index(#70600) access (scan) cost=2737.52 estTime=7896.70 unit=10000.00 dop=2 -> maxdop=2
  Best:: AccessPath: TableScan
         Cost: 2737.52  Degree: 1  Resp: 2737.52  Card: 10000.00  Bytes: 0

The following is based on my understanding and tests so it could easily be wrong.

The first “AutoDOP: parallel operation is set to NO” refers to the lack of a PARALLEL hint in the SQL; if you add the hint then the message becomes YES, prefixed by another message with the degree specified in the hint (if any).

The second “AutoDOP: parallel operation is set to NO” is based on the fact this SQL is not “expensive enough” to qualify for parallel execution, basically the E-Time is lower than PARALLEL_MIN_TIME_THRESHOLD.

The last line shows the details of the math. In 11.2.0.4 the DoP is indeed computed based on how large the segment is (tot_io_size=79MB) and how fast we can read from disk (maxpmbps=10MB/s), so 79 (MB) / 10 (MB/s) = (time=) 7897ms. The value is then divided by 10k (unit=; not sure where the value comes from since it doesn’t seem related to PARALLEL_MIN_TIME_THRESHOLD even though it does represent 10s) and the result (0.79) is considered as the DoP.
A DoP of 0.79 means this execution will be serial.
The value of dop=2 -> maxdop=2 is a little misleading here (maybe it’s a default in the code); PX is rejected and the SQL goes serial.
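The math just described can be reproduced with a quick sketch. This is only my reading of the trace: I'm assuming time = tot_io_size / MAX_PMBPS and DoP = time / unit, with unit fixed at 10000ms as shown by unit=:

```python
# 11.2.0.4 AutoDOP IO math, as read from the 10053 trace (an assumption,
# not documented behavior): segment size over scan rate gives the estimated
# time, and time over the fixed unit (10000ms) gives the DoP.
def io_dop(tot_io_size_mb, max_pmbps, unit_ms=10000.0):
    time_ms = tot_io_size_mb / max_pmbps * 1000.0   # 79MB / 10MB/s -> 7900ms
    return time_ms / unit_ms

dop = io_dop(79, 10)   # ~0.79: below 1, so Q1 runs serial
```

The small difference versus the trace's 7897ms comes from the trace using the exact byte count rather than the rounded 79MB.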
The trace confirms it just a few lines below:

AUTO DOP PLANS EVALUATION
***************************************
Compilation completed with Dop: 1.
  Cost_io:  2736.00  Cost_cpu: 74005179.68
  Card:     10000.00  Bytes:    60070000.00
  Cost:     2737.52  Est_time:  7897ms
kkopqSetDopReason: Reason why we chose this DOP is: no expensive parallel operation.
No parallel scan operations
kkopqCombineDop: Dop:1 Hint:no 
Query: compute:yes forced:no  computedDop:2 forceDop:0
kkopqSetDopReason: Reason why we chose this DOP is: parallel threshold.
Serial phase is inexpensive (2738), stick to it.
Costing completed. DOP chosen: 1.

Here we can see the DoP chosen was 1 because no operation was expensive enough to warrant PX, and the E-Time for the SQL (7.8s) is below PARALLEL_MIN_TIME_THRESHOLD (10s).

Nothing new so far; let’s look into the 10053 from 12.1.0.2 for the same SQL, Q1, and see how it differs.

SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T10K[T10K]
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
  Table: T10K  Alias: T10K
    Card: Original: 10000.000000  Rounded: 10000  Computed: 10000.000000  Non Adjusted: 10000.000000
  Scan IO  Cost (Disk) =   2736.000000
  Scan CPU Cost (Disk) =   74005179.680000
  Total Scan IO  Cost  =   2736.000000 (scan (Disk))
                       =   2736.000000
  Total Scan CPU  Cost =   74005179.680000 (scan (Disk))
                       =   74005179.680000
  Access Path: TableScan
    Cost:  2737.397801  Resp: 2737.397801  Degree: 0
      Cost_io: 2736.000000  Cost_cpu: 74005180
      Resp_io: 2736.000000  Resp_cpu: 74005180
  Best:: AccessPath: TableScan
         Cost: 2737.397801  Degree: 1  Resp: 2737.397801  Card: 10000.000000  Bytes: 0.000000

The first difference is that the AutoDOP computation does not happen during the access path decision; that’s why no AutoDOP tag is present in the previous extract.

The calculations are performed at the end of the join selection for the query block (no join in this case and just one query block) and here is where the major changes show up:

kkecComputeAPDopCS: CPUtabBytes: 82714624.000000 CPUtabRows: 10000.000000  rowSize: 6007
kkecComputeAPDopCS: Basic Stats Level
kkecComputeDOP: Proc Rate: 1048576000.000000 - Cost: 82714624.000000 - Time: 78.882812
kkecComputeDOP: Proc Rate: 1000000.000000 - Cost: 10000.000000 - Time: 10.000000
kkecComputeAPDopCS: Before Filter Preds - CPU Dop: 0.007888
kkecComputeAPDopCS: Final Filter Predicate Bytes: 0.000000 Rows: 0.000000
kkecComputeAPDop: Final -  CPU Dop: 0.007888 Rate: 1000.000000
AutoDOP: Consider caching for T10K[T10K](obj#138272)
cost:2737.397801 blkSize:8192 objSize:10097.00 marObjSize:9592.15 bufSize:41160.00 affPercent:80 smallTab:NO affinitized:YES
kkeCostToTime: using io calibrate stats maxpmbps=10(MB/s)
 block_size=8192 mb_io_count=4 mb_io_size=30247 (bytes)
 tot_io_size=79(MB) time=7896(ms)
AutoDOP:parallel operation is set to NO
AutoDOP:parallel operation is set to NO
AutoDOP: Table/Index(#138272) access (scan) cost=2737.397801 estTime=7896.34 unit=10000.00 dop=2.000 -> maxdop=2
kkecComputeAPDop: IO Dop: 2.000000 - CPU Dop: 0.007888

The first big difference is that AutoDOP in 12c also includes a CPU DoP for the operations in the query block, considering not only operations that have an IO cost (e.g. a FTS) but also operations that have just a CPU cost (e.g. a join).
Operations are normalized via “processing rates”; in 12c there is a new view that provides the CBO with such numbers, V$OPTIMIZER_PROCESSING_RATE.

SQL> select operation_name, to_number(manual_value) manual, to_number(calibration_value) calibration, to_number(default_value) default_ from v$optimizer_processing_rate order by 1;

OPERATION_NAME                MANUAL CALIBRATION   DEFAULT_
------------------------- ---------- ----------- ----------
AGGR                             560                   1000
ALL                                                     200
CPU                                                     200
CPU_ACCESS                                              200
CPU_AGGR                                                200
CPU_BYTES_PER_SEC                560                   1000
CPU_FILTER                                              200
CPU_GBY                                                 200
CPU_HASH_JOIN                                           200
CPU_IMC_BYTES_PER_SEC                                  2000
CPU_IMC_ROWS_PER_SEC                                2000000
CPU_JOIN                                                200
CPU_NL_JOIN                                             200
CPU_RANDOM_ACCESS                                       200
CPU_ROWS_PER_SEC                                    1000000
CPU_SEQUENTIAL_ACCESS                                   200
CPU_SM_JOIN                                             200
CPU_SORT                                                200
HASH                                                    200
IO                               560                     10
IO_ACCESS                        560                     10
IO_BYTES_PER_SEC                                         10
IO_IMC_ACCESS                                          1000
IO_RANDOM_ACCESS                3000                     10
IO_ROWS_PER_SEC                                     1000000
IO_SEQUENTIAL_ACCESS             560                     10
MEMCMP                                                  500
MEMCPY                           560                   1000

Values can be manually set using DBMS_STATS.SET_PROCESSING_RATE, but they are silently ignored unless the hidden parameter _OPTIMIZER_PROC_RATE_SOURCE is set to MANUAL. They can also be gathered using DBMS_STATS.GATHER_PROCESSING_RATE.
The value for an operation is inherited from its parent in case it is missing (the hierarchy is visible in X$OPTIM_CALIB_STATS, where each STATID_KKECSTATS has its parent in PSTATID_KKECSTATS).
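The inheritance can be sketched as a simple walk up a parent map. Note that the parent relationships below are hypothetical, chosen only to illustrate the lookup; the real hierarchy is the one exposed by X$OPTIM_CALIB_STATS:

```python
# Toy illustration of processing-rate inheritance: if an operation has no
# value, walk up to its parent until one is found. The PARENT map here is
# hypothetical; the real hierarchy lives in X$OPTIM_CALIB_STATS
# (STATID_KKECSTATS -> PSTATID_KKECSTATS).
PARENT = {"CPU_GBY": "CPU_AGGR", "CPU_AGGR": "CPU", "CPU": "ALL"}

def rate_for(op, rates):
    # climb the hierarchy until an operation with a value is found
    while op is not None and op not in rates:
        op = PARENT.get(op)
    return rates.get(op)
```

So a missing CPU_GBY rate would fall back to its ancestor's value rather than being treated as unknown.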

The first kkecComputeAPDopCS line (two sections above) provides info about the table we want to scan: size in bytes, number of rows and average row length.
The kkecComputeDOP lines are where the CPU DoP is computed: my understanding is that the CPU DoP is the greatest value (expressed in ms) between “size in bytes / CPU_BYTES_PER_SEC” and “number of rows / CPU_ROWS_PER_SEC”, so in this case 10 vs 78.8 = 78.8, which is then divided by 10k (unit=), giving a CPU DoP of 0.0078. The values used for the computation are also reported in the OTHER_XML column.
The kkeCostToTime / AutoDOP lines are the same as in 11.2.0.4, where the IO DoP is computed (again with the misleading value of 2).
The final kkecComputeAPDop line is where the IO DoP (misleading here) and the CPU DoP are combined.
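That CPU DoP calculation can be checked numerically against the trace. Again this is my reading of the trace, not documented behavior; the rates below are the CPU_BYTES_PER_SEC and CPU_ROWS_PER_SEC values that appear in the kkecComputeDOP lines:

```python
# Reproducing the 12c CPU-DoP math for Q1 (an interpretation of the trace):
# each candidate time (ms) = cost / processing rate, the largest one divided
# by unit (10000ms) gives the CPU DoP.
def cpu_dop(tab_bytes, tab_rows,
            bytes_per_sec=1048576000.0,   # CPU_BYTES_PER_SEC rate (1000 MB/s)
            rows_per_sec=1000000.0,       # CPU_ROWS_PER_SEC rate
            unit_ms=10000.0):
    t_bytes_ms = tab_bytes / bytes_per_sec * 1000.0   # 78.88ms for T10K
    t_rows_ms = tab_rows / rows_per_sec * 1000.0      # 10ms for T10K
    return max(t_bytes_ms, t_rows_ms) / unit_ms

dop = cpu_dop(82714624, 10000)   # ~0.0079, matching the trace's 0.007888
```

Plugging in T10K's 82714624 bytes and 10000 rows reproduces the 78.882812ms time and the 0.007888 CPU DoP shown in the trace.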

As in the 11.2.0.4 case, this SQL doesn’t qualify for parallel execution because it is too fast (the second NO in the AutoDOP lines).

AUTO DOP PLANS EVALUATION
***************************************
Compilation completed with Dop: 1.
  Cost_io:  2736.000000  Cost_cpu: 74005180
  Card:     10000.000000  Bytes:    60070000.000000
  Cost:     2737.397801  Est_time:  7896ms
kkopqSetDopReason: Reason why we chose this DOP is: no expensive parallel operation.
No parallel scan operations
kkopqCombineDop: Dop:1 Hint:no
Query: compute:yes forced:no  scanDop:2 cpuDop:1 forceDop:0
kkopqSetDopReason: Reason why we chose this DOP is: parallel threshold.
Serial phase is inexpensive (2737.397801), stick to it.
Costing completed. DOP chosen: 1.

##### Q2 #####

Let’s now focus on Q2, which runs on the larger table T50K.
From 11.2.0.4

SINGLE TABLE ACCESS PATH 
  Single Table Cardinality Estimation for T50K[T50K] 
  Table: T50K  Alias: T50K
    Card: Original: 50000.000000  Rounded: 50000  Computed: 50000.00  Non Adjusted: 50000.00
  Access Path: TableScan
    Cost:  13793.68  Resp: 13793.68  Degree: 0
      Cost_io: 13786.00  Cost_cpu: 372959932
      Resp_io: 13786.00  Resp_cpu: 372959932
kkeCostToTime: using io calibrate stats 
 maxmbps=5(MB/s) maxpmbps=10(MB/s) 
 block_size=8192 mb_io_count=4 mb_io_size=30247 (bytes) 
 tot_io_size=398(MB) time=39789(ms)
AutoDOP: Consider caching for T50K[T50K](obj#70601) 
cost:13793.68 blkSize:8192 objSize:50897.00 marObjSize:48352.15 bufSize:37240.00 affPercent:80 smallTab:NO affinitized:NO
AutoDOP:parallel operation is set to NO
AutoDOP:parallel operation is set to YES
AutoDOP: Table/Index(#70601) access (scan) cost=13793.68 estTime=39789.45 unit=10000.00 dop=3 -> maxdop=3
  Best:: AccessPath: TableScan
         Cost: 13793.68  Degree: 1  Resp: 13793.68  Card: 50000.00  Bytes: 0

The table this time is around 398MB, so 398 / 10 = 39789ms, which is larger than 10 seconds (PARALLEL_MIN_TIME_THRESHOLD); hence the SQL is a candidate to run using PX, reported in the second “AutoDOP: parallel operation is set to YES” (again this is a guess). The DoP is computed as usual as time/unit, that is 3.9, truncated to 3.
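Running Q2's numbers through the same math shows both decisions at once. As before, this is a sketch of my reading of the trace (threshold and unit both assumed to be the 10000ms values seen there):

```python
# Q2 through the same AutoDOP math (interpretation of the trace, not
# documented behavior): candidate for PX if the estimated time exceeds
# PARALLEL_MIN_TIME_THRESHOLD, with DoP = time/unit truncated to an integer.
def autodop_decision(tot_io_size_mb, max_pmbps,
                     threshold_ms=10000.0, unit_ms=10000.0):
    time_ms = tot_io_size_mb / max_pmbps * 1000.0   # 398MB / 10MB/s -> 39800ms
    candidate = time_ms > threshold_ms              # over the 10s threshold?
    dop = int(time_ms / unit_ms)                    # 3.98 truncated to 3
    return candidate, dop

decision = autodop_decision(398, 10)   # (True, 3): candidate for PX, DoP 3
```

The same function applied to Q1's 79MB returns (False, 0), matching the serial outcome above.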

AUTO DOP PLANS EVALUATION
***************************************
Compilation completed with Dop: 1.
  Cost_io:  13786.00  Cost_cpu: 372959931.68
  Card:     50000.00  Bytes:    300400000.00
  Cost:     13793.68  Est_time:  39789ms
kkopqCombineDop: Dop:3 Hint:no 
Query: compute:yes forced:no  computedDop:3 forceDop:0
Serial plan is expensive enough to be a candidate for parallelism (13794)
Signal reparse with DOP 3.

The SQL is then re-parsed and the plan with DoP 3 is found to be cheaper than the serial one, hence accepted.

AUTO DOP PLANS EVALUATION
***************************************
Compilation completed with Dop: 3.
  Cost_io:  5105.93  Cost_cpu: 3888888.89
  Card:     50000.00  Bytes:    300400000.00
  Cost:     5106.01  Est_time:  14729ms
Comparing plan with dop=3 and plan with dop=1:
  dop=3 io=5105.93 cpu=3888889 cost=5106.01 card=50000 bytes=300400000 -> est=14729ms, scaled cost=7659.01
  dop=1 io=13786.00 cpu=372959932 cost=13793.68 card=50000 bytes=300400000 -> est=39789ms, scaled cost=13793.68
Plan with dop 3 is better. Scalability: 50
Current plan with dop=3 is better than best plan with dop=1
Costing completed. DOP chosen: 3.

Again, nothing new so far. Let’s now take a look at 12.1.0.2 for Q2:

kkecComputeAPDopCS: CPUtabBytes: 416948224.000000 CPUtabRows: 50000.000000  rowSize: 6008
kkecComputeAPDopCS: Basic Stats Level
kkecComputeDOP: Proc Rate: 1048576000.000000 - Cost: 416948224.000000 - Time: 397.632812
kkecComputeDOP: Proc Rate: 1000000.000000 - Cost: 50000.000000 - Time: 50.000000
kkecComputeAPDopCS: Before Filter Preds - CPU Dop: 0.039763
kkecComputeAPDopCS: Final Filter Predicate Bytes: 0.000000 Rows: 0.000000
kkecComputeAPDop: Final -  CPU Dop: 0.039763 Rate: 1000.000000
AutoDOP: Consider caching for T50K[T50K](obj#138273) 
cost:13793.044423 blkSize:8192 objSize:50897.00 marObjSize:48352.15 bufSize:41160.00 affPercent:80 smallTab:NO affinitized:NO
kkeCostToTime: using io calibrate stats maxpmbps=10(MB/s) 
 block_size=8192 mb_io_count=4 mb_io_size=30247 (bytes) 
 tot_io_size=398(MB) time=39788(ms)
AutoDOP:parallel operation is set to NO
AutoDOP:parallel operation is set to YES
AutoDOP: Table/Index(#138273) access (scan) cost=13793.044423 estTime=39787.63 unit=10000.00 dop=3.000 -> maxdop=3
kkecComputeAPDop: IO Dop: 3.000000 - CPU Dop: 0.039763

A slightly higher CPU DoP, but still extremely low (0.039), and the same IO DoP as in 11.2.0.4, hence the same DoP is computed here as well.
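
The CPU DoP figures in these traces can be reproduced with a couple of lines of arithmetic. This is a back-fitted sketch: the processing rates and the 10000ms unit come straight from the trace lines, while taking the max of the two times is my guess from fitting both Q2 and Q3.

```python
BYTE_RATE = 1048576000.0   # bytes/sec processing rate from the trace
ROW_RATE  = 1000000.0      # rows/sec processing rate from the trace
UNIT_MS   = 10000.0        # parallel_min_time_threshold expressed in ms

def cpu_dop(tab_bytes, tab_rows):
    t_bytes = tab_bytes / BYTE_RATE * 1000.0   # "Time: 397.632812"
    t_rows  = tab_rows  / ROW_RATE  * 1000.0   # "Time: 50.000000"
    return max(t_bytes, t_rows) / UNIT_MS

print(round(cpu_dop(416948224, 50000), 6))     # Q2 above: 0.039763
print(round(cpu_dop(290586624, 16000000), 1))  # Q3 later: 1.6
```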

AUTO DOP PLANS EVALUATION
***************************************
Compilation completed with Dop: 1.
  Cost_io:  13786.000000  Cost_cpu: 372959932
  Card:     50000.000000  Bytes:    300400000.000000
  Cost:     13793.044423  Est_time:  39788ms
kkopqCombineDop: Dop:3 Hint:no 
Query: compute:yes forced:no  scanDop:3 cpuDop:1 forceDop:0
Serial plan is expensive enough to be a candidate for parallelism (13793)
Signal reparse with DOP 3.
.....
AUTO DOP PLANS EVALUATION
***************************************
Compilation completed with Dop: 3.
  Cost_io:  5105.925926  Cost_cpu: 3888889
  Card:     50000.000000  Bytes:    300400000.000000
  Cost:     5105.999379  Est_time:  14729ms
kkopqCombineDop: Dop:3 Hint:no 
Query: compute:yes forced:no  scanDop:3 cpuDop:1 forceDop:0
Adopt new plan with dop=3
Costing completed. DOP chosen: 3.

So the new algorithm led to the same DoP (which is good, no surprises) because the CPU DoP was too small to make a dent.

##### Q3 #####

Let’s now look into Q3, which is where things start to change and get a little trickier (to make things easier to report I turned off _optimizer_distinct_agg_transform).

From 11.2.0.4

SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T16M[T16M]
  Table: T16M  Alias: T16M
    Card: Original: 16000000.000000  Rounded: 16000000  Computed: 16000000.00  Non Adjusted: 16000000.00
  Access Path: TableScan
    Cost:  9697.62  Resp: 9697.62  Degree: 0
      Cost_io: 9643.00  Cost_cpu: 2653501900
      Resp_io: 9643.00  Resp_cpu: 2653501900
kkeCostToTime: using io calibrate stats
 maxmbps=5(MB/s) maxpmbps=10(MB/s)
 block_size=8192 mb_io_count=4 mb_io_size=30247 (bytes)
 tot_io_size=280(MB) time=27974(ms)
AutoDOP: Consider caching for T16M[T16M](obj#70603)
cost:9697.62 blkSize:8192 objSize:35597.00 marObjSize:33817.15 bufSize:37240.00 affPercent:80 smallTab:NO affinitized:NO
AutoDOP:parallel operation is set to NO
AutoDOP:parallel operation is set to YES
AutoDOP: Table/Index(#70603) access (scan) cost=9697.62 estTime=27973.89 unit=10000.00 dop=2 -> maxdop=2
  Best:: AccessPath: TableScan
         Cost: 9697.62  Degree: 1  Resp: 9697.62  Card: 16000000.00  Bytes: 0

So the IO DoP is 2 and the E-Time is larger than 10 seconds, so the CBO decides to go PX.
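
Both the cost-to-time conversion and the IO DoP derivation follow from the trace numbers. A hedged sketch, assuming the I/O calibration model shown in kkeCostToTime; the final truncation is inferred by fitting the two traces (estTime 27973.89 -> dop 2, estTime 39787.63 -> dop 3), it is not documented.

```python
import math

def cost_to_time_ms(tot_io_size_mb, maxpmbps=10):
    # I/O calibration: MB to scan divided by the per-process scan rate
    return tot_io_size_mb / maxpmbps * 1000.0

def io_dop(est_time_ms, unit_ms=10000.0):
    # one "unit" (parallel_min_time_threshold) of estimated time per slave
    return max(1, math.floor(est_time_ms / unit_ms))

print(io_dop(27973.89))  # -> 2, as in "dop=2 -> maxdop=2"
print(io_dop(39787.63))  # -> 3, as in "dop=3.000 -> maxdop=3"
```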

AUTO DOP PLANS EVALUATION
***************************************
Compilation completed with Dop: 1.
  Cost_io:  9643.00  Cost_cpu: 2653501899.68
  Card:     1.00  Bytes:    5.00
  Cost:     9697.62  Est_time:  27974ms
kkopqCombineDop: Dop:2 Hint:no
Query: compute:yes forced:no  computedDop:2 forceDop:0
Serial plan is expensive enough to be a candidate for parallelism (9698)
Signal reparse with DOP 2.
....
AUTO DOP PLANS EVALUATION
***************************************
Compilation completed with Dop: 2.
  Cost_io:  5357.22  Cost_cpu: 1333333333.33
  Card:     1.00  Bytes:    5.00
  Cost:     5384.67  Est_time:  15533ms
Comparing plan with dop=2 and plan with dop=1:
  dop=2 io=5357.22 cpu=1333333333 cost=5384.67 card=1 bytes=5 -> est=15533ms, scaled cost=5384.67
  dop=1 io=9643.00 cpu=2653501900 cost=9697.62 card=1 bytes=5 -> est=27974ms, scaled cost=9697.62
Plan with dop 2 is better. Scalability: 50
Current plan with dop=2 is better than best plan with dop=1
Costing completed. DOP chosen: 2.

So 11.2.0.4 re-parsed the SQL and decided to go with DoP = 2 because of the IO DoP.

In 12.1.0.2 the CPU DoP makes a difference this time. From the 10053 trace:

kkecComputeAPDopCS: CPUtabBytes: 290586624.000000 CPUtabRows: 16000000.000000  rowSize: 5
kkecComputeAPDopCS: Basic Stats Level
kkecComputeDOP: Proc Rate: 1048576000.000000 - Cost: 290586624.000000 - Time: 277.125000
kkecComputeDOP: Proc Rate: 1000000.000000 - Cost: 16000000.000000 - Time: 16000.000000
kkecComputeAPDopCS: Before Filter Preds - CPU Dop: 1.600000
kkecComputeAPDopCS: Final Filter Predicate Bytes: 0.000000 Rows: 0.000000
kkecComputeAPDop: Final -  CPU Dop: 1.600000 Rate: 1000.000000
AutoDOP: Consider caching for T16M[T16M](obj#138277) 
cost:9658.102216 blkSize:8192 objSize:35472.00 marObjSize:33698.40 bufSize:41160.00 affPercent:80 smallTab:NO affinitized:NO
kkeCostToTime: using io calibrate stats maxpmbps=10(MB/s) 
 block_size=8192 mb_io_count=4 mb_io_size=30247 (bytes) 
 tot_io_size=279(MB) time=27860(ms)
AutoDOP:parallel operation is set to NO
AutoDOP:parallel operation is set to YES
AutoDOP: Table/Index(#138277) access (scan) cost=9658.102216 estTime=27859.91 unit=10000.00 dop=2.000 -> maxdop=2
kkecComputeAPDop: IO Dop: 2.000000 - CPU Dop: 1.600000
Transfer optimizer annotations for T16M[T16M]
kkecComputeGbyObyAggrDOP: rowSize: 5.000000
kkecComputeGbyObyAggrDOP: gbyRows: 16000000.000000 gbyBytes: 80000000.000000
kkecComputeGbyObyAggrDOP: obyRows: 16000000.000000 obyBytes: 80000000.000000
kkecComputeGbyObyAggrDOP: before aggr Gby - IO Dop: 0.000000  CPU Dop: 0.000000
kkecComputeDOP: Proc Rate: 1048576000.000000 - Cost: 80000000.000000 - Time: 76.293945
kkecComputeDOP: Proc Rate: 1000000.000000 - Cost: 16000000.000000 - Time: 16000.000000
kkecComputeGbyObyAggrDOP: Basic level Aggr DOP 1.600000
kkecComputeGbyObyAggrDOP: Gby - IO Dop: 0.000000 CPU Dop: 1.600000 GbyRate: 1000.000000
kkecComputeGbyObyAggrDOP: Oby - IO Dop: 0.000000 CPU Dop: 0.000000  ObyRate: 0.000000

The important parts of the extract above are a CPU DoP of 1.6 and an IO DoP of 2 because of the access to table T16M, plus an Aggr DoP of 1.6 because of the count(distinct).
Since the E-Time is greater than 10 seconds, the CBO decides to go PX and triggers a reparse:

AUTO DOP PLANS EVALUATION
***************************************
Compilation completed with Dop: 1.
  Cost_io:  9608.000000  Cost_cpu: 2652611720
  Card:     1.000000  Bytes:    5.000000
  Cost:     9658.102216  Est_time:  27860ms
kkopqCombineDop: Dop:2 Hint:no 
Query: compute:yes forced:no  scanDop:2 cpuDop:1 forceDop:0
Serial plan is expensive enough to be a candidate for parallelism (9658)
Signal reparse with DOP 2.

Now this is where things get a little tricky and my understanding could easily be even more wrong than before 😀

AUTO DOP PLANS EVALUATION
***************************************
Compilation completed with Dop: 2.
  Cost_io:  5337.777778  Cost_cpu: 1333333333
  Card:     1.000000  Bytes:    5.000000
  Cost:     5362.961620  Est_time:  15470ms
kkopqCombineDop: Dop:3 Hint:no 
Query: compute:yes forced:no  scanDop:2 cpuDop:3 forceDop:0
Adopt new plan with dop=2
Signal reparse with DOP 3.
....
AUTO DOP PLANS EVALUATION
***************************************
Compilation completed with Dop: 3.
  Cost_io:  3558.518519  Cost_cpu: 888888889
  Card:     1.000000  Bytes:    5.000000
  Cost:     3575.307747  Est_time:  10313ms
kkopqCombineDop: Dop:3 Hint:no 
Query: compute:yes forced:no  scanDop:2 cpuDop:3 forceDop:0
Adopt new plan with dop=3
Costing completed. DOP chosen: 3.

I omitted the details of the DoP computation, trying to keep this extremely long post just a little shorter.
The three green lines in the previous two sections show the IO DoP and the CPU DoP: the IO DoP is consistently 2, but the CPU DoP starts as 1 and then becomes 3.
Starting from the first re-parse the IO DoP is not computed anymore, as recomputing it would provide no benefit since table T16M is still the same size; instead the CBO carries on annotations (some details of what an annotation is are explained here, though it has nothing to do with that bug) holding the pre-computed DoPs

Transfer optimizer annotations for T16M[T16M]
AutoDOP: dop(io):2.000
AutoDOP: rate(cpu):1000.000000 dop(cpu):1.600

and I think at this point the Aggr DoP (1.6) is added to the CPU DoP from T16M to come up with a CPU DoP of 3.
I have to admit I’m not sure why this isn’t done at the first iteration; maybe just to save time, since a large IO DoP is enough “to get started” with a reparse.
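
For what it’s worth, the arithmetic fits. This is pure speculation fitted to this single trace: I’m assuming the scan CPU DoP and the Aggr DoP are simply summed and then truncated.

```python
import math

scan_cpu_dop = 1.6  # "kkecComputeAPDop: ... CPU Dop: 1.600000"
aggr_cpu_dop = 1.6  # "kkecComputeGbyObyAggrDOP: Basic level Aggr DOP 1.600000"

combined = math.floor(scan_cpu_dop + aggr_cpu_dop)
print(combined)  # -> 3, matching "cpuDop:3" in the second reparse
```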

This is where the new AutoDOP starts to behave differently and return a different DoP for a SQL (3 instead of 2).

I haven’t seen enough real-life AutoDOP cases in 12c to say whether it works better than before, but it certainly seems more aware of the whole plan, factoring in additional (i.e. CPU-only) steps to come up with a more accurate DoP.
I suspect most SQLs won’t get a different DoP, simply because most large parallel executions are constrained by IO scans rather than by large CPU operations, but it’s exactly in those environments with large sorts or aggregations that the new AutoDOP can surprise you, for good or for bad 😉


1 Comment

How to get a SQL Testcase in a single step

Regardless of why you are playing with a specific SQL (tuning it, avoiding an ORA error, working around a wrong result, etc.), it’s always helpful to be able to reproduce the same behavior in some other environment, so that you can break everything with no impact on the application or the users.

Creating a SQL testcase requires putting together quite a lot of info and it can be very time consuming; SQLTXPLAIN (SQLT) does an amazing job at that (details here), on top of a lot of other things. Anyway, SQLT isn’t always around, and it’s not just a TC generator: SQLT collects a lot of other info, so it might take some time to extract what you need.

Starting with 11gR1 Oracle introduced the Testcase Builder (TCB) as part of the Oracle Database Fault Diagnosability Infrastructure (ADRCI, DBMS_HM and DBMS_SQLDIAG, to keep it simple). Basically it’s a set of APIs to generate a testcase starting from either a SQL ID or a SQL text.

The easiest way to extract a TC is to use EXPORT_SQL_TESTCASE passing a SQL_ID.

SQL> select sql_id, sql_text from v$sql where sql_text = 'select * from test_tcb';

SQL_ID        SQL_TEXT
------------- -------------------------------
czq6v85ut47uq select * from test_tcb

SQL> var c1 clob

SQL> begin
 dbms_sqldiag.export_sql_testcase(directory => 'TCB_DIR',
                                  sql_id    => 'czq6v85ut47uq', 
                                  testcase  => :c1);
end;
/

All the files will have a system-generated common prefix (since we didn’t specify any value for the testcase_name parameter). The number of files changes depending on the parameter values; we used all the defaults here, so the list will look more or less like this:

oratcb1_000657400001dpexp.dmp	oratcb1_000657400001sql.xml
oratcb1_000657400001dpexp.log	oratcb1_000657400001ssimp.sql
oratcb1_000657400001dpexp.sql	oratcb1_000657400001.trc
oratcb1_000657400001dpimp.sql	oratcb1_000657400001ts.xml
oratcb1_000657400001main.xml	oratcb1_000657400001xplf.sql
oratcb1_000657400001ol.xml	oratcb1_000657400001xplo.sql
oratcb1_000657400001prmimp.sql	oratcb1_000657400001xpls.sql
oratcb1_000657400001README.txt	oratcb1_000657400001xpl.txt
oratcb1_000657400001smrpt.html

The most important ones are

  1. the README – it includes instructions on how to import the TC
  2. the main.xml – it has the reference/map to every other file
  3. the dpexp.dmp – this is the expdp dump of our objects (and optionally their data)
  4. the xpl*.sql – these scripts allow us to replay (explain or run) our SQL with/without the outline

Now it’s just a matter of moving the files to another environment and importing the TC back.

SQL> begin
 dbms_sqldiag.import_sql_testcase(directory => 'TCB_DIR',
                                  filename  =>'oratcb1_000657400001main.xml');
end;
/

SQL> select * from test_tcb;  -- or any of the xpl*.sql files

no rows selected

TCB also has the ability to provide some other useful info, in detail:

  • it can export data (or just a sample of it) and PL/SQL package bodies if needed (they usually aren’t); neither is exported by default
  • it can extract runtime info like Dynamic Sampling, Dynamic Plans, the list of binds, etc
  • it can extract AWR reports and SQL Monitor reports for the SQL
  • it can export the history of the statistics since day X
  • it can run the SQL N times and capture runtime info at the end of each execution

Cherry on the cake: since version v1501, SQLd360 automatically exports the TC using TCB, so there is no additional step to perform, you just need to execute SQLd360 😀

I’ve been using TCB every time I needed a TC for a while now and so far it has always worked well for me; just a couple of “problems” are worth mentioning:

  • the ctrlOptions parameter allows a lot of flexibility but it doesn’t raise any error if the syntax is wrong (this is expected), so you don’t realize you messed up the syntax until the TC is generated and the info you wanted is missing
  • even though IMPORT_SQL_TESTCASE has an ignoreStorage parameter, the storage clauses are not ignored if the main.xml references the list of tablespaces (and it always does). The workaround is to edit the main.xml and remove the reference to the tablespaces file (ts.xml)
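
The ts.xml workaround above can be scripted. The sketch below uses an invented, minimal stand-in for the real oratcb1_*main.xml (I’m not reproducing the actual TCB XML layout here), so inspect your real file before editing it blindly, and keep a backup.

```python
from pathlib import Path

# Hypothetical stand-in for the real oratcb1_*main.xml; the element
# layout is invented just to demonstrate the idea.
main_xml = Path("main.xml")
main_xml.write_text(
    "<testcase>\n"
    "  <file>oratcb1_000657400001dpexp.dmp</file>\n"
    "  <file>oratcb1_000657400001ts.xml</file>\n"
    "</testcase>\n"
)

Path("main.xml.orig").write_text(main_xml.read_text())  # keep a backup

# drop every line that references the tablespaces file (ts.xml)
kept = [l for l in main_xml.read_text().splitlines() if "ts.xml" not in l]
main_xml.write_text("\n".join(kept) + "\n")
```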

Least and absolutely not last, TCB has the ability to extract info directly from the kernel that is not exposed through any SQL- or PL/SQL-friendly interface, so it sounds like TCB will be *THE* way to extract TCs in the future; better get familiar with it now 😉


1 Comment

Manually holding an enqueue with SystemTap

This week I stumbled into a system that showed contention on an enqueue I had never seen before as a cause of trouble: the MS enqueue. Its description is pretty straightforward, from V$LOCK_TYPE:

SQL> select type, name, id1_tag, id2_tag, description from v$lock_type where type = 'MS';

TYP NAME                           ID1_TAG                        ID2_TAG    DESCRIPTION
--- ------------------------------ ------------------------------ ---------- -----------------------------------------------------------
MS  Materialized View Refresh Log  master object #                0          Lock held during materialized view refresh to setup MV log

So it’s easy to get an idea of which components are involved, but I wanted to dig a little deeper.

Spoiler alert, this post is not about enq MS contention 🙂

I wanted to be able to hold an enqueue for as long as I wanted so that I could fire other sessions and see under which conditions they would end up waiting on the enqueue holder. An enqueue version of what is described here for latches (as usual amazing job by Tanel and Andrey!).

There are already a couple of DTrace scripts (that I found, there are probably more) to achieve similar tasks (here and here); especially the one from Mikhail does exactly what I wanted. Anyway, my VM runs Linux OEL7 and I have no DTrace to play with, so I went for SystemTap.

This is not as sophisticated as the latch mechanism, where using oradebug we can just grab whatever latch we want; in this case we need to make our session voluntarily “grab the resource” and the script will just prevent the session from releasing it. Also, I didn’t need all the features of Mikhail’s script (i.e. blocking only for a specific obj#) so I kept it simple to start with.

I just needed to identify the enqueue for my resource and stop the process from releasing it once done with it (I’ll skip the description of the structures involved since they are already explained in both Tanel’s and Mikhail’s scripts).

The MS is X$KSIRESTYP.INDX 167 in 12.1.0.2

SQL> select rest.indx i, rest.resname type
       from X$KSIRESTYP rest, X$KSQEQTYP eqt
      where rest.inst_id = eqt.inst_id
        and rest.indx = eqt.indx
        and rest.indx > 0
        and rest.resname = 'MS';

         I TYP
---------- ---
       167 MS

so the script looks something like this

[oracle@localhost systemtap]$ cat trace_eq.stp
global enqAddress
global target_eq = 167;
global getTime

probe process("oracle").function("ksqgtlctx") {
    if (int_arg(5) == target_eq) {
        enqAddress = s64_arg(1);
        getTime    = gettimeofday_us();
        printf("%d function=%s getting enqueue: pid=%d enqAddress=0x%x locktype#=%d\n", getTime, usymname(ustack(0)), pid(), enqAddress, int_arg(5));
    }
}

probe process("oracle").function("ksqr*") {
    if (s64_arg(1) == enqAddress) {
        releaseTime = gettimeofday_us();
        heldTime    = releaseTime - getTime;
        printf("%d function=%s releasing enqueue: pid=%d enqAddress=0x%x held for=%d us \n", releaseTime, usymname(ustack(0)), pid(), s64_arg(1), heldTime);
        raise(%{ SIGSTOP %});
        enqAddress = 0;
    }
}

In a nutshell, the script keeps track of the enqueue address for the enqueue protecting the resource of interest (167 here) and stops the process (SIGSTOP) when it tries to release it. The release probe uses a wildcard, “ksqr*”, because different functions release different enqueues (e.g. MS is released by ksqrcli, TM by ksqrcl).

Assuming my user process is PID 4202 (sid 142), we can make it stop while holding MS just by loading the stap script

[oracle@localhost systemtap]$ stap -g trace_eq.stp -x 4202

and trigger a refresh (since the enq MS “has something to do” with MV refresh from what V$LOCK_TYPE told me) from the user session pid 4202

SQL> exec dbms_snapshot.refresh('TEST_MV');

on the stap terminal you’ll see something like

1436194076252628 function=ksqgtlctx getting enqueue: pid=4202 enqAddress=0x86c38a70 locktype#=167
1436194076295783 function=ksqrcli releasing enqueue: pid=4202 enqAddress=0x86c38a70 held for=43155 us

and the session PID 4202 will indeed be stopped, while holding the MS enqueue

SQL> select * from v$lock where sid = 142;

       SID TYP        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
---------- --- ---------- ---------- ---------- ---------- ---------- ----------
       142 AE         133          0          4          0      69961          0
       142 MS       96149          0          6          0        352          0
       142 JI      103680          0          6          0        353          0

Now I can fire all the sessions I want and see under which conditions they get blocked, under which stack, etc etc.
To make the user session resume just CTRL+C the stap script and send a SIGCONT signal to the process (kill -s SIGCONT).

This is just a very raw script to make a process stop while holding an enqueue, it has a lot of limitations and several refinements should be implemented to match the functionalities of the other scripts mentioned before, so don’t expect much from it 🙂


3 Comments

Missing a Bloom filter…

The other day I stumbled into an execution plan that made me pause for a while so hopefully if you see something similar tomorrow you will save some time 🙂

The plan was about 100 steps long but there is one specific join I want to focus on, so stripping away all the “noise” I reproduced it in house with the following steps and got to a plan that looked like this:

create table mydim (filter1 number, col1 number);
create table myfact (col1 number, col2 number)
partition by range(col1)
subpartition by hash(col2)
subpartitions 16
(partition p1 values less than (5),
 partition p2 values less than (10),
 partition p3 values less than (15),
 partition p4 values less than (20),
 partition p5 values less than (25));
insert into mydim select mod(rownum,10), rownum from dual connect by rownum <= 100;
insert into myfact select mod(rownum,25), rownum from dual connect by rownum <= 100000;
commit;
exec dbms_stats.gather_table_stats(user,'MYDIM');
exec dbms_stats.gather_table_stats(user,'MYFACT');
alter table myfact parallel 2;

select count(*) 
  from myfact, mydim 
 where mydim.filter1 = 3 
   and mydim.col1 = myfact.col2;

--------------------------------------------------------------------------------------
| Id |Operation                          |Name    |Rows  |Cost (%CPU)| Pstart| Pstop |
--------------------------------------------------------------------------------------
|   0|SELECT STATEMENT                   |        |    1 |  558   (1)|       |       |
|   1| SORT AGGREGATE                    |        |    1 |           |       |       |
|   2|  PX COORDINATOR                   |        |      |           |       |       |
|   3|   PX SEND QC (RANDOM)             |:TQ10001|    1 |           |       |       |
|   4|    SORT AGGREGATE                 |        |    1 |           |       |       |
|*  5|     HASH JOIN                     |        |   10 |  558   (1)|       |       |
|   6|      BUFFER SORT                  |        |      |           |       |       |
|   7|       PART JOIN FILTER CREATE     |:BF0000 |   10 |    3   (0)|       |       |
|   8|        PX RECEIVE                 |        |   10 |    3   (0)|       |       |
|   9|         PX SEND BROADCAST         |:TQ10000|   10 |    3   (0)|       |       |
|* 10|          TABLE ACCESS FULL        |MYDIM   |   10 |    3   (0)|       |       |
|  11|      PX PARTITION HASH JOIN-FILTER|        |  100K|  555   (1)|:BF0000|:BF0000|
|* 12|       TABLE ACCESS FULL           |MYFACT  |  100K|  555   (1)|     1 |    80 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("MYDIM"."COL1"="MYFACT"."COL2")
  10 - filter("MYDIM"."FILTER1"=3)
  12 - filter(SYS_OP_BLOOM_FILTER(:BF0001,"MYFACT"."COL2"))

So it’s a two-table join where a Bloom filter (BF0000) is created from the data extracted from MYDIM and later used to prune MYFACT; the strange part is that the FTS at step 12 uses another Bloom filter (BF0001) that is nowhere to be found in the plan.

MYFACT is RANGE partitioned on COL1 and HASH subpartitioned on COL2, and BF0000 is used to determine which subpartitions to scan. With the table composite partitioned and BF pruning involved, the risk of running into something like this was high, but in my case I did have evidence of a BF applied in Pstart/Pstop; I just didn’t have any evidence of where the other BF came from.
One of the things I usually do when I can’t figure something out is prove the things that look legit first, to determine what can be trusted and what can’t (it helps a lot when working on wrong results).
Is it just a display glitch in the filter section, and BF0001 should be BF0000? Or is it a display glitch in the plan, which is missing some steps? Is BF0000 even used to prune? Is BF0001 really used to skip rows? Etc. etc.

The first step was to prove BF0000 was indeed used to prune. Event 10128 tracks pruning info during execution, so it’s a great way to confirm things if you can run the SQL; details here.
From one of the slave traces (call time RUN is the only one present in the slave traces, the others are in the QC trace):

Partition Iterator Information:
  partition level = ABSOLUTE
  call time = RUN
  order = ASCENDING
  Partition iterator for level 1:
   iterator = RANGE [0, 4]
   index = 0
partition iterator for level 2:
   iterator = ARRAY [count= 8, max = 16] = { 2 3 4 5 6 13 14 15 }
   index = 2

Level 1/2 is the partition/subpartition level, so from the trace above we accessed all the partitions (no surprise here because there is no filter on col1) but only 8 subpartitions out of 16, so it looks like BF0000 was indeed used to skip subpartitions. When you use this trace just remember partition “IDs” are counted from 0 while partition_position in the dictionary starts from 1, so the numbers are all shifted by 1 (i.e. the subpartition 2 above is indeed the 3rd one).
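
A trivial helper to translate the 0-based IDs in the 10128 trace into the 1-based PARTITION_POSITION used by the dictionary views; purely a convenience for reading the trace above.

```python
def to_partition_position(trace_ids):
    # 10128 trace counts from 0, *_TAB_SUBPARTITIONS.SUBPARTITION_POSITION from 1
    return [i + 1 for i in trace_ids]

# the level-2 (subpartition) iterator array from the trace
print(to_partition_position([2, 3, 4, 5, 6, 13, 14, 15]))
# -> [3, 4, 5, 6, 7, 14, 15, 16]: subpartition "2" is indeed the 3rd one
```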

The next step was to verify whether BF0001 was really used to filter out rows in the FTS, so I added a comment to the SQL to make the SQL ID different and ran it again, this time using format=>’ALLSTATS ALL’ to get stats aggregated across all the slaves plus the QC.

select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS ALL'));
select /* B */ count(*) from myfact, mydim where mydim.filter1 = 3 and mydim.col1 = myfact.col2

-----------------------------------------------------------------------------------------
| Id |Operation                          |Name    |Starts|E-Rows |Pstart |Pstop  |A-Rows|
-----------------------------------------------------------------------------------------
|   0|SELECT STATEMENT                   |        |     1|       |       |       |     1|
|   1| SORT AGGREGATE                    |        |     1|     1 |       |       |     1|
|   2|  PX COORDINATOR                   |        |     1|       |       |       |     2|
|   3|   PX SEND QC (RANDOM)             |:TQ10001|     0|     1 |       |       |     0|
|   4|    SORT AGGREGATE                 |        |     2|     1 |       |       |     2|
|*  5|     HASH JOIN                     |        |     2|    10 |       |       |    10|
|   6|      BUFFER SORT                  |        |     2|       |       |       |    20|
|   7|       PART JOIN FILTER CREATE     |:BF0000 |     2|    10 |       |       |    20|
|   8|        PX RECEIVE                 |        |     2|    10 |       |       |    20|
|   9|         PX SEND BROADCAST         |:TQ10000|     0|    10 |       |       |     0|
|* 10|          TABLE ACCESS FULL        |MYDIM   |     1|    10 |       |       |    10|
|  11|      PX PARTITION HASH JOIN-FILTER|        |     2|   100K|:BF0000|:BF0000|    10|
|* 12|       TABLE ACCESS FULL           |MYFACT  |    80|   100K|     1 |    80 |    10|
-----------------------------------------------------------------------------------------

Each partition has 20k rows and each subpartition around 1,250, so with no BF applied the FTS would return 50k rows (1,250 rows per subpartition * 8 subpartitions * 5 partitions); instead A-Rows shows the number of rows was 10 (this time; during other runs it was sometimes a little higher), so it really looks like BF0001 existed and was applied.
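
To see what SYS_OP_BLOOM_FILTER(:BF0001, col2) does conceptually, a toy Bloom filter helps: the build side (the MYDIM join keys) sets bits, the probe side (MYFACT.COL2) discards rows whose bits aren’t all set. This is a generic textbook sketch, not Oracle’s actual implementation.

```python
class BloomFilter:
    def __init__(self, nbits=1024, nhash=3):
        self.nbits, self.nhash = nbits, nhash
        self.bits = 0  # bit array kept as one big int

    def _positions(self, value):
        return [hash((seed, value)) % self.nbits for seed in range(self.nhash)]

    def add(self, value):                # build side (MYDIM scan)
        for p in self._positions(value):
            self.bits |= 1 << p

    def might_contain(self, value):      # probe side: no false negatives
        return all(self.bits >> p & 1 for p in self._positions(value))

bf = BloomFilter()
# mydim.col1 values where mydim.filter1 = 3 (col1 = rownum, filter1 = mod(rownum,10))
dim_join_keys = [i for i in range(1, 101) if i % 10 == 3]
for k in dim_join_keys:
    bf.add(k)

# every real match survives the filter; most non-matches are discarded early
assert all(bf.might_contain(k) for k in dim_join_keys)
```

A Bloom filter can let some non-matching rows through (false positives), which is why A-Rows was 10 on this run but occasionally a little higher on others.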

So the BFs are real and are used; time to doubt the execution plan.
The easiest way to do that was to test in a later version. The plan above came from 11.2.0.4, so it was natural to test 12.1.0.2, and surprise surprise, the plan made much more sense over there:

---------------------------------------------------------------------------------------
| Id |Operation                           |Name    |Rows  |Cost (%CPU)| Pstart| Pstop |
---------------------------------------------------------------------------------------
|   0|SELECT STATEMENT                    |        |    1 |12114   (1)|       |       |
|   1| SORT AGGREGATE                     |        |    1 |           |       |       |
|   2|  PX COORDINATOR                    |        |      |           |       |       |
|   3|   PX SEND QC (RANDOM)              |:TQ10001|    1 |           |       |       |
|   4|    SORT AGGREGATE                  |        |    1 |           |       |       |
|*  5|     HASH JOIN                      |        |   10 |12114   (1)|       |       |
|   6|      BUFFER SORT                   |        |      |           |       |       |
|   7|       JOIN FILTER CREATE           |:BF0001 |   10 |    3   (0)|       |       |
|   8|        PART JOIN FILTER CREATE     |:BF0000 |   10 |    3   (0)|       |       |
|   9|         PX RECEIVE                 |        |   10 |    3   (0)|       |       |
|  10|          PX SEND BROADCAST         |:TQ10000|   10 |    3   (0)|       |       |
|* 11|           TABLE ACCESS FULL        |MYDIM   |   10 |    3   (0)|       |       |
|  12|      JOIN FILTER USE               |:BF0001 |  100K|12111   (1)|       |       |
|  13|       PX PARTITION HASH JOIN-FILTER|        |  100K|12111   (1)|:BF0000|:BF0000|
|* 14|        TABLE ACCESS FULL           |MYFACT  |  100K|12111   (1)|     1 |    80 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("MYDIM"."COL1"="MYFACT"."COL2")
  11 - filter("MYDIM"."FILTER1"=3)
  14 - filter(SYS_OP_BLOOM_FILTER(:BF0001,"MYFACT"."COL2"))

Setting OFE=11.2.0.4 the plan switched back to the “strange” one, so it was time for SQLT XPLORE: starting from 12.1.0.2 and testing all the settings (parameters + fix_controls), looking for one that would cause the plan to “go back to strange”. The outcome was the fix for (internal) bug 12569316; unfortunately there is no backport on top of 11.2.0.4, so I couldn’t test whether the fix itself is enough or whether it needs some other code change to display the missing step.

At least I’m now positive such a step should indeed be present in the plan, and in the future (12c) it’s going to be there 😀


8 Comments

Historical SQL Monitor reports in 12c!

I think SQL Monitoring is one of the greatest additions to the Oracle performance diagnostics world in a long time (since 10046 maybe?) because it collects in a single shot a long list of crucial information that is otherwise painful to extract and put together. The report provides a complete picture of the execution across processes (PX), nodes (RAC), etc.
On the other hand, one of its major limitations (the biggest one, imho) is that SQL Monitor info is quickly aged out of memory and not stored in AWR (there is no DBA_HIST_SQL_MONITOR or similar), so it is unlikely to be around for a post-mortem investigation.

The good news is that in 12c we can pull a report for a historical execution, provided the execution was expensive enough to be a candidate for collection. I haven’t investigated the details of the automatic collection yet, but there are two capture modes, per DBA_HIST_REPORTS_CONTROL:

  • REGULAR – per-minute report capture subject to DBTIME budget
  • FULL_CAPTURE – capture will be run per minute without the DBTIME budget constraints

The FULL_CAPTURE can be enabled using DBMS_AUTO_REPORT.START_REPORT_CAPTURE/FINISH_REPORT_CAPTURE.

Info about each automatically collected report is stored in DBA_HIST_REPORTS, and the report itself (in XML format) is stored in DBA_HIST_REPORTS_DETAILS.

The whole framework is also used for Real-Time ADDM, so the DBA_HIST_REPORTS* views are not organized in a SQL Monitor-friendly way (i.e. SQL ID / SQL Exec ID / SQL Exec Start) but rather in a report-oriented way; the key is REPORT_ID.
Column COMPONENT_NAME helps track down the source of the report, “sqlmonitor” in this case.
A summary of the report is stored in REPORT_SUMMARY in XML format, so a simple SQL like the following pulls the list of REPORT_ID/SQL_ID pairs (plus anything else you may want to extract from the summary, e.g. SQL Exec ID and SQL Exec Start):

SELECT report_id,
       EXTRACTVALUE(XMLType(report_summary),'/report_repository_summary/sql/@sql_id') sql_id,
       EXTRACTVALUE(XMLType(report_summary),'/report_repository_summary/sql/@sql_exec_id') sql_exec_id,
       EXTRACTVALUE(XMLType(report_summary),'/report_repository_summary/sql/@sql_exec_start') sql_exec_start
  FROM dba_hist_reports
 WHERE component_name = 'sqlmonitor'
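
The same extraction can be done client-side by parsing REPORT_SUMMARY instead of using EXTRACTVALUE. Note the sample XML below is my guess at the shape implied by the XPath ‘/report_repository_summary/sql/@sql_id’, not a verbatim copy of a real summary.

```python
import xml.etree.ElementTree as ET

# hypothetical REPORT_SUMMARY content, shaped to match the XPath above
report_summary = """
<report_repository_summary>
  <sql sql_id="fx439nus0rtcz" sql_exec_id="16777216"
       sql_exec_start="04/29/2015 13:34:15"/>
</report_repository_summary>
"""

sql = ET.fromstring(report_summary).find('sql')
print(sql.get('sql_id'), sql.get('sql_exec_id'), sql.get('sql_exec_start'))
# -> fx439nus0rtcz 16777216 04/29/2015 13:34:15
```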

From my system I have

 REPORT_ID SQL_ID          SQL_EXEC_I SQL_EXEC_START
---------- --------------- ---------- ------------------------------
      1022 fx439nus0rtcz     16777216 04/29/2015 13:34:15
      1024 fjvsmy2yujbqd     16777216 04/29/2015 13:40:00
      1025 9qn59dh1w8352     16777216 04/29/2015 13:41:12
      1026 1uqrk6t8gfny8     16777216 04/29/2015 13:41:14

Using the REPORT_ID we can now extract the report in different format (HTML, TEXT, ACTIVE, XML) using DBMS_AUTO_REPORT.REPORT_REPOSITORY_DETAIL

SQL> set long 10000000 longchunksize 10000000 pages 0
SQL> SELECT DBMS_AUTO_REPORT.REPORT_REPOSITORY_DETAIL(RID => 1022, TYPE => 'text')
       FROM dual;
SQL Monitoring Report

SQL Text
------------------------------
select count(*) from test_inmemory

Global Information
------------------------------
 Status             : DONE (ALL ROWS)
 Instance ID        : 1
 Session            : MPAGANO (12:35570)
 SQL ID             : fx439nus0rtcz
 SQL Execution ID   : 16777216
 Execution Started  : 04/29/2015 13:34:15
 First Refresh Time : 04/29/2015 13:34:15
 Last Refresh Time  : 04/29/2015 13:34:15
 Duration           : .064582s
 Module/Action      : SQL*Plus/-
 Service            : orcl
 Program            : sqlplus@Mauros-iMac.local (TNS V1-V3)
 Fetch Calls        : 1
.....

or, if we want it in the flashy ACTIVE format

SQL> set trimspool on
SQL> set trim on
SQL> set pages 0
SQL> set linesize 1000
SQL> set long 1000000
SQL> set longchunksize 1000000
SQL> spool historical_sqlmon.sql
SQL> SELECT DBMS_AUTO_REPORT.REPORT_REPOSITORY_DETAIL(RID => 1022, TYPE => 'active')
       FROM dual;
SQL> spool off

[Screenshot: the ACTIVE format SQL Monitor report rendered in the browser]

Several details are missing from this post (and from my knowledge of how the whole framework works) but I’ll make sure to pass them along as soon as I find out more.

Hopefully this blog post will be useful next time you are trying to figure out why a SQL ran so slow last weekend 🙂

UPDATE: DBA_HIST_REPORTS also exposes SQL_ID, SQL_EXEC_ID, SQL_EXEC_START and several other pieces of information (concatenated together into a single column and separated by a ‘3#’ sign) in the generic columns KEY1, KEY2, KEY3 and KEY4. The following SQL provides the same information as the original SQL posted in this blog post

SELECT report_id, key1 sql_id, key2 sql_exec_id, key3 sql_exec_start
  FROM dba_hist_reports
 WHERE component_name = 'sqlmonitor'



Dynamic Sampling just got better

12c brought several new features in the SQL performance area, and the most advertised one is probably Adaptive Query Optimization. The Adaptive Plans feature has been described in great detail (especially the adaptive join methods part), and most of the focus in Adaptive Statistics has been on SQL Plan Directives and Automatic Re-optimization; not much has been said about Dynamic Statistics, aka the new dynamic sampling (DS).
Since DS is also helpful outside the whole Adaptive Query Optimization area, let's just focus on it.

The original DS would kick in “by default” in the absence of object statistics; increasing the level would make the CBO use it under a few other conditions as well (details here).
Starting with 11.2 (ER 7452863), DS can dynamically auto-adjust its own level on a per-query basis (hence kicking in even when it otherwise would not) when the SQL runs in parallel and large tables are involved.
Basically, the old DS usually does a good job of helping the CBO get good estimations *for single tables*

  • in absence of stats
  • when the level is “high” (from 4 up)
  • when the SQL runs in parallel and tables are large

What was just said implies the old DS doesn't help much if

  • the table has stats but the level is default
  • the SQL runs in serial
  • we are estimating anything but single-table selectivity
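
For completeness, the old DS can still be forced on a single table via the dynamic_sampling hint even when statistics exist; a sketch (emp is just a hypothetical table here, and the two AND-ed predicates on the same table are exactly the kind of case level 4 targets):

SQL> select /*+ dynamic_sampling(e 4) */ count(*)
       from emp e
      where deptno = 10
        and job = 'CLERK';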

The “new” dynamic sampling was introduced in 12.1.0.1 and backported to 11.2.0.4. It is “hidden” behind the new level 11, or AUTO (the value AUTO is accepted only in 12c), and it brings a lot of very interesting features

  • DS can kick in automatically when the CBO realizes it doesn't have solid enough estimations
  • DS can be applied on joins, group by and whole query blocks (haven’t looked too much yet into the query blocks one)
  • DS results can be cached using the result cache and exported transparently using TCB (Test Case Builder)

Let's play a little with an example; the test is from a 12.1.0.2 database.
Two tables with uniform data distribution and strong correlation across columns within the same table and between tables.

SQL> create table t1 as select mod(rownum, 50) n1, mod(rownum, 50) n2 from dual connect by rownum <= 1000;
SQL> create table t2 as select mod(rownum, 50) n1, mod(rownum, 50) n2 from dual connect by rownum <= 1000;
SQL> exec dbms_stats.gather_table_stats(user,'T1');
SQL> exec dbms_stats.gather_table_stats(user,'T2');

By default the join selectivity/cardinality is underestimated, since there is correlation but nothing to inform the CBO about it

SQL> select count(*) from t1,t2 where t1.n1 = t2.n1 and t1.n2 = t2.n2;

Plan hash value: 906334482

---------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |
---------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |
|*  2 |   HASH JOIN         |      |      1 |   1000 |  20000 |  <---
|   3 |    TABLE ACCESS FULL| T1   |      1 |   1000 |   1000 |
|   4 |    TABLE ACCESS FULL| T2   |      1 |   1000 |   1000 |
---------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."N1"="T2"."N1" AND "T1"."N2"="T2"."N2")

With such a simple join condition we can use a column group to fix the estimation, and indeed it works fine

SQL> exec dbms_stats.gather_table_stats(user,'T1', method_opt => 'FOR COLUMNS (N1,N2) SIZE AUTO');
SQL> exec dbms_stats.gather_table_stats(user,'T2', method_opt => 'FOR COLUMNS (N1,N2) SIZE AUTO');
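
To double check that the column groups were actually created, the extensions show up in USER_STAT_EXTENSIONS (the extension names are system-generated, so they will differ on your system):

SQL> SELECT table_name, extension_name, extension
       FROM user_stat_extensions
      WHERE table_name IN ('T1','T2');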

SQL> select count(*) from t1,t2 where t1.n1 = t2.n1 and t1.n2 = t2.n2;

---------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |
---------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |
|*  2 |   HASH JOIN         |      |      1 |  20000 |  20000 |
|   3 |    TABLE ACCESS FULL| T1   |      1 |   1000 |   1000 |
|   4 |    TABLE ACCESS FULL| T2   |      1 |   1000 |   1000 |
---------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."N1"="T2"."N1" AND "T1"."N2"="T2"."N2")

Anyway, a column group can only be used with equality conditions (a column group is basically a hash of all the columns in the group), so as soon as we change our SQL we are back to the underestimated selectivity/cardinality

SQL> select count(*) from t1,t2 where t1.n1 = t2.n1 and t1.n2 > t2.n2;

---------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |
---------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |
|*  2 |   HASH JOIN         |      |      1 |   9800 |      0 |
|   3 |    TABLE ACCESS FULL| T1   |      1 |   1000 |   1000 |
|   4 |    TABLE ACCESS FULL| T2   |      1 |   1000 |   1000 |
---------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."N1"="T2"."N1")
       filter("T1"."N2">"T2"."N2")

So let’s turn on the new DS and check how it helps

alter session set optimizer_dynamic_sampling = 11;
select count(*) from t1,t2 where t1.n1 = t2.n1 and t1.n2 > t2.n2;

---------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |
---------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |
|*  2 |   HASH JOIN         |      |      1 |      1 |      0 |
|   3 |    TABLE ACCESS FULL| T1   |      1 |   1000 |   1000 |
|   4 |    TABLE ACCESS FULL| T2   |      1 |   1000 |   1000 |
---------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."N1"="T2"."N1")
       filter("T1"."N2">"T2"."N2")

Note
-----
   - dynamic statistics used: dynamic sampling (level=AUTO)

Notice how the join cardinality is now very accurate (the estimate of 1 row is as close as the CBO can get to the actual 0 rows, since cardinality is never estimated below 1).
From the 10053 and 10046 traces (I wrapped the 10046 output to avoid a long scrollbar):

Join Card:  9800.000000 = outer (1000.000000) * inner (1000.000000) * sel (0.009800)
Join cardinality for HJ/SMJ (no post filters):  20000.000000, outer: 1000.000000, inner: 1000.000000, sel: 0.009800
>> Join Card adjusted from 9800.000000 to 1.000000 due to adaptive dynamic sampling, prelen=2
Adjusted Join Cards: adjRatio=0.000102 cardHjSmj=1.000000 cardHjSmjNPF=2.040816 cardNlj=1.000000 cardNSQ=1.000000 cardNSQ_na=9800.000000
Join Card - Rounded: 1 Computed: 1.000000


PARSING IN CURSOR #140546644531144 len=332 dep=1 uid=118 oct=3 lid=118 tim=173271358375 hv=1119371978 ad='c6e539a0' sqlid='f814kux1bhhqa'
SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring optimizer_features_enable(default) no_parallel result_cache(snapshot=3600) */ 
SUM(C1) FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "T1#1")  */ 1 AS C1 
FROM "T2" "T2#0", "T1" "T1#1" WHERE ("T1#1"."N1"="T2#0"."N1") AND ("T1#1"."N2">"T2#0"."N2")) innerQuery

Notice how the recursive DS SQL joins the two tables together; this wasn't possible before.
At the same time we want to avoid triggering those SQLs over and over, so the result is also cached

SELECT id, type, creation_timestamp, block_count, column_count, pin_count, row_count, name
  FROM V$RESULT_CACHE_OBJECTS
 WHERE cache_id = 'b1ytyjnksv3up3yjtn35bq96jx';

        ID TYPE       CREATION_ BLOCK_COUNT COLUMN_COUNT  PIN_COUNT  ROW_COUNT NAME
---------- ---------- --------- ----------- ------------ ---------- ---------- --------------------------------------------------------------------
         4 Result     03-DEC-14           1            1          0          1 SELECT /* DS_SVC */ /*+ cursor_sharing_exact dynamic_sampling(0) no_sql_tune no_monitoring optimizer_features_enable(default) re
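
In case the cache_id is not known upfront, the DS entries can be spotted by name, since the recursive SQL always starts with the DS_SVC comment; a sketch:

SELECT id, type, creation_timestamp, name
  FROM v$result_cache_objects
 WHERE type = 'Result'
   AND name LIKE 'SELECT /* DS_SVC */%';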

The previous test shows how the new DS can help in more complex scenarios (joins, group by, etc.) than before, but at the same time it means more time could potentially be spent on DS. For this reason the recursive SQL gets a limited amount of time, after which its execution aborts, as well as a default cap on the number of tables to join.
Those conditions, as well as the result cache usage, can be configured by playing with the following parameters:

NAME                            DESCRIPTION                                          SESSION_VALUE
------------------------------- ---------------------------------------------------- -------------
_optimizer_ads_max_table_count  maximum number of tables in a join under ADS         0
_optimizer_ads_time_limit       maximum time limit (seconds) under ADS               0
_optimizer_ads_use_result_cache use result cache for ADS queries                     TRUE
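
These are underscore parameters, so as usual change them only when advised by Oracle Support; a sketch of adjusting them at the session level (the values here are purely for illustration):

SQL> alter session set "_optimizer_ads_time_limit" = 30;
SQL> alter session set "_optimizer_ads_use_result_cache" = false;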

Adaptive Plans can help avoid picking an incorrect join method, but the join order cannot be changed at runtime; getting a more accurate estimation at parse time, on the other hand, can encourage a more efficient join order, and that's why I like the new DS so much.

IMHO this new feature has a lot of potential, because it helps address exactly those areas (joins, group by) where it's particularly challenging for the CBO to get consistently accurate estimations.