Mauro Pagano's Blog



SQLTXPLAIN vs SQLd360, differences and similarities

When talking about tools to help with SQL tuning, the question of SQLTXPLAIN vs SQLd360 comes up very often. “What’s the difference?”, “Which one should I use?”, “Why another tool?” are probably the top ones 🙂

I’ll try to make a fair and (as much as possible) unbiased comparison between the two on some general aspects. If you want to know about any specific area, just let me know in the comments.

Aspect | SQLTXPLAIN | SQLd360
Installation | Requires installation. Creates two schemas that serve as metadata and code repository; they need to be present every time the tool is executed | No installation; at the end of the execution no evidence is left behind
Source code | Mainly PL/SQL, unwrapped (clear text). Code is Oracle property | SQL scripts, clear text, true open source released under GPL. Code available on GitHub
Main Goal | Automate every aspect of diagnostic collection for SQL tuning | Same as SQLT, but less “obsessive” 🙂
Main Focus | Execution plan generation: why the CBO generated plan X, Y, Z (includes custom testcase, very flexible) | Execution plan generation (a little less than SQLT) and how plans execute at runtime
Main advantage | Includes every bit of info needed to drill into CBO decisions, so it is unlikely you will need to go back and collect anything else | Makes diagnostic consumption faster and easier because of data visualization; lots of focus on runtime
Main disadvantage | Requires installation. Not much focus on execution time (“same plan, different performance” cases are harder to deal with) | Might miss some information under rare circumstances

Outside the comparison table: what do we use today? SQLd360.
Reason(s):

  1. No installation.
  2. Root Cause Analysis of CBO decisions usually belongs to Oracle Support (for the most part); SQL tuning in the real world focuses on easier aspects of plan generation, so there is less need for a strong focus on it.
  3. The performance data available in Oracle grows with every release, and charting allows quick and much more effective consumption of large amounts of such data.



Simple script to remove system-generated column groups

There seems to be a lot of interest (at least on Twitter and at OUG conferences) in Oracle’s recommendation to install a couple of patches on top of 12.1.0.2, in order to emulate 12.2 behavior when it comes to SQL Plan Directives (details here, need MOS account).
One of the things SQL Plan Directives do is trigger column groups (CG) creation.
Column groups are virtual columns representing a hash (SYS_OP_COMBINED_HASH) of the multiple table columns they are defined on (that’s why only equality conditions can be satisfied by CGs) and they have an ugly, long, system-generated name. According to the DECODE in ALL_STAT_EXTENSIONS, user-generated CGs get a SYS_STU prefix in the name while system-generated ones get SYS_STS. As far as I could tell, only SPD-triggered CGs are named SYS_STS% (corrections are very welcome here); even those created as a consequence of using DBMS_STATS.SEED_COL_USAGE have a SYS_STU% name.
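As a quick illustration of the naming convention (a sketch; table T1 and columns C1, C2 are made-up names here), a manually created CG can be inspected right after creation:

```sql
-- Hypothetical example: T1, C1 and C2 are placeholder names.
-- Manually create a column group on (C1, C2)...
SELECT DBMS_STATS.CREATE_EXTENDED_STATS(user, 'T1', '(C1,C2)') FROM dual;

-- ...and check the generated name: being user-created, it should
-- show up with a SYS_STU% prefix rather than SYS_STS%
SELECT extension_name, extension
  FROM user_stat_extensions
 WHERE table_name = 'T1';
```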

The other day somebody asked how to remove those CGs in case one wanted to “start fresh” after applying the mentioned 12.1.0.2 patches. I wrote a little script that is by no means intended to be exhaustive (or fully tested) but is a good way to get started removing SPD-triggered CGs, so I figured I would share it. The script starts from the assumption that only SPD-triggered CGs have a SYS_STS% name.
The script does NOT remove the CGs by itself; it just creates another script that includes the DROP commands, so that you can read, digest and only then execute it manually. A second script is also created to put the CGs back in place (just the definition, no stats are gathered) in case some are indeed needed. A little side effect: since you put them back in place manually, the names become SYS_STU% and not SYS_STS%.

Code below

PRO
PRO usage @drop_extended_stats.sql connected as the user that owns the table
PRO and pass the table name when requested.
PRO
DEF current_table = '&&table_name.'
SET SERVEROUTPUT ON VERI OFF FEED OFF TIMING OFF
SPO drop_extended_stats_&&current_table._driver.sql
BEGIN
 FOR i IN (SELECT extension_name, extension
             FROM user_stat_extensions
            WHERE table_name = UPPER('&&current_table.')
              AND extension_name LIKE 'SYS_STS%' -- SYS_STS are system generated from the DECODE in ALL_STAT_EXTENSIONS
           ) LOOP

   DBMS_OUTPUT.PUT_LINE('-- Dropping extension on '||i.extension);
   DBMS_OUTPUT.PUT_LINE('exec DBMS_STATS.DROP_EXTENDED_STATS(user, ''&&current_table.'', '''||i.extension||''');');

 END LOOP;

END;
/
SPO OFF
SPO create_extended_stats_&&current_table._driver.sql
BEGIN
 FOR i IN (SELECT extension_name, extension
             FROM user_stat_extensions
            WHERE table_name = UPPER('&&current_table.')
              AND extension_name LIKE 'SYS_STS%' -- SYS_STS are system generated from the DECODE in ALL_STAT_EXTENSIONS
           ) LOOP

   DBMS_OUTPUT.PUT_LINE('-- Creating extension on '||i.extension);
   DBMS_OUTPUT.PUT_LINE('SELECT DBMS_STATS.CREATE_EXTENDED_STATS(user, ''&&current_table.'', '''||i.extension||''') FROM dual;');

 END LOOP;

END;
/
SPO OFF
SET SERVEROUTPUT OFF VERI ON FEED ON TIMING ON



SQL Monitoring, Flamegraph and Execution Plan Temperature 2.0

Two of the things that I like the most about SQL Monitoring reports are the ability to quickly spot where in the execution plan the time is spent (Activity% column, thank you ASH) and the fact you can collapse part of the plan. Too bad the two don’t “work” together, meaning that if you collapse a part of the plan the Activity% is not rolled up at the collapsed level. I understand why it works that way (it might be confusing otherwise) but I’d still like to be able to collapse a node and get a “subtree Activity%” so I know whether that subtree is something I should be worried about or not (kind of…).

Pause that for a second; Flamegraph, on the other hand, is really good at showing how much (in %) a “subtree” (function + its callees) accounts for overall. Luca Canali did an (as usual) awesome job here with something similar, but a bit more sophisticated than just execution plans. Unfortunately, the fact that the elements are not sorted on time (I think most “Oracle people” tend to expect time on the X-axis), combined with the kind-of-random color used for each bar, makes this representation a little less intuitive than I’d like it to be for execution plans (don’t get me wrong, I still love this representation a lot!).

SQLd360 has represented execution plans as a tree since one of the first releases, and for over a year the color of the nodes has represented a heatmap of where time has been spent (same as Activity% in SQL Monitoring). Starting a couple of releases ago there are a couple of improvements that I hope (I’d like to hear your feedback on this) fix the shortcomings of both SQL Monitoring and Flamegraph when it comes to digesting execution plans.
Each step in the execution plan can be collapsed (this has been there since day 1) and now its color will change to represent the color the node would have had if all the time the SQL spent on the node itself plus its child steps had been spent on the node itself; on expand the colors go back to the original ones.
Also, when hovering over a node a new piece of info shows up in the popup, a “subtree impact” that represents exactly what % of time the node plus its children account for.

[node.png: plan tree node popup showing impact and subtree impact]

For example, in the picture above step 16 accounted for 15.52% of the time, while the whole subtree under the node (including the node itself) accounted for 39.6% of the time. Expanding the node, the color goes back to a lighter shade of orange.

I think one of the main advantages of this approach will be the ability to consume large execution plans quickly, collapsing large parts of them and focusing on just what matters, how those parts of the plan interact with each other, etc.

As usual feedback, corrections and ideas are MORE than welcome! 🙂

NB: I rewrote the SQL that is under the tree representation to make it more readable, but just to make sure I didn’t break the functionality, as of now there are two plan trees per plan (and per Top N exec); the one “with subtree” implements what was just described above. The old representation will go away down the road, when I’m sure the new SQL is solid enough.



Little help in creating SQL Plan Baselines

One of the main design goals behind SQLd360 is to have no installation nor “evidence” left in the database, i.e. there is no SQLd360 repository in the database while there is a SQLTXPLAIN one (this isn’t necessarily bad, it’s just a different approach).

As a consequence, several little things SQLT provided are gone with SQLd360. For example, a few years ago (it’s been disabled by default for a while) SQLT generated a script to create a “custom” SQL Profile for the best performing plan it identified; something similar happened for SQL Plan Baselines, stored in a SQL Tuning Set for quick implementation of baselines down the road.

There is a new standalone script shipped with SQLd360, sql/sqld360_create_sql_baseline.sql, that aims at providing the same SPM+STS functionality SQLT provided, following the idea of “no evidence left by default”.
The script is NOT executed as part of SQLd360; it can (and needs to) be manually executed at any time, which in turn means there is no need to run SQLd360 for this script to work.
The script requires three parameters:

  • SQL ID for the SQL of interest
  • Oracle Pack available in this database ([N]one, [D]iagnostic or [T]uning)
  • Plan Hash Value you wish to “freeze”

You can provide the parameters directly inline to SQL*Plus (assuming you know the PHV you want to enforce) or just enter them one by one when prompted; the script will list all the PHVs it can find around (memory + history) with their performance, helping you in case you don’t recall which PHV you want.

The script does just two things:

  1. It creates a SQL Tuning Set (named s_<<SQLID>>_<<PHV>>) and loads the plan details into it; the goal is to “freeze” the plan in time so the info doesn’t get lost if the plan is aged out of memory, purged from AWR, etc.
  2. It provides copy&paste instructions to create a baseline based on the plan in the STS, either in the current system or in a remote one. The script DOES NOT execute such steps, it only prints them on screen. This way you can read, understand, digest and validate them before YOU execute them.
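For reference, the steps the script prints boil down to the standard STS + SPM APIs. A hedged sketch (this is NOT the script’s exact output; SQL ID abcd1234efgh567 and PHV 999999999 are made up):

```sql
DECLARE
  cur DBMS_SQLTUNE.SQLSET_CURSOR;
  n   PLS_INTEGER;
BEGIN
  -- 1. create the STS and load the desired plan from the cursor cache
  DBMS_SQLTUNE.CREATE_SQLSET(sqlset_name => 's_abcd1234efgh567_999999999');
  OPEN cur FOR
    SELECT VALUE(p)
      FROM TABLE(DBMS_SQLTUNE.SELECT_CURSOR_CACHE(
             basic_filter => 'sql_id = ''abcd1234efgh567'' and plan_hash_value = 999999999')) p;
  DBMS_SQLTUNE.LOAD_SQLSET(sqlset_name     => 's_abcd1234efgh567_999999999',
                           populate_cursor => cur);
  -- 2. (possibly on another system) create the baseline from the STS
  n := DBMS_SPM.LOAD_PLANS_FROM_SQLSET(sqlset_name => 's_abcd1234efgh567_999999999');
  DBMS_OUTPUT.PUT_LINE('plans loaded: '||n);
END;
/
```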

Hopefully it will make it a little bit easier to play with SPM.

Feedback, corrections and recommendations are welcome as usual!



Presentations on Slideshare

Every once in a while I get asked if I can email the PPT for a session that I delivered. I always say YES (of course) so I figured why not be proactive and upload the material for the presentations I delivered over the last several months. Under the “Pages” section on the right side of the page there is a new link “Presentations” that takes you to Slideshare.

It’s my first experience with Slideshare and I’m pretty sure I made mistakes along the way, so if you see something wrong just let me know (and let me know how to fix it PLEASE 🙂 )

The list of presentations is probably incomplete, so if you attended one and see that I forgot to upload it just let me know and I’ll fix that. Also, every session comes with trace files / dumps / testcases built to support the investigations, but I found no easy way to upload them so I’ll still rely on the old “provided upon request, via email” approach for them.




Truncated CTAS text and SQL Plan Baselines

This is probably not earth-shattering (not that I ever blog earth-shattering things anyway) for many, but it does answer a question I got today: “Do you think a truncated SQL text for CTAS affects SPM’s ability to give me the desired plan?”.

The SQL text for a CTAS is truncated as a result of bug 17982832 (sister bugs 18705302 and 20308798 affect 10046 and AWR respectively), but does this affect SPM? Can SPM match on the truncated text? Or maybe can SPM see the whole text and match on the original SQL? Those are the questions I wanted to answer.

As usual a test is worth a thousand expert opinions so here it goes:

SQL> create table test_shane as select * from dba_objects;
SQL> select sql_id, exact_matching_signature, sql_text from v$sql where sql_id = '30ywyzwvy6cqy';
SQL_ID		  EXACT_MATCHING_SIGNATURE SQL_TEXT
------------- ---------------------------- ------------------------
30ywyzwvy6cqy	       6988945084141899327 create table test_sh

The SQL text is truncated in the above output and the signature (exact, since baselines don’t do force_matching) is 6988945084141899327. Let’s create a baseline now:

SQL> var n1 number
SQL> exec :n1 := dbms_spm.load_plans_from_cursor_cache('30ywyzwvy6cqy');
SQL> print :n1
	N1
----------
	 1
SQL> select signature, sql_text from dba_sql_plan_baselines;
		SIGNATURE SQL_TEXT
------------------------- ---------------------------------------
     15291506816473784520 create table test_sh

The baseline has been created with signature 15291506816473784520, which is different from the one in V$SQL, so it sounds like the baseline used the wrong text to create the signature (or V$SQL was wrong and SPM was smarter, which is unlikely). We can verify it using the DBMS_SQLTUNE API to compute the signature of both SQL texts, original and truncated.

SQL> select dbms_sqltune.sqltext_to_signature('create table test_sh') signature from dual;
			       SIGNATURE
----------------------------------------
		    15291506816473784520
SQL> select dbms_sqltune.sqltext_to_signature('create table test_shane as select * from dba_objects') signature from dual;
			       SIGNATURE
----------------------------------------
		     6988945084141899327

So V$SQL was right while SPM kind of used the truncated text (I’m saying “kind of” because, tracing the call to DBMS_SPM, it seems a different way is used to pull up the signature) to compute and store the incorrect signature. As a consequence the baseline won’t be used, because the correct signature won’t find any match in the SMB.
From a 10053 trace of a hard parse:

SPM: statement not found in SMB

Testing it in 12.2, where the SQL text is intact, the baseline is created using the proper signature:

	      SIGNATURE SQL_TEXT
----------------------- ------------------------------------------------------
    6988945084141899327 create table test_shane as select * from dba_objects

Note
-----
   - SQL plan baseline SQL_PLAN_61zdshtmgv3jzd9546056 used for this statement

… and the baseline is correctly used 🙂



Something new about SQL Plan Directives and 12.2

SQL Plan Directives (SPD) remind me of bind peeking about 10 years ago: a nice feature on paper that worked fine most of the time but caused some major headaches on those occasions when it didn’t. Luckily for bind peeking, social media wasn’t that popular 10 years ago, so it took a while for it to “make a name” as a troublemaking feature; nowadays a couple of blog posts and many re-tweets are enough to get pretty popular, poor SPD!
DISCLAIMER: I happen to like both features a lot and think positively of them, I just admit the “transparent” behavior was a little too ambitious 😦

Just like for bind peeking (Adaptive Cursor Sharing), Oracle introduced some changes in 12.2 to mitigate the effects of SQL Plan Directives and Adaptive Dynamic Sampling (DS). Parameter OPTIMIZER_ADAPTIVE_FEATURES has been split in two, basically turning SPD off via parameter OPTIMIZER_ADAPTIVE_STATISTICS, as Franck Pachot already explained here.

The goal of this post is to look into what happens when we manually turn on OPTIMIZER_ADAPTIVE_STATISTICS. The focus will be on SQL Plan Directives alone, instead of throwing DBMS_STATS into the mix too (especially since the fix for bug 21171382 disables automatic column group creation based on info from SPD).
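The flip itself is just a parameter change; a minimal sketch (in 12.2 OPTIMIZER_ADAPTIVE_STATISTICS defaults to FALSE while OPTIMIZER_ADAPTIVE_PLANS stays TRUE):

```sql
-- Re-enable the SPD-driven adaptive statistics behavior for testing
ALTER SESSION SET optimizer_adaptive_statistics = TRUE;
-- or database-wide
ALTER SYSTEM SET optimizer_adaptive_statistics = TRUE SCOPE=BOTH;
```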

Let’s create a table with strong correlation across columns and see how 12.2 SPD behaves differently from 12.1.

drop table tab1 purge;
create table tab1 (n1 number, n2 number, n3 number);
insert into tab1 select mod(rownum, 100), mod(rownum, 100), mod(rownum, 100) from dual connect by rownum <= 100000;
commit;
exec dbms_stats.gather_table_stats(user,'TAB1');

Table TAB1 has 100k rows with three numeric columns, all storing the same exact values, and I’ll use the following SQL, referencing all three columns, to lead the CBO into a trap (caused by the lack of a column group on the three columns).

select count(*) from tab1 where n1 = 1 and n2 = 1 and n3 = 1;

Let’s run the SQL a couple of times, flushing SPD and the shared pool after each execution, just so that each change is persisted immediately and we can see the impact.
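The flushing between runs can be done along these lines (a sketch; DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE forces the directives to disk immediately instead of waiting for the automatic background write):

```sql
-- persist SPD changes to the SYSAUX repository right away
EXEC DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE;
-- and start the next run from a cold cursor cache
ALTER SYSTEM FLUSH SHARED_POOL;
```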

Run the SQL the first time and from DBA_SQL_PLAN_DIRECTIVES we have:

TYPE                    STATE      REASON                               NOTES 
----------------------- ---------- ------------------------------------ -------------------------------------------------------
DYNAMIC_SAMPLING        USABLE     SINGLE TABLE CARDINALITY MISESTIMATE <spd_note> 
                                                                         <internal_state>NEW</internal_state> 
                                                                         <redundant>NO</redundant> 
                                                                         <spd_text>{EC(MPAGANO.TAB1)[N1, N2, N3]}</spd_text> 
                                                                        </spd_note>

The same result from both 12.1 and 12.2; nothing too outstanding happens behind the curtains either, besides the recursive SQLs related to SPD now having a nice “/* QOSD */” comment in them so they are easier to spot.

Run the SQL again: the CBO recognizes it doesn’t have sufficient info to make a good decision, thus ADS is triggered and the SPD <internal_state> is updated to MISSING_STATS, pretty much the same behavior in both 12.1 and 12.2 except that in 12.2 there is no RESULT_CACHE hint in the DS SQL.
This is where things get interesting: in 12.2 there is one additional directive created to store the result of DS:

TYPE                    STATE      REASON                               NOTES
----------------------- ---------- ------------------------------------ ----------------------------------------------------------------------------------------------------
DYNAMIC_SAMPLING_RESULT USABLE     VERIFY CARDINALITY ESTIMATE          <spd_note>
                                                                         <internal_state>NEW</internal_state>
                                                                         <redundant>NO</redundant>
                                                                         <spd_text>{(MPAGANO.TAB1, num_rows=100000) - (SQL_ID:7pjdz422db2xk, T.CARD=1000[-2 -2])}</spd_text>
                                                                        </spd_note>

which allows the result not to be flushed out and to survive a restart of the database, thus reducing the amount of DS SQL executed.
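For reference, the directive listings shown above can be pulled with a query along these lines (a sketch, assuming the demo runs as user MPAGANO):

```sql
-- directives attached to TAB1, including the 12.2 DYNAMIC_SAMPLING_RESULT ones
SELECT d.type, d.state, d.reason, d.notes
  FROM dba_sql_plan_directives  d,
       dba_sql_plan_dir_objects o
 WHERE d.directive_id = o.directive_id
   AND o.owner        = 'MPAGANO'
   AND o.object_name  = 'TAB1';
```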

The spd_text provides info about the original number of rows (100k) and the number of rows returned by DS (1k), and it includes a SQL ID (7pjdz422db2xk) that sounds like the DS one, even though it’s just a slight variation of it; let’s look into it.
The recursive DS SQL executed is 0nkvqpzha4x88:

PARSING IN CURSOR #140370085840592 len=301 dep=1 uid=106 oct=3 lid=106 tim=588259148047 hv=3768743176 ad='73b99738' sqlid='0nkvqpzha4x88'
SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring optimizer_features_enable(default) no_parallel */ NVL(SUM(C1),0) FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "TAB1") */ 1 AS C1 FROM "TAB1" "TAB1" WHERE ("TAB1"."N1"=1) AND ("TAB1"."N2"=1) AND ("TAB1"."N3"=1)) innerQuery

but the fix for 22760704 says “Generate ADS stmt cache key without hints to make it determinist”, which makes it safe to guess that the SQL ID stored is a stripped-down version of the original SQL text, with the goal of increasing the chance of matching it back when retrieving info from the SPD DS result.

It seems the CBO tries to find the DS result using the SQL text of the DS SQL that it would like to execute (SQL ID 0nkvqpzha4x88):

SPD: qosdGetFObjKeyFromStmt: sqlText = SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring optimizer_features_enable(default) no_parallel  */ NVL(SUM(C1),0) FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "TAB1")  */ 1 AS C1 FROM "TAB1" "TAB1" WHERE ("TAB1"."N1"=1) AND ("TAB1"."N2"=1) AND ("TAB1"."N3"=1)) innerQuery (objid = 8846687189427653554)
>> Single Tab Card adjusted from 0.100000 to 1000.000000 due to adaptive dynamic sampling

but internally it modifies the SQL to search for such a stripped-down version, without the hints in the external query block; this is visible via trace[ADS]:

kkoadsComputeSqlid: sql_id=8846687189427653554: newText=SELECT /* DS_SVC */  NVL(SUM(C1),0) FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "TAB1")  */ 1 AS C1 FROM "TAB1" "TAB1" WHERE ("TAB1"."N1"=1) AND ("TAB1"."N2"=1) AND ("TAB1"."N3"=1)) innerQuery, startPos = 20, stopPos = 120, newTextLen = 203, sqlLen = 302

and this is the SQL ID 7pjdz422db2xk stored in the SPD.

It’s too early to know whether these changes will address the existing concerns, especially since many sites will run with this feature disabled (by default), but I really like the way the “new” SPD seems to work!!

In summary, the most notable changes in 12.2 seem to be:

  • SPD still triggers ADS, but the result is stored inside the SPD repository itself and persisted
  • Searches in the SPD repository for DS results are made using a key that represents (a cleaned version of) the DS SQL that generated the result in the first place

As usual, feedback / corrections / additions are very much welcome!