Mauro Pagano's Blog



SQLTXPLAIN vs SQLd360, differences and similarities

When talking about tools to help with SQL tuning, the question of SQLTXPLAIN vs SQLd360 comes up very often. “What’s the difference?”, “Which one should I use?”, “Why another tool?” are probably the top ones 🙂

I’ll try to make a fair and (as much as possible) unbiased comparison between the two on some general aspects. If you want to know about any specific area, just let me know in the comments.

Aspect | SQLTXPLAIN | SQLd360
Installation | Requires installation. Creates two schemas that serve as metadata and code repository; they need to be present every time the tool is executed | No installation; at the end of the execution no evidence is left behind
Source code | Mainly PL/SQL, unwrapped (clear text). Code is Oracle property | SQL scripts, clear text, true open source released under GPL. Code available on GitHub
Main Goal | Automate every aspect of diagnostic collection for SQL tuning | Same as SQLT, but less “obsessive” 🙂
Main Focus | Execution plan generation: why the CBO generated plan X, Y, Z (includes custom testcase, very flexible) | Execution plan generation (a little less than SQLT) and how plans execute at runtime
Main advantage | Includes every bit of info needed to drill into CBO decisions, so it is unlikely you will need to go back and collect anything else | Makes diagnostic consumption faster and easier because of data visualization; lots of focus on runtime
Main disadvantage | Requires installation. Not much focus on execution time (“same plan, different performance” cases are harder to deal with) | Might miss some information under rare circumstances

Outside the comparison table: what do we use today? SQLd360.
Reason(s):

  1. No installation.
  2. Root Cause Analysis of CBO decisions usually belongs to Oracle Support (for the most part); SQL tuning in the real world focuses on easier aspects of plan generation, so there is less need for a strong focus on it.
  3. The performance data available in Oracle grows with every release, and charting allows quick and much more effective consumption of large amounts of such data.



Simple script to remove system-generated column groups

There seems to be a lot of interest (at least on Twitter and at OUG conferences) in Oracle’s recommendation to install a couple of patches on top of 12.1.0.2, in order to emulate 12.2 behavior when it comes to SQL Plan Directives (details here, need MOS account).
One of the things SQL Plan Directives do is trigger column groups (CG) creation.
Column groups are virtual columns representing a hash (SYS_OP_COMBINED_HASH) of the multiple table columns they are defined on (that’s why only equality conditions can be satisfied by CGs) and they have an ugly, long, system-generated name. According to the DECODE in ALL_STAT_EXTENSIONS, user-generated CGs get a SYS_STU prefix in the name while system-generated ones get SYS_STS. As far as I could tell, only SPD-triggered CGs are named SYS_STS% (corrections are very welcome here); even those created as a consequence of using DBMS_STATS.SEED_COL_USAGE have a SYS_STU% name.
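As a quick illustration of the naming convention (a sketch; table T1 and columns C1, C2 are made-up names here), a manually created CG can be inspected right after creation:

```sql
-- Hypothetical example: T1, C1 and C2 are placeholder names.
-- Manually create a column group on (C1, C2)...
SELECT DBMS_STATS.CREATE_EXTENDED_STATS(user, 'T1', '(C1,C2)') FROM dual;

-- ...and check the generated name: being user-created, it should
-- show up with a SYS_STU% prefix rather than SYS_STS%
SELECT extension_name, extension
  FROM user_stat_extensions
 WHERE table_name = 'T1';
```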

The other day somebody asked how to remove those CGs in case one wanted to “start fresh” after applying the mentioned 12.1.0.2 patches. I wrote a little script that is by no means intended to be exhaustive (or fully tested) but is a good way to get started removing SPD-triggered CGs, so I figured I would share it. The script starts from the assumption that only SPD-triggered CGs have a SYS_STS% name.
The script does NOT remove the CGs by itself; it just creates another script that includes the DROP commands, so that you can read, digest and only then execute it manually. A second script is also created to put the CGs back in place (just the definition, no stats are gathered) in case some are indeed needed. A little side effect: since you put them back in place manually, the names become SYS_STU% and not SYS_STS%.

Code below

PRO
PRO usage @drop_extended_stats.sql connected as the user that owns the table
PRO and pass the table name when requested.
PRO
DEF current_table = '&&table_name.'
SET SERVEROUTPUT ON VERI OFF FEED OFF TIMING OFF
SPO drop_extended_stats_&&current_table._driver.sql
BEGIN
 FOR i IN (SELECT extension_name, extension
             FROM user_stat_extensions
            WHERE table_name = UPPER('&&current_table.')
              AND extension_name LIKE 'SYS_STS%' -- SYS_STS are system generated from the DECODE in ALL_STAT_EXTENSIONS
           ) LOOP

   DBMS_OUTPUT.PUT_LINE('-- Dropping extension on '||i.extension);
   DBMS_OUTPUT.PUT_LINE('exec DBMS_STATS.DROP_EXTENDED_STATS(user, ''&&current_table.'', '''||i.extension||''');');

 END LOOP;

END;
/
SPO OFF
SPO create_extended_stats_&&current_table._driver.sql
BEGIN
 FOR i IN (SELECT extension_name, extension
             FROM user_stat_extensions
            WHERE table_name = UPPER('&&current_table.')
              AND extension_name LIKE 'SYS_STS%' -- SYS_STS are system generated from the DECODE in ALL_STAT_EXTENSIONS
           ) LOOP

   DBMS_OUTPUT.PUT_LINE('-- Creating extension on '||i.extension);
   DBMS_OUTPUT.PUT_LINE('SELECT DBMS_STATS.CREATE_EXTENDED_STATS(user, ''&&current_table.'', '''||i.extension||''') FROM dual;');

 END LOOP;

END;
/
SPO OFF
SET SERVEROUTPUT OFF VERI ON FEED ON TIMING ON



SQL Monitoring, Flamegraph and Execution Plan Temperature 2.0

Two of the things that I like the most about SQL Monitoring reports are the ability to quickly spot where in the execution plan the time is spent (Activity% column, thank you ASH) and the fact you can collapse part of the plan. Too bad the two don’t “work” together, meaning that if you collapse a part of the plan the Activity% is not rolled up at the collapsed level. I understand why it works that way (it might be confusing otherwise) but I’d still like to be able to collapse a node and get a “subtree Activity%” so I know whether that subtree is something I should be worried about or not (kind of…).

Pause that for a second; Flamegraph, on the other hand, is really good at showing how much (in %) a “subtree” (function + its callees) accounts for overall. Luca Canali did an (as usual) awesome job here with something similar, but a bit more sophisticated than just execution plans. Unfortunately, the fact that the elements are not sorted on time (I think most “Oracle people” tend to expect time on the X-axis), combined with the kind-of-random color used for each bar, makes this representation a little less intuitive than I’d like it to be for execution plans (don’t get me wrong, I still love this representation a lot!).

SQLd360 has represented execution plans as a tree since one of the first releases, and for over a year the color of the nodes has represented a heatmap of where time has been spent (same as Activity% in SQL Monitoring). Starting a couple of releases ago there are a couple of improvements that I hope (I’d like to hear your feedback on this) fix the shortcomings of both SQL Monitoring and Flamegraph when it comes to digesting execution plans.
Each step in the execution plan can be collapsed (this has been there since day 1) and now its color will change to represent the color the node would have had if all the time the SQL spent on the node itself plus its child steps had been spent on the node itself; on expand the colors go back to the original ones.
Also, when hovering over a node a new piece of info shows up in the popup, a “subtree impact” that represents exactly what % of time the node plus its children account for.

[node.png: plan tree node popup showing impact and subtree impact]

For example, in the picture above step 16 accounted for 15.52% of the time, while the whole subtree under the node (including the node itself) accounted for 39.6% of the time. Expanding the node, the color goes back to a lighter shade of orange.

I think one of the main advantages of this approach will be the ability to consume large execution plans quickly, collapsing large parts of them and focusing on just what matters, how those parts of the plan interact with each other, etc.

As usual feedback, corrections and ideas are MORE than welcome! 🙂

NB: I rewrote the SQL that is under the tree representation to make it more readable, but just to make sure I didn’t break the functionality, as of now there are two plan trees per plan (and per Top N exec); the one “with subtree” implements what was just described above. The old representation will go away down the road, when I’m sure the new SQL is solid enough.



Little help in creating SQL Plan Baselines

One of the main design goals behind SQLd360 is to have no installation nor “evidence” left in the database, i.e. there is no SQLd360 repository in the database while there is a SQLTXPLAIN one (this isn’t necessarily bad, it’s just a different approach).

As a consequence, several little things SQLT provided are gone with SQLd360. For example, a few years ago (it’s been disabled by default for a while) SQLT generated a script to create a “custom” SQL Profile for the best performing plan it identified; something similar happened for SQL Plan Baselines, stored in a SQL Tuning Set for quick implementation of baselines down the road.

There is a new standalone script shipped with SQLd360, sql/sqld360_create_sql_baseline.sql, that aims at providing the same SPM+STS functionality SQLT provided, following the idea of “no evidence left by default”.
The script is NOT executed as part of SQLd360; it can (and needs to) be manually executed at any time, which in turn means there is no need to run SQLd360 for this script to work.
The script requires three parameters:

  • SQL ID for the SQL of interest
  • Oracle Pack available in this database ([N]one, [D]iagnostic or [T]uning)
  • Plan Hash Value you wish to “freeze”

You can provide the parameters directly inline to SQL*Plus (assuming you know the PHV you want to enforce) or just enter them one by one when prompted; the script will list all the PHVs it can find around (memory + history) with their performance, helping you in case you don’t recall which PHV you want.

The script does just two things:

  1. It creates a SQL Tuning Set (named s_<<SQLID>>_<<PHV>>) and loads the plan details into it; the goal is to “freeze” the plan in time so the info doesn’t get lost if the plan is aged out of memory, purged from AWR, etc.
  2. It provides copy&paste instructions to create a baseline based on the plan in the STS, either in the current system or in a remote one. The script DOES NOT execute such steps, it only prints them on screen. This way you can read, understand, digest and validate them before YOU execute them.
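For reference, the steps the script prints boil down to the standard STS + SPM APIs. A hedged sketch (this is NOT the script’s exact output; SQL ID abcd1234efgh567 and PHV 999999999 are made up):

```sql
DECLARE
  cur DBMS_SQLTUNE.SQLSET_CURSOR;
  n   PLS_INTEGER;
BEGIN
  -- 1. create the STS and load the desired plan from the cursor cache
  DBMS_SQLTUNE.CREATE_SQLSET(sqlset_name => 's_abcd1234efgh567_999999999');
  OPEN cur FOR
    SELECT VALUE(p)
      FROM TABLE(DBMS_SQLTUNE.SELECT_CURSOR_CACHE(
             basic_filter => 'sql_id = ''abcd1234efgh567'' and plan_hash_value = 999999999')) p;
  DBMS_SQLTUNE.LOAD_SQLSET(sqlset_name     => 's_abcd1234efgh567_999999999',
                           populate_cursor => cur);
  -- 2. (possibly on another system) create the baseline from the STS
  n := DBMS_SPM.LOAD_PLANS_FROM_SQLSET(sqlset_name => 's_abcd1234efgh567_999999999');
  DBMS_OUTPUT.PUT_LINE('plans loaded: '||n);
END;
/
```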

Hopefully it will make it a little bit easier to play with SPM.

Feedback, corrections and recommendations are welcome as usual!



Presentations on Slideshare

Every once in a while I get asked if I can email the PPT for a session that I delivered. I always say YES (of course) so I figured why not be proactive and upload the material for the presentations I delivered over the last several months. Under the “Pages” section on the right side of the page there is a new link “Presentations” that takes you to Slideshare.

It’s my first experience with Slideshare and I’m pretty sure I made mistakes along the way, so if you see something wrong just let me know (and let me know how to fix it PLEASE 🙂 )

The list of presentations is probably incomplete, so if you attended one and see that I forgot to upload it just let me know and I’ll fix that. Also, every session comes with trace files / dumps / testcases built to support the investigations, but I found no easy way to upload them so I’ll still rely on the old “provided upon request, via email” approach for them.




Truncated CTAS text and SQL Plan Baselines

This is probably not earth-shattering (not that I ever blog earth-shattering things anyway) for many, but it does answer a question I got today: “Do you think a truncated SQL text for CTAS affects SPM’s ability to give me the desired plan?”.

The SQL text for a CTAS is truncated as a result of bug 17982832 (sister bugs 18705302 and 20308798 affect 10046 and AWR respectively), but does this affect SPM? Can SPM match on the truncated text? Or maybe can SPM see the whole text and match on the original SQL? Those are the questions I wanted to answer.

As usual a test is worth a thousand expert opinions so here it goes:

SQL> create table test_shane as select * from dba_objects;
SQL> select sql_id, exact_matching_signature, sql_text from v$sql where sql_id = '30ywyzwvy6cqy';
SQL_ID		  EXACT_MATCHING_SIGNATURE SQL_TEXT
------------- ---------------------------- ------------------------
30ywyzwvy6cqy	       6988945084141899327 create table test_sh

The SQL text is truncated in the above output and the signature (exact, since baselines don’t do force_matching) is 6988945084141899327. Let’s create a baseline now:

SQL> var n1 number
SQL> exec :n1 := dbms_spm.load_plans_from_cursor_cache('30ywyzwvy6cqy');
SQL> print :n1
	N1
----------
	 1
SQL> select signature, sql_text from dba_sql_plan_baselines;
		SIGNATURE SQL_TEXT
------------------------- ---------------------------------------
     15291506816473784520 create table test_sh

The baseline has been created with signature 15291506816473784520, which is different from the one in V$SQL, so it sounds like the baseline used the wrong text to create the signature (or V$SQL was wrong and SPM was smarter, which is unlikely). We can verify it using the DBMS_SQLTUNE API to compute the signature of both SQL texts, original and truncated.

SQL> select dbms_sqltune.sqltext_to_signature('create table test_sh') signature from dual;
			       SIGNATURE
----------------------------------------
		    15291506816473784520
SQL> select dbms_sqltune.sqltext_to_signature('create table test_shane as select * from dba_objects') signature from dual;
			       SIGNATURE
----------------------------------------
		     6988945084141899327

So V$SQL was right while SPM kind of used the truncated text (I’m saying “kind of” because, tracing the call to DBMS_SPM, it seems a different way is used to pull up the signature) to compute and store the incorrect signature. As a consequence the baseline won’t be used, because the correct signature won’t find any match in the SMB.
From a 10053 trace of a hard parse:

SPM: statement not found in SMB

Testing it in 12.2, where the SQL text is intact, the baseline is created using the proper signature:

	      SIGNATURE SQL_TEXT
----------------------- ------------------------------------------------------
    6988945084141899327 create table test_shane as select * from dba_objects

Note
-----
   - SQL plan baseline SQL_PLAN_61zdshtmgv3jzd9546056 used for this statement

… and the baseline is correctly used 🙂



Something new about SQL Plan Directives and 12.2

SQL Plan Directives (SPD) remind me of bind peeking about 10 years ago: a nice feature on paper that worked fine most of the time but caused some major headaches on those occasions when it didn’t. Luckily for bind peeking, social media wasn’t that popular 10 years ago, so it took a while for it to “make a name” as a troublemaking feature; nowadays a couple of blog posts and many re-tweets are enough to get pretty popular, poor SPD!
DISCLAIMER: I happen to like both features a lot and think positively of them, I just admit the “transparent” behavior was a little too ambitious 😦

Just like for bind peeking (Adaptive Cursor Sharing), Oracle introduced some changes in 12.2 to mitigate the effects of SQL Plan Directives and Adaptive Dynamic Sampling (DS). Parameter OPTIMIZER_ADAPTIVE_FEATURES has been split in two, basically turning SPD off via parameter OPTIMIZER_ADAPTIVE_STATISTICS, as Franck Pachot already explained here.

The goal of this post is to look into what happens when we manually turn on OPTIMIZER_ADAPTIVE_STATISTICS. The focus will be on SQL Plan Directives alone, instead of throwing DBMS_STATS into the mix too (especially since the fix for bug 21171382 disables automatic column group creation based on info from SPD).
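The flip itself is just a parameter change; a minimal sketch (in 12.2 OPTIMIZER_ADAPTIVE_STATISTICS defaults to FALSE while OPTIMIZER_ADAPTIVE_PLANS stays TRUE):

```sql
-- Re-enable the SPD-driven adaptive statistics behavior for testing
ALTER SESSION SET optimizer_adaptive_statistics = TRUE;
-- or database-wide
ALTER SYSTEM SET optimizer_adaptive_statistics = TRUE SCOPE=BOTH;
```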

Let’s create a table with strong correlation across columns and see how 12.2 SPD behaves differently from 12.1.

drop table tab1 purge;
create table tab1 (n1 number, n2 number, n3 number);
insert into tab1 select mod(rownum, 100), mod(rownum, 100), mod(rownum, 100) from dual connect by rownum <= 100000;
commit;
exec dbms_stats.gather_table_stats(user,'TAB1');

Table TAB1 has 100k rows with three numeric columns, all storing the same exact values, and I’ll use the following SQL, referencing all three columns, to lead the CBO into a trap (caused by the lack of a column group on the three columns).

select count(*) from tab1 where n1 = 1 and n2 = 1 and n3 = 1;

Let’s run the SQL a couple of times, flushing SPD and the shared pool after each execution, just so that each change is persisted immediately and we can see the impact.
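The flushing between runs can be done along these lines (a sketch; DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE forces the directives to disk immediately instead of waiting for the automatic background write):

```sql
-- persist SPD changes to the SYSAUX repository right away
EXEC DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE;
-- and start the next run from a cold cursor cache
ALTER SYSTEM FLUSH SHARED_POOL;
```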

Run the SQL the first time and from DBA_SQL_PLAN_DIRECTIVES we have:

TYPE                    STATE      REASON                               NOTES 
----------------------- ---------- ------------------------------------ -------------------------------------------------------
DYNAMIC_SAMPLING        USABLE     SINGLE TABLE CARDINALITY MISESTIMATE <spd_note> 
                                                                         <internal_state>NEW</internal_state> 
                                                                         <redundant>NO</redundant> 
                                                                         <spd_text>{EC(MPAGANO.TAB1)[N1, N2, N3]}</spd_text> 
                                                                        </spd_note>

The same result from both 12.1 and 12.2; nothing too outstanding happens behind the curtains either, besides the recursive SQLs related to SPD now having a nice “/* QOSD */” comment in them so they are easier to spot.

Run the SQL again: the CBO recognizes it doesn’t have sufficient info to make a good decision, thus ADS is triggered and the SPD <internal_state> is updated to MISSING_STATS, pretty much the same behavior in both 12.1 and 12.2 except that in 12.2 there is no RESULT_CACHE hint in the DS SQL.
This is where things get interesting: in 12.2 there is one additional directive created to store the result of DS:

TYPE                    STATE      REASON                               NOTES
----------------------- ---------- ------------------------------------ ----------------------------------------------------------------------------------------------------
DYNAMIC_SAMPLING_RESULT USABLE     VERIFY CARDINALITY ESTIMATE          <spd_note>
                                                                         <internal_state>NEW</internal_state>
                                                                         <redundant>NO</redundant>
                                                                         <spd_text>{(MPAGANO.TAB1, num_rows=100000) - (SQL_ID:7pjdz422db2xk, T.CARD=1000[-2 -2])}</spd_text>
                                                                        </spd_note>

which allows the result not to be flushed out and to survive a restart of the database, thus reducing the amount of DS SQL executed.
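For reference, the directive listings shown above can be pulled with a query along these lines (a sketch, assuming the demo runs as user MPAGANO):

```sql
-- directives attached to TAB1, including the 12.2 DYNAMIC_SAMPLING_RESULT ones
SELECT d.type, d.state, d.reason, d.notes
  FROM dba_sql_plan_directives  d,
       dba_sql_plan_dir_objects o
 WHERE d.directive_id = o.directive_id
   AND o.owner        = 'MPAGANO'
   AND o.object_name  = 'TAB1';
```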

The spd_text provides info about the original number of rows (100k) and the number of rows returned by DS (1k), and it includes a SQL ID (7pjdz422db2xk) that sounds like the DS one, even though it’s just a slight variation of it; let’s look into it.
The recursive DS SQL executed is 0nkvqpzha4x88:

PARSING IN CURSOR #140370085840592 len=301 dep=1 uid=106 oct=3 lid=106 tim=588259148047 hv=3768743176 ad='73b99738' sqlid='0nkvqpzha4x88'
SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring optimizer_features_enable(default) no_parallel */ NVL(SUM(C1),0) FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "TAB1") */ 1 AS C1 FROM "TAB1" "TAB1" WHERE ("TAB1"."N1"=1) AND ("TAB1"."N2"=1) AND ("TAB1"."N3"=1)) innerQuery

but the fix for 22760704 says “Generate ADS stmt cache key without hints to make it determinist”, which makes it safe to guess that the SQL ID stored is a stripped-down version of the original SQL text, with the goal of increasing the chance of matching it back when retrieving info from the SPD DS result.

It seems the CBO tries to find the DS result using the SQL text of the DS SQL that it would like to execute (SQL ID 0nkvqpzha4x88):

SPD: qosdGetFObjKeyFromStmt: sqlText = SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring optimizer_features_enable(default) no_parallel  */ NVL(SUM(C1),0) FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "TAB1")  */ 1 AS C1 FROM "TAB1" "TAB1" WHERE ("TAB1"."N1"=1) AND ("TAB1"."N2"=1) AND ("TAB1"."N3"=1)) innerQuery (objid = 8846687189427653554)
>> Single Tab Card adjusted from 0.100000 to 1000.000000 due to adaptive dynamic sampling

but internally it modifies the SQL to search for such a stripped-down version, without the hints in the external query block; this is visible via trace[ADS]:

kkoadsComputeSqlid: sql_id=8846687189427653554: newText=SELECT /* DS_SVC */  NVL(SUM(C1),0) FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "TAB1")  */ 1 AS C1 FROM "TAB1" "TAB1" WHERE ("TAB1"."N1"=1) AND ("TAB1"."N2"=1) AND ("TAB1"."N3"=1)) innerQuery, startPos = 20, stopPos = 120, newTextLen = 203, sqlLen = 302

and this is the SQL ID 7pjdz422db2xk stored in the SPD.

It’s too early to know whether these changes will address the existing concerns, especially since many sites will run with this feature disabled (by default), but I really like the way the “new” SPD seems to work!!

In summary, the most notable changes in 12.2 seem to be:

  • SPD still triggers ADS, but the result is stored inside the SPD repository itself and persisted
  • Searches in the SPD repository for DS results are made using a key that represents (a cleaned version of) the DS SQL that generated the result in the first place

As usual, feedback / corrections / additions are very much welcome!