Mauro Pagano's Blog



Introducing Pathfinder, is there a better plan for my SQL?

Pathfinder is a new free tool that provides an easy way to execute a SQL statement under multiple optimizer environments in order to generate different execution plans, potentially discovering better plans. The tool can also be used to quickly identify workarounds for wrong result bugs as well as slow parse issues.

Pathfinder uses the same brute-force approach as SQLT XPLORE, executing the SQL once for every single CBO parameter and fix_control present in the database, with no installation required. This makes Pathfinder easy to run in any environment, including a production one (assuming you understand *WELL* what the tool does, how it works and what it means for your database).

Each test adds approximately 1 second of overhead to the time the SQL takes to complete, and the number of tests considered is pretty high: around 1100 in 11.2.0.4 and around 1500 in 12.1.0.2. I therefore suggest using Pathfinder on SQLs that take at most a few seconds to run (or just be ready to leave Pathfinder running for a loooong time).
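To put that in perspective: a SQL that completes in 2 seconds, tested ~1100 times at roughly 3 seconds per test, adds up to almost an hour of Pathfinder runtime.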

The tool executes the SQL statement present in the provided file script.sql; just modify the script and replace the seeded SQL with the one you want to execute. In the same script you can also add ALTER SESSION commands that will be executed before the desired SQL, which is helpful in case you want to influence the analysis by providing a different starting point.
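As a sketch, a modified script.sql might look like this (the ALTER SESSION and the SQL below are made-up placeholders; the mandatory /* ^^pathfinder_testid */ marker is explained in step 2 below):

alter session set optimizer_features_enable = '11.2.0.4';

select /* ^^pathfinder_testid */ count(*)
  from dba_objects
 where owner = 'SYS';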

To execute the tool just download it from the Download section on the right side of this page (or from here; the tool will also be released as a standalone script in the same zip file as SQLd360) and follow these steps:

  1. Unzip pathfinder-master.zip, navigate to the root pathfinder directory, and connect as SYS to the database.
    $ unzip pathfinder-master.zip
    $ cd pathfinder-master
    $ sqlplus / as sysdba
  2. Open file script.sql and add your SQL in there. Make sure to add the mandatory comment /* ^^pathfinder_testid */. The file name must be script.sql; if you wish to change the name then just ping me.
  3. Execute pathfinder.sql and provide the connect string to connect as the user that is supposed to run the script.
  4. Unzip output file pathfinder_<dbname>_<date>.zip into a directory on your PC and review the results starting from file 00001_pathfinder_<dbname>_<date>_index.html

SQL> @pathfinder

Parameter 1:
Full connect string of the database to run the SQL into
If the database is remote or a PDB then you must include
the TNS alias i.e. scott/tiger@orcl

Enter value for 1: mpagano/mpagano@orcl
mpagano/mpagano@orcl
Building Pathfinder driver scripts
Connected.

1) "pathfinder_{ 20151026_1906 (00001)" 19:06:40 BASELINE

2) "pathfinder_{ 20151026_1906 (00002)" 19:06:42 "_add_stale_mv_to_dependency_list" = FALSE

.....
File pathfinder_orcl_20151023_1256.zip created.

For each test Pathfinder shows the setting considered as well as some basic statistics like Plan Hash Value, Elapsed Time, CPU Time, Buffer Gets and Rows processed. Two links are also present: one points to the details of the execution plan generated, the other to the details of V$SQL.
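For reference, this kind of information can be pulled from V$SQL with something along these lines (just a sketch leveraging the marker comment, not Pathfinder's actual query):

select plan_hash_value, elapsed_time, cpu_time, buffer_gets, rows_processed
  from v$sql
 where sql_text like '%^^pathfinder_testid%'
   and sql_text not like '%v$sql%';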

The main page will look something like this:

[screenshot: Pathfinder main index page]

Pathfinder also considers the effect of Cardinality Feedback, executing the SQL multiple times until the plan stops changing (or CFB gives up, after the fifth parse), for all those settings that lead to a first execution plan different from the baseline.
This is why for some settings you will see a Test# with an additional digit, the "reparse" number:

[screenshot: test list showing Test# values with reparse numbers]

For example, for Test# 117 above the CBO generated a different plan (PHV 1837274416) than the baseline (PHV 1838229974) and Cardinality Feedback kicked in 3 times, generating a different plan each time, until at the 3rd reparse the CBO ended up with the same plan as the first execution (and no different plan was generated after that).

This is the first release of the tool so I expect it to be far from perfect, but I've already used it several times with success. Hopefully with time (and your feedback :D) the tool will get better and better.

I hope you enjoy it and please don't hesitate to get in touch with me with feedback, suggestions and bug reports!!!



Scalar Subquery Caching and read consistency

Every time I discuss this topic with somebody I get the "what are you talking about?" look on his/her face, so this is just one of those "repetita iuvant" (repetition helps) posts to make sure we (including me first) don't get burned by it 😀

Each SQL executed from PL/SQL can run at a different SCN and thus "see" different data, depending on when the SQL is started. This is known, documented and easy to prove.

In session 1:
declare
 l_sal number;
begin
 select sal
   into l_sal
   from emp
  where empno = 7369;
 dbms_output.put_line('First salary: '||l_sal);
 dbms_lock.sleep(10);
 select sal
   into l_sal
   from emp
  where empno = 7369;
 dbms_output.put_line('Second salary: '||l_sal);
end;
/

In session 2 while the previous block is still running:
update emp set sal = 0 where empno = 7369;
commit;

Result will be
First salary: 800
Second salary: 0

The same applies when a SQL calls a PL/SQL program which in turn calls SQLs.
This is documented and easy to prove as well, plus a nice presentation from Bryn Llewellyn (PL/SQL Product Manager, I think) is available here. I'll skip the steps to keep the post short.

So using a SQL calling PL/SQL calling SQL is a big NO if the tables accessed by the recursive SQLs aren't static (read only?), or if you are not willing to get weird results like the following, where the salary for the same employee is different within the same resultset (and I've no idea who would be willing 😀 )

create or replace function get_salary(n1 number) return number is
l_sal number;
begin 
  dbms_lock.sleep(1);
  select sal
    into l_sal
    from emp
   where empno = n1;
  return l_sal;
end;
/
select empno, get_salary(empno) sal 
  from (select /*+ NO_MERGE */ * from emp where empno in (7369, 7499, 7521)) a, 
       (select /*+ NO_MERGE */ rownum from dual connect by rownum <5) b;

     EMPNO	  SAL
---------- ----------
      7369	  800
      7499	 1600
      7521	 1250
      7369	    0
      7499	    0
      7521	    0
      7369	    0
      7499	    0
      7521	    0
      7369	    0
      7499	    0
      7521	    0

Elapsed: 00:00:12.01

Scalar Subquery Caching is a mechanism to cache the result of a function call, thus potentially avoiding repeated calls to the function for the same input values and so improving performance. The mechanism is explained in Bryn's presentation, Tim Hall's video on YouTube and lots of other places, so no need to add much here 😉
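As a quick reminder of the technique, the function call just gets wrapped in a scalar subquery against DUAL:

-- plain call, executed once per row
select empno, get_salary(empno) sal from emp;

-- wrapped in a scalar subquery, eligible for caching,
-- so the function may run only once per distinct EMPNO
select empno, (select get_salary(empno) from dual) sal from emp;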

Would Scalar Subquery Caching make my recursive SQL read-consistent with its "parent" SQL?
To be honest I don't see why it should, but I thought about it up to the point of starting to make up weird hypotheses with no foundation, so better to cut it short and test!

In session 1:
select empno, (select get_salary(empno) from dual) sal 
  from (select /*+ NO_MERGE */ * from emp where empno in (7369, 7499, 7521)) a, 
       (select /*+ NO_MERGE */ rownum from dual connect by rownum <5) b;

In session 2 (need to be quick this time):
update emp set sal = 0 where empno in (7369, 7499, 7521); 
commit;

and the result in session 1:
     EMPNO	  SAL
---------- ----------
      7369	 1000
      7499	 1000
      7521	    0
      7369	 1000
      7499	 1000
      7521	    0
      7369	 1000
      7499	 1000
      7521	    0
      7369	 1000
      7499	 1000
      7521	    0

Elapsed: 00:00:03.00

The answer is “of course NO, what were you (me) thinking?!?”.
The previous output shows how scalar subquery caching indeed allowed get_salary to be called fewer times (once per distinct value of EMPNO, 3 in this case) and the elapsed time dropped from 12 to 3 seconds, but the value cached is just whatever the first execution of the function calculated for that EMPNO.



Little big change in SQLd360

Just a super-quick note on SQLd360: starting with version v1522 (released a couple of days ago), CSV and TEXT formats are disabled by default, so a report that looked like this before
[screenshot: report links including CSV and TEXT formats]
will now look like this
[screenshot: report links without CSV and TEXT formats]
Nothing to be alarmed about, it's expected!
In case you want the two formats back you can just edit file sql/sqld360_00_config.sql.

The main advantage of this change should be a significant reduction in the time it takes to collect the report (feedback welcome as usual) 😀
Also the number of rows per report is now computed differently, which should save some extra time!

PS: the same change is in eDB360 v1526



UNUSABLE indexes and TRUNCATE

This is one of those "I knew it works this way, why do I keep falling for it?" moments, so hopefully blogging about it will keep my memory fresh 😀

Starting with 11gR2, when you mark an index or an index partition UNUSABLE the segment is dropped (unless you are on 12c and use the ONLINE keyword); this is the intended behavior.

SQL> create table t_unusable (n1 number, n2 number);
SQL> create index t_unusable_idx on t_unusable(n1);
SQL> insert into t_unusable values (1,2);  -- this is just to materialize the segment
SQL> select index_name, segment_created from user_indexes where index_name = 'T_UNUSABLE_IDX';

INDEX_NAME		       SEG
------------------------------ ---
T_UNUSABLE_IDX		       YES

SQL> alter index t_unusable_idx unusable;
SQL> select index_name, segment_created from user_indexes where index_name = 'T_UNUSABLE_IDX';

INDEX_NAME		       SEG
------------------------------ ---
T_UNUSABLE_IDX		       NO

SQL> select object_id, data_object_id from user_objects where object_name = 'T_UNUSABLE_IDX';

 OBJECT_ID DATA_OBJECT_ID
---------- --------------
     70989

(*)

This is a nice way to speed up large data loads since it reduces the REDO generated, plus it's faster to rebuild an index afterwards than to maintain it during the load.
What I keep forgetting is that if you truncate the table (or the partition) the segment is recreated, hence your data load process will end up maintaining the index after all 😦

SQL> truncate table t_unusable;
SQL> select index_name, segment_created from user_indexes where index_name = 'T_UNUSABLE_IDX';

INDEX_NAME		       SEG
------------------------------ ---
T_UNUSABLE_IDX		       YES

SQL> select object_id, data_object_id from user_objects where object_name = 'T_UNUSABLE_IDX';

 OBJECT_ID DATA_OBJECT_ID
---------- --------------
     70989	    70992

----- begin tree dump
leaf: 0x1024b7b 16927611 (0: nrow: 0 rrow: 0)
----- end tree dump

Conclusion (aka note to self): if you plan to use TRUNCATE + mark indexes UNUSABLE remember to TRUNCATE first and mark UNUSABLE after 😀
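Something like this sketch, reusing the objects from the example above:

truncate table t_unusable;            -- truncate first (this recreates the index segment)
alter index t_unusable_idx unusable;  -- mark unusable after, so the segment is dropped again
insert /*+ append */ into t_unusable
select rownum, rownum from dual connect by rownum <= 100000;
commit;
alter index t_unusable_idx rebuild;   -- rebuild once the load is complete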

(*) there is no TREEDUMP here since the trace raised ORA-600 [25027]



Concurrent INSERT APPEND into the same table

The main reason I decided to have a blog was to share all the small things I learn every day while experimenting, but I have to admit I fell short of my initial intention; hopefully I'll do better in the coming months, starting with this super-quick post.

I always knew that concurrent (aka multiple sessions at the same time) INSERT /*+ APPEND */ into the same target table was not possible, even if the target table was partitioned. Here is an example of such a SQL:

create table test_target (n1 number, n2 number)
partition by range(n1)
(partition p1 values less than (5),
 partition p2 values less than (10));

insert /*+ append */ into test_target select mod(rownum,5), rownum from dual connect by rownum <= 10;

The reason the concurrency isn't feasible is that the first session acquires a TM enqueue in X mode (LMODE = 6) on the whole table (so every other session will have to wait), even if the inserted data only goes into a specific partition.
In the following extract SID 306 (my session) is holding the TM enqueue in X mode on object id 138542, which is the table, even though only partition P1 has been populated.

SQL> select * from v$lock where sid = 306;

       SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK     CON_ID
---------- -- ---------- ---------- ---------- ---------- ---------- ---------- ----------
       306 TX     131102      11102          6          0         29          0          0
       306 TM     138542          0          6          0         29          0          3
       306 AE        133          0          4          0         86          0          3

SQL> select object_id, object_name, object_type from user_objects where object_id = 138542;

 OBJECT_ID OBJECT_NAME     OBJECT_TYPE
---------- --------------- -----------------------
    138542 TEST_TARGET     TABLE

If in another session (SID 186) I try to run an INSERT /*+ APPEND */ to populate only P2, the execution will sit and wait for SID 306 to release the enqueue on object id 138542, as shown in the first row from V$LOCK where SID 186 requests a TM enqueue in X mode (REQUEST = 6) for object id 138542:

SQL> insert /*+ append */ into test_target select mod(rownum,5)+5, rownum from dual connect by rownum <= 10;

SQL> select * from v$lock where sid in (306,186) order by sid;

       SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK     CON_ID
---------- -- ---------- ---------- ---------- ---------- ---------- ---------- ----------
       186 TM     138542          0          0          6         94          0          3
       186 AE        133          0          4          0        104          0          3
       306 AE        133          0          4          0        640          0          3
       306 TM     138542          0          6          0        583          1          3
       306 TX     131102      11102          6          0        583          0          0 

Nothing new so far, it confirms what I expected.
What I didn't know is that if "extended partition syntax" is used to specify the target partition (and if you specify the wrong one you get a correct ORA-14401), then the X lock is taken on the partition and not on the table (the table is only locked in SX mode, LMODE = 3).
Here is the example:

SID 306
insert /*+ append */ into test_target partition (p1) select mod(rownum,5), rownum from dual connect by rownum <= 10;

SID 186
insert /*+ append */ into test_target partition (p2) select mod(rownum,5)+5, rownum from dual connect by rownum <= 10;

SQL> select * from v$lock where sid in (306,186) and type = 'TM' order by sid;

       SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK     CON_ID
---------- -- ---------- ---------- ---------- ---------- ---------- ---------- ----------
       186 TM     138542          0          3          0         11          0          3
       186 TM     138544          0          6          0         11          0          3
       306 TM     138543          0          6          0         35          0          3
       306 TM     138542          0          3          0         35          0          3

SQL> select object_id, object_name, object_type from user_objects where object_id in (138542, 138543, 138544);

 OBJECT_ID OBJECT_NAME    OBJECT_TYPE
---------- -------------- -----------------------
    138544 TEST_TARGET    TABLE PARTITION
    138543 TEST_TARGET    TABLE PARTITION
    138542 TEST_TARGET    TABLE

So in case you know in advance where the data is going (which is generally true, especially for range/interval partitioning), you can have multiple concurrent loads into the same target table.
I'm not recommending it, just saying it's possible 😀



How to quickly identify if a SQL running in parallel has been downgraded using SQLd360

This is just a quick note on how to leverage a couple of reports in order to identify if a PX execution has been downgraded. Those reports have been around for quite some time, so this isn't about any new feature but rather about how to use what's already there 😉

An execution is downgraded when the number of PX slaves used by the SQL at runtime is lower than the requested number. There are a few reasons why it can happen, the most common one being the lack of available PX slaves at the time the SQL starts. Such a downgrade can cause anything from small to catastrophic performance degradation depending on its severity (measured in %), since the CBO generates an execution plan expecting to have some specific horsepower (aka number of PX slaves) at its disposal, but then that horsepower is reduced. It's like planning how long it will take to drive from A to B with a Ferrari and then going to the garage and finding out you only have a Jetta 😀

SQLd360 aims at making it easier (and quicker) to identify if an execution was downgraded, let’s see how.

The requested DoP for a SQL is stored in the OTHER_XML column; for each PHV/source SQLd360 provides in the "SQL Performance Summary" the MIN and MAX DoP requested, which is likely to stay stable over time (exceptions being Adaptive Degree Policy or Adaptive Multi User, which aren't that common anyway).
Here is an example of a SQL with a requested DoP of 4:

[screenshot: SQL Performance Summary showing MIN/MAX requested DoP of 4]
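For reference, the requested DoP can be extracted from OTHER_XML with a query along these lines (a sketch, not the exact SQLd360 query; replace &sql_id with your SQL ID):

select sql_id, plan_hash_value,
       extractvalue(xmltype(other_xml), '/*/info[@type="dop"]') requested_dop
  from v$sql_plan
 where sql_id = '&sql_id.'
   and other_xml is not null;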

Starting with 11gR2 the DoP a SQL executed with can be extracted from column PX_FLAGS in ASH (as TRUNC(px_flags / 2097152), as already reported by Randolf here), so for each execution that made it into ASH SQLd360 reports the "execution DoP".
Here is the output from report “Elapsed Time for Recent Execs”:

[screenshot: Elapsed Time for Recent Execs report]

Column MAX_PX_DEGREE is the MAX(DoP) this specific execution used (see the note below about why MAX).
Column NUM_PROCESSES is the number of distinct ASH-sampled sessions that collaborated on this execution (up to 2*DoP).
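A sketch of how those two columns can be computed straight from ASH (simplified compared to the actual SQLd360 report):

select sql_id, sql_exec_id, sql_exec_start,
       max(trunc(px_flags / 2097152)) max_px_degree,
       count(distinct session_id || ',' || session_serial#) num_processes
  from v$active_session_history
 where sql_id = '&sql_id.'
 group by sql_id, sql_exec_id, sql_exec_start;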

From the report above it’s easy to spot how the execution from July 30 was indeed executed at the requested DoP (4) while the one from August 6 was downgraded (requested DoP 4, execution DoP 2).

Note: ASH reports the DoP by DFO tree, so if the plan had multiple DFO trees with different DoPs (uncommon but possible) just keep in mind the number you see in the report is the MAX.