Mauro Pagano's Blog

Manually holding an enqueue with SystemTap


This week I stumbled into a system that exposed some contention on an enqueue I had never seen cause trouble before, the MS enqueue. The description from V$LOCK_TYPE is pretty straightforward:

SQL> select type, name, id1_tag, id2_tag, description from v$lock_type where type = 'MS';

TYP NAME                           ID1_TAG                        ID2_TAG    DESCRIPTION
--- ------------------------------ ------------------------------ ---------- -----------------------------------------------------------
MS  Materialized View Refresh Log  master object #                0          Lock held during materialized view refresh to setup MV log

So it’s easy to get an idea of which components are involved, but I wanted to dig a little deeper into it.

Spoiler alert, this post is not about enq MS contention 🙂

I wanted to be able to hold an enqueue for as long as I liked, so that I could fire up other sessions and see under which conditions they would end up waiting on the enqueue holder. Basically an enqueue version of what is described here for latches (as usual, amazing job by Tanel and Andrey!).

There are already a couple of DTrace scripts (that I found, there are probably more) that achieve similar tasks (here and here); in particular, the one from Mikhail does exactly what I wanted. However, my VM runs Linux (OEL7) and I have no DTrace to play with, so I went for SystemTap.

This is not as sophisticated as the latch mechanism, where we can use oradebug to grab whatever latch we want. In this case we need to make our session voluntarily “grab the resource”, and the script just prevents the session from releasing it. Also, I didn’t need all the features in Mikhail’s script (e.g. blocking only for a specific obj#), so I kept it simple to start.

I just needed to identify the enqueue for my resource and stop the process from releasing it once done with it (I’ll skip the description of the structures involved since they are already explained in both Tanel’s and Mikhail’s scripts).

The MS enqueue is X$KSIRESTYP.INDX 167 in 12.1.0.2:

SQL> select rest.indx i, rest.resname type from X$KSIRESTYP rest, X$KSQEQTYP eqt where (rest.inst_id = eqt.inst_id) and   (rest.indx = eqt.indx) and   (rest.indx > 0) and rest.resname = 'MS';

         I TYP
---------- ---
       167 MS

so the script looks something like this

[oracle@localhost systemtap]$ cat trace_eq.stp
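# Address of the enqueue currently being tracked (0 = none)
global enqAddress
# X$KSIRESTYP.INDX of the enqueue type to hold (167 = MS in 12.1.0.2)
global target_eq = 167;
# Time (microseconds since epoch) at which the enqueue was requested
global getTime

# Enqueue get: remember the enqueue address when the lock type matches
probe process("oracle").function("ksqgtlctx") {
    if (int_arg(5) == target_eq) {
        enqAddress = s64_arg(1);
        getTime    = gettimeofday_us();
        printf("%d function=%s getting enqueue: pid=%d enqAddress=0x%x locktype#=%d\n", getTime, usymname(ustack(0)), pid(), enqAddress, int_arg(5));
    }
}

# Enqueue release: the wildcard covers the different release functions
probe process("oracle").function("ksqr*") {
    if (s64_arg(1) == enqAddress) {
        releaseTime = gettimeofday_us();
        heldTime = releaseTime - getTime;
        printf("%d function=%s releasing enqueue: pid=%d enqAddress=0x%x held for=%d us \n", releaseTime, usymname(ustack(0)), pid(), s64_arg(1), heldTime);
        # Stop the process on entry to the release function (requires guru mode, stap -g)
        raise(%{ SIGSTOP %});
        enqAddress = 0;
    }
}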

In a nutshell, the script keeps track of the address of the enqueue protecting the resource of interest (type 167 here) and stops the process (SIGSTOP) as soon as it tries to release it. The release probe uses the wildcard “ksqr*” because different functions release different enqueues (e.g. MS is released by ksqrcli, TM is released by ksqrcl).

Assuming my user process is PID 4202 (sid 142), we can make it stop while holding MS just by loading the stap script

[oracle@localhost systemtap]$ stap -g trace_eq.stp -x 4202

and trigger a refresh (since the enq MS “has something to do” with MV refresh from what V$LOCK_TYPE told me) from the user session pid 4202

SQL> exec dbms_snapshot.refresh('TEST_MV');

on the stap terminal you’ll see something like

1436194076252628 function=ksqgtlctx getting enqueue: pid=4202 enqAddress=0x86c38a70 locktype#=167
1436194076295783 function=ksqrcli releasing enqueue: pid=4202 enqAddress=0x86c38a70 held for=43155 us

and the session PID 4202 will indeed be stopped, while holding the MS enqueue

SQL> select * from v$lock where sid = 142;

       SID TYP        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
---------- --- ---------- ---------- ---------- ---------- ---------- ----------
       142 AE         133          0          4          0      69961          0
       142 MS       96149          0          6          0        352          0
       142 JI      103680          0          6          0        353          0
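
As a quick sanity check on the stap output above, the "held for" value is just the difference between the two microsecond timestamps. A minimal sketch in shell, with the two values copied from the output above (the variable names are only for illustration):

```shell
# Timestamps (microseconds since epoch) copied from the stap output
get_us=1436194076252628       # ksqgtlctx: enqueue requested
release_us=1436194076295783   # ksqrcli: enqueue about to be released

# Shell arithmetic is 64-bit, so these values fit comfortably
held_us=$(( release_us - get_us ))
echo "held for ${held_us} us"   # prints "held for 43155 us"
```

This matches the 43155 us reported by the script, i.e. roughly 43 ms between the get and the attempted release.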

Now I can fire up all the sessions I want and see under which conditions they get blocked, with which stack, etc.
To make the user session resume, just CTRL+C the stap script and send a SIGCONT signal to the process (kill -s SIGCONT <pid>).
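
The stop/resume mechanics are plain signal handling and can be tried without Oracle. The following is only a sketch: a throwaway sleep process stands in for the server process (an assumption for illustration), and it relies on a Linux ps that supports the state output field.

```shell
# Stand-in for the Oracle server process (hypothetical, for illustration only)
sleep 300 &
pid=$!

# Same effect as the raise(SIGSTOP) in the stap script: freeze the process
kill -s STOP "$pid"
sleep 1
state_stopped=$(ps -o state= -p "$pid" | tr -d ' ')   # "T" means stopped

# What we do by hand after CTRL+C on the stap script: let it run again
kill -s CONT "$pid"
sleep 1
state_resumed=$(ps -o state= -p "$pid" | tr -d ' ')   # no longer "T"

echo "stopped state: $state_stopped, resumed state: $state_resumed"

# Clean up the stand-in process
kill "$pid"
wait "$pid" 2>/dev/null || true
```

The same ps check against the real server PID is a handy way to confirm that the session is actually stopped before firing the other sessions.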

This is just a very raw script to make a process stop while holding an enqueue. It has a lot of limitations, and several refinements would be needed to match the functionality of the other scripts mentioned before, so don’t expect much from it 🙂
