Mauro Pagano's Blog

How to find file and block# to dump in Exadata

Leave a comment

Probably because of my past with Oracle Support, I find myself collecting traces and dumps on a regular basis. For example every time I see a row source operation that I would expect to perform only multiblock reads (e.g. Full Table Scan) doing single block ones I collect some block dumps to understand why that is happening. It’s just a matter of looking at the raw SQL trace file (or P1/P2 from ASH for that matter) for the unexpected single block reads, grab file# and block# and dump the block.

Single block reads in Exadata become “cell single block physical read” and the P1/P2 for the wait event don’t show the file#/block# but rather then cellhash# and diskhash# where the data came from. This is a little annoying to me because I can’t just grab cellhash#/diskhash# (plus bytes, that matches with the block size being a single block read) and dump that AS FAR AS I KNOW (corrections are very welcome here).

The way I usually work it around is using with ASM info enabling KFKIO tracing (trace[KFKIO] using the 11g new syntax) before running the “guilty SQL” so that each read now looks like this

...
2016-05-05 19:05:41.351021 : KFK_IODONE_OSS:iop:0x7f1841034000 bufp:0x1124ae000 op:1 au:120555 dsk:0x2 err:1 osderr:0 ossp:0x7f1841166fc8 ossp->errcode_oss_result:(nil) io elapsed time: 0 usec time waited on io: 0 usec
2016-05-05 19:05:41.351052 : kfk_iodone_oss_retry::in iop:0x7f1841034000
2016-05-05 19:05:41.351062 : kfk_iodone_oss_retry::out retry:0
2016-05-05 19:05:41.351071 : KFK:IODONE:iop:0x7f1841034000 wt:1 sstype:1 mastp:(nil) result:1 path:o/192.168.12.5/DATA_CD_01_enkx3cel03 au_offset:2678784 operation:1 reason:3347 iosize:8192 bufp:0x1124ae000
WAIT #139742149463208: nam='cell single block physical read' ela= 346 cellhash#=379339958 diskhash#=3782402265 bytes=8192 obj#=115376 tim=1462493141351085
...

What I care about is:

  • path:o/192.168.12.5/DATA_CD_01_enkx3cel03 -> where the data is stored
  • au:120555 -> the ASM allocation unit where my block is stored in
  • au_offset:2678784 -> how far in the allocation unit my block is

using this info I can go to the ASM instance (reason why ASM instance is becaused  X$KFFXP isn’t exposed in the RDBMS instance) and using the following SQL extract file name and block number

SELECT a.name file_name, 
       xnum_kffxp * (&&au_size./&&db_block_size.) + &&au_offset./&&db_block_size. block_num
  FROM x$kffxp x, 
       v$asm_alias a
 WHERE (x.disk_kffxp, x.group_kffxp) IN (SELECT number_kfdsk, grpnum_kfdsk 
                                           FROM x$kfdsk 
                                          WHERE path_kfdsk = '&&path.')
   AND x.au_kffxp = &&au.
   AND x.number_kffxp = a.file_number
   AND x.incarn_kffxp = a.file_incarnation;

The SQL will ask for 5 parameters, 3 of which are already available while the other two are:

  • au_size -> Allocation Unit size -> V$ASM_DISKGROUP.ALLOCATION_UNIT_SIZE
  • db_block_size -> the DB_BLOCK_SIZE

So for example using the values from above plus 4194304 au_size and 8192 bytes block size the result will be

FILE_NAME                                            BLOCK_NUM
-------------------------------------------------- -----------
USERS.1155.861540195                                   3946823

Disclaimer: the SQL could take a little while since it needs to scan the list of extents.

I know it’s a little convoluted but as of now I haven’t found a more efficient way to identify file/block to dump starting from cellhash#/diskhash#.

Goes without saying I’d love to hear smarter ways to get it done 🙂

UPDATE: Stefan Koehler (who is awesome on top of being a friend) made a very valid point on Twitter, event 10200 can lead to the same info without having to go to the ASM instance. I discarded the event when I started looking at this because it was a little too verbose but then I ended up with KFKIO that is even more verbose!  😦
Stupid me and thanks to Stefan!

An output of event 10200 looks like this

ktrget2(): started for block  0x0005 : 0x05fc3947 objd: 0x0001c2b0
...
WAIT #139650546118344: nam='cell single block physical read' ela= 380 cellhash#=379339958 diskhash#=3782402265 bytes=8192 obj#=115376 tim=1462544482448025

where 05fc3947 is the DBA, 100415815 in decimal, and using DBMS_UTILITY we can extract file and block# from it

SELECT dbms_utility.data_block_address_file(100415815) file_n,
       dbms_utility.data_block_address_block(100415815) block_n
  FROM dual;

    FILE_N    BLOCK_N
---------- ----------
        23    3946823

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s