Mauro Pagano's Blog

Histograms come histograms go

1 Comment

Actually the title is likely incorrect, in the sense that I think histograms don’t “go” once they come. Or at least I’ve never seen a case where consistently gathering statistics using method_opt “SIZE AUTO” makes a histogram go away sometime after the same syntax made it show up (while it’s common for a histogram to just show up sometime in the future because of column usage).

Recently I’ve seen several systems where custom statistics gathering procedures were coupled with the automatic stats gathering job (voluntarily or not is another story… ūüėÄ ) triggering all sort of weirdness in the statistics. Reason was¬†a syntax mismatch between¬†the custom job (method_opt “SIZE 1”) and¬†the automatic job (default is¬†method_opt “SIZE AUTO”).

A reliable way to figure out if histograms popped up / disappeared is to look at WRI$_OPTSTAT_HISTGRM_HISTORY, too bad this table is usually large and expensive to access (and not partitioned).
Another table is available though, table WRI$_OPTSTAT_HISTHEAD_HISTORY, which is the historical version of the dictionary table HIST_HEAD$ (1 row per histogram) but while in the dictionary there are BUCKET_CNT and ROW_CNT columns (so it’s easy to spot a histogram)¬†such columna are gone once the info are stored in WRI$.¬†Wouldn’t it be nicer to just have a column in¬†WRI$_OPTSTAT_HISTHEAD_HISTORY saying “histogram was in place” ?

Let’s see if we can figure out a (approximate) way that can save us a trip to¬†WRI$_OPTSTAT_HISTGRM_HISTORY.

Table WRI$_OPTSTAT_HISTHEAD_HISTORY has a column FLAGS that is populated using the following expression in 11.2.0.4 (extracted from a 10046 trace of DBMS_STATS.GATHER_TABLE_STATS)


bitand(h.spare2,7) + 8 + decode(h.cache_cnt,0,0,64)

while it’s tricky and risky to guess¬†what SPARE2 might store it should be pretty easy to play with CACHE_CNT, we just need to gather stats with and without histograms to see how the value changes.

SQL> create table test_flag (n1 number);
SQL> insert into test_flag select mod(rownum,200) n1 from dual connect by rownum <= 10000; SQL> insert into test_flag select * from test_flag;
--repeat it a few times to make the data grow
SQL> commit;
SQL> exec dbms_stats.gather_table_stats('MPAGANO','TEST_FLAG',method_opt=>'FOR ALL COLUMNS SIZE 254');
SQL> select obj#, col#, bucket_cnt, row_cnt, cache_cnt, null_cnt, sample_size from sys.hist_head$ where obj# in (select object_id from dba_objects where object_name = 'TEST_FLAG');

      OBJ#       COL# BUCKET_CNT    ROW_CNT  CACHE_CNT   NULL_CNT SAMPLE_SIZE
---------- ---------- ---------- ---------- ---------- ---------- -----------
     24897          1       5487        200         10          0        5487

SQL> exec dbms_stats.gather_table_stats('MPAGANO','TEST_FLAG',method_opt=>'FOR ALL COLUMNS SIZE 1');
SQL> select obj#, col#, bucket_cnt, row_cnt, cache_cnt, null_cnt, sample_size from sys.hist_head$ where obj# in (select object_id from dba_objects where object_name = 'TEST_FLAG');

      OBJ#       COL# BUCKET_CNT    ROW_CNT  CACHE_CNT   NULL_CNT SAMPLE_SIZE
---------- ---------- ---------- ---------- ---------- ---------- -----------
     24897          1          1          0          0          0     2560000

In the first run we gathered stats with “SIZE AUTO” and a 200 buckets histogram was created, sample size ~=5500¬†is not surprising and it matches with AUTO_SAMPLE_SIZE used. In this case CACHE_CNT was 10.

In the second run we gathered stats with “SIZE 1” and no histogram was created. In this case CACHE_CNT was 0.

I couldn’t find with 100% accuracy what CACHE_CNT represents (actually if any reader knows it please comment this post) but my educated¬†guess is that’s number of entries to be cached in the row cache; unfortunately this is really just a guess since I didn’t even find a parameter to change this value (again, whoever knows better please let me know).

I ran multiple tests and it seems like CACHE_CNT has a value larger than 0 only when a histogram is present (assuming my guess of what is means is accurate), thus the value of WRI$_OPTSTAT_HISTHEAD_HISTORY.FLAG is bumped by 64 only when a histogram is present.
With this info in mind it’s now pretty easy to spot if a histogram was present in the past just looking at¬†¬†WRI$_OPTSTAT_HISTHEAD_HISTORY.FLAG without having to access the larger¬†WRI$_OPTSTAT_HISTGRM_HISTORY.

SQLd360 v1526 has a new column HAD_HISTOGRAM added to the Column Statistics Version report built exactly on the assumptions made here.

Feedbacks welcome as usual, especially if they can prove wrong what I just said ūüėÄ

One thought on “Histograms come histograms go

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s