基于CBO优化器谓词选择率的计算方法

选择率selectivity是Oracle的CBO中比较重要的一个概念。和selectivity经常一起提到的一个概念是cardinality，其实这两个概念的本质上是一样的。selectivity是指一个SQL操作的得出结果集占原来结果集的百分比，而cardinality就是指一个SQL操作的得出结果集的行数，CBO是通过selectivity来计算cardinality的，也就是说cardinality=selectivity*原结果集行数。
创建一个测试表

create table t1 as
select
trunc(dbms_random.value(1,13)) month_no
from all_objects
where rownum< =1200;


sys@JINGYONG> select high_value,low_value,num_distinct,num_nulls,density
  2  from dba_tab_col_statistics where owner='SYS' and table_name='t1'
  3  and column_name='MONTH_NO';

HIGH_VALUE  LOW_VALUE    NUM_DISTINCT  NUM_NULLS    DENSITY
C10D        C102         12            0            .083333333

sys@JINGYONG> select blocks,num_rows from user_tables where table_name='t1
';

    BLOCKS   NUM_ROWS
---------- ----------
         2       1200

sys@JINGYONG> show parameter db_file_multiblock_read_count

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_file_multiblock_read_count        integer     40

sys@JINGYONG> show parameter db_block_size

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_block_size                        integer     8192

选择率=所需的空间除以总可用空间
通过计算(user_tab_col_statistics.high_value – user_tab_col_statistics.low_value)可以得出,上面的例子中整个区间大小为11,当计算得到的是11,那么我们就知道其中肯定有错误.因为我们知道,上例中有12个离散的月份,但优化器采用的算法却将这些数据看作连续的变化数据,并且总区间大小为11.
单列谓词的选择率
示例1
select count(*) from t1 where month_no>8

sys@JINGYONG> select count(*) from t1 where month_no>8;

  COUNT(*)
----------
       396

sys@JINGYONG> select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  6u62fruy6276s, child number 0
-------------------------------------
select count(*) from t1 where month_no>8

Plan hash value: 3337892515

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL|       T1 |   436 |  1308 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("MONTH_NO">8)

当where month_no>8:这是一个无边界(一侧没有边界)的开(不包含8)区间.
选择率=(high_value-limit)/(high_value-low_value)=(12-8)/(12-1)=4/11
基数=1200*4/11=436.363636

成本是2
全表扫描的成本计算公式:
Cost = (
#SRds * sreadtim +
#MRds * mreadtim +
#CPUCycles / cpuspeed
) / sreadtim
where
#SRDs – number of single block reads
#MRDs – number of multi block reads
#CPUCycles – number of CPU Cycles
sreadtim – single block read time
mreadtim – multi block read time
cpuspeed – CPU cycles per second

成本指的是花费在块读取上的时间,加上花在多块读取上的时间,再加上所需的cpu处理的时间,然后将总和除以单块读取所花费的时间,也就是说,成本是语句的预计执行时间的总和,以单块读取时间单元的形式来表示.
如果oracle收集了操作系统统计信息，那么CBO采用工作量统计模式计算代价
如果oracle没有收集操作系统统计信息，那么CBO采用非工作量统计模式计算代价我们现在处于“非工作量统计模式”

sys@JINGYONG> select pname,pval1 from aux_stats$;

PNAME                               PVAL1
------------------------------ ----------
STATUS
DSTART
DSTOP
FLAGS                                   0
CPUSPEEDNW                       1149.062
IOSEEKTIM                              10
IOTFRSPEED                           4096
SREADTIM
MREADTIM
CPUSPEED
MBRC
MAXTHR
SLAVETHR

#SRds=0,因为是全表扫描，单块读为0，全都使用的是多块读
#MRds=表的块数/多块读参数=2/40=0.05
mreadtim=ioseektim+(db_file_multiblock_count*db_block_size/iotftspeed)=10+(40*8192/4096)=90
sreadtim=ioseektim+(db_block_size/iotfrspeed)=10+(8192/4096)=12
CPUCycles 等于 PLAN_TABLE里面的CPU_COST

sys@JINGYONG> explain plan for select count(*) from t1 where month_no>8;

已解释。

sys@JINGYONG> select cpu_cost from plan_table;

  CPU_COST
----------
    254243

cpuspeed 等于 CPUSPEEDNW= 1149.062
COST=(0*12/12)+(0.05*90/12)+(254243/1149.062/12/1000)(毫秒换算成秒)=
0+0.375+0.01843=0.39343
0.375是IO成本
0.01843是CPU成本
手工计算出来的COST取最接近的整数等于1和我们看到的2有差别
这是由于隐含参数_table_scan_cost_plus_one参数造成的

SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE, x.ksppdesc describ FROM x$ksppi x,x$ksppcv y
  2    WHERE x.inst_id = USERENV ('Instance')
  3     AND y.inst_id = USERENV ('Instance')
  4     AND x.indx = y.indx
  5     AND x.ksppinm LIKE '%_table_scan_cost_plus_one%';

NAME                                                                             VALUE                                                                              DESCRIB
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
 --------------------------------------------------------------------------------
_table_scan_cost_plus_one                                                        TRUE                                                                             bump estimated full table scan and index ffs cost by one

根据该参数的描述，在table full scan和index fast full scan的时候会将cost+1 即 1+1=2;

示例2
select count(*) from t1 where month_no>=8
这是一个无边界的闭(包含8)区间,因此需要对算法进行调整.具体调整过程为必须包括闭合值(此处即为8)所在的行–也就是要增加一个1/num_distinct(要注意的是oracle8i使用的是density而不是1/num_distinct,但是,如果没有获取统计信息或者没有直方图可用,那么无须注意这一差别)
选择率=(high_value-limit)/(high_value-low_value)+1/num_distinct=4/11+1/12
基数=1200*(4/11+1/12)=536.363636

sys@JINGYONG> show parameter db_file_multiblock_read_count

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_file_multiblock_read_count        integer     40

sys@JINGYONG> select count(*) from t1 where month_no>=8;

  COUNT(*)
----------
       497

sys@JINGYONG> select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  g748ttx5rv2p1, child number 0
-------------------------------------
select count(*) from t1 where month_no>=8

Plan hash value: 3337892515

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1 |   536 |  1608 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("MONTH_NO">=8)

示例3
select count(*) from t1 where month_no<8
选择率=(limit-low_value)/(high_value-low_value)=(8-1)/(12-1)=7/11
基数=1200*7/11=763.636364

sys@JINGYONG> select count(*) from t1 where month_no<8;

  COUNT(*)
----------
       703

sys@JINGYONG> select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
----------------------------------------
SQL_ID  cpvw8yxstbtng, child number 0
-------------------------------------
select count(*) from t1 where month_no<8

Plan hash value: 3337892515

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1 |   764 |  2292 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("MONTH_NO"<8)

示例4
select count(*) from t1 where month_no< =8 选择率=(limit-low_value)/(high_value-low_value)+1/num_distinct=(8-1)/(12-1)+1/12 基数=1200*(7/11+1/12)=863.636364

sys@JINGYONG> select count(*) from t1 where month_no< =8;

  COUNT(*)
----------
       804

sys@JINGYONG> select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
——————————————————————————–

SQL_ID  buhw0y52jy3nr, child number 0
————————————-
select count(*) from t1 where month_no< =8

Plan hash value: 3337892515

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1 |   864 |  2592 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("MONTH_NO"<=8)

示例5
select count(*) from t1 where month_no between 6 and 9
两边都是有界的闭区间它其实与
select count(*) from t1 where month_no>=6 and month_no< =9是等价的.它们给出了两个闭合值.因此要对算法进行两次调整. 选择率=(high_limit-low_limit)/(high_value-low_value)+1/num_distinct+1/num_distinct=(9-6)/(12-1)+1/12+1/12 基数=1200*(3/11+1/6)=527.272727

sys@JINGYONG> select count(*) from t1 where month_no between 6 and 9;

  COUNT(*)
———-
       421

sys@JINGYONG> select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
——————————————————————————–
SQL_ID  83ud81y133kxm, child number 0
————————————-
select count(*) from t1 where month_no between 6 and 9

Plan hash value: 3337892515

——————————————————————————-
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
——————————————————————————-
|   0 | SELECT STATEMENT   |          |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1 |   527 |  1581 |     2   (0)| 00:00:01 |
——————————————————————————-

Predicate Information (identified by operation id):
—————————————————



sys@JINGYONG> select count(*) from t1 where month_no>=6 and month_no< =9;

  COUNT(*)
----------
       421

sys@JINGYONG> select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
——————————————————————————–
SQL_ID  74gjsbjjdv97k, child number 0
————————————-
select count(*) from t1 where month_no>=6 and month_no< =9

Plan hash value: 3337892515

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1 |   527 |  1581 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("MONTH_NO">=6 AND “MONTH_NO”< =9))

示例7
select count(*) from t1 where month_no>=6 and month_no<9
选择率=(high_limit-low_limit)/(high_value-low_value)+1/num_distinct=(9-6)/(12-1)+1/12
基数=1200*(3/11+1/12)=427.272727

sys@JINGYONG> select count(*) from t1 where month_no>=6 and month_no<9;

  COUNT(*)
----------
       328

sys@JINGYONG> select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

SQL_ID  5w0n5237vpngb, child number 0
-------------------------------------
select count(*) from t1 where month_no>=6 and month_no<9

Plan hash value: 3337892515

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1 |   427 |  1281 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("MONTH_NO">=6 AND "MONTH_NO"<9))

示例8
select count(*) from t1 where month_no>6 and month_no< =9 选择率=(high_limit-low_limit)/(high_value-low_value)+1/num_distinct=(9-6)/(12-1)+1/12 基数=1200*(3/11+1/12)=427.272727

sys@JINGYONG> select count(*) from t1 where month_no>6 and month_no< =9;

  COUNT(*)
----------
       303

sys@JINGYONG> select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
——————————————————————————–

SQL_ID  bbd2wxjratndg, child number 0
————————————-
select count(*) from t1 where month_no>6 and month_no< =9

Plan hash value: 3337892515

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1 |   427 |  1281 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("MONTH_NO">6 AND “MONTH_NO”< =9))

示例9
select count(*) from t1 where month_no>6 and month_no<9
选择率=(high_limit-low_limit)/(high_value-low_value)=(9-6)/(12-1)
基数=1200*(3/11)=327.272727

sys@JINGYONG> select count(*) from t1 where month_no>6 and month_no<9;

  COUNT(*)
----------
       210

sys@JINGYONG> select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

SQL_ID  6axnk33swbtpw, child number 0
-------------------------------------
select count(*) from t1 where month_no>6 and month_no<9

Plan hash value: 3337892515

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1 |   327 |   981 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("MONTH_NO">6 AND "MONTH_NO"<9))

对于使用绑定变量的谓词优化器就无法知道任何变量的值也不知道绑定变量的类型,因此,这种情况下优化器
就会使用一个固定的选择率来产生执行计划.例如:

sys@JINGYONG> explain plan for select count(*) from t1 where month_no>:b1;


已解释。

sys@JINGYONG> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 3337892515

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |     1 |     3 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1 |   101 |   303 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("MONTH_NO">TO_NUMBER(:B1))



sys@JINGYONG> explain plan for select count(*) from t1 where month_no between :b1 and :b2;

已解释。

sys@JINGYONG> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 57893822

--------------------------------------------------------------------------------

| Id  | Operation           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------

|   0 | SELECT STATEMENT    |          |     1 |     3 |     2   (0)| 00:00:01 |

|   1 |  SORT AGGREGATE     |          |     1 |     3 |            |          |

|*  2 |   FILTER            |          |       |       |            |          |

|*  3 |    TABLE ACCESS FULL| T1 |   101 |   303 |     2   (0)| 00:00:01 |

--------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_NUMBER(:B1)< =TO_NUMBER(:B2))
   3 - filter("MONTH_NO">=TO_NUMBER(:B1) AND "MONTH_NO"< =TO_NUMBER(:B2))



sys@JINGYONG> explain plan for select count(*) from t1 where month_no>=:b1
;

已解释。

sys@JINGYONG> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 3337892515

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |     1 |     3 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1 |   101 |   303 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("MONTH_NO">=TO_NUMBER(:B1))

已选择14行。

sys@JINGYONG> explain plan for select count(*) from t1 where month_no<:b1 ;


已解释。

sys@JINGYONG> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 3337892515

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |     1 |     3 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1 |   101 |   303 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("MONTH_NO" explain plan for select count(*) from t1 where month_no< =:b1
;

已解释。

sys@JINGYONG> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 3337892515

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |     1 |     3 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1 |   101 |   303 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("MONTH_NO"< =TO_NUMBER(:B1))

已选择14行。

sys@JINGYONG> explain plan for select count(*) from t1 where month_no>=:b1
 and month_no< =:b2;

已解释。

sys@JINGYONG> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 57893822

--------------------------------------------------------------------------------

| Id  | Operation           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------

|   0 | SELECT STATEMENT    |          |     1 |     3 |     2   (0)| 00:00:01 |

|   1 |  SORT AGGREGATE     |          |     1 |     3 |            |          |

|*  2 |   FILTER            |          |       |       |            |          |

|*  3 |    TABLE ACCESS FULL| T1 |   101 |   303 |     2   (0)| 00:00:01 |

--------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_NUMBER(:B1)< =TO_NUMBER(:B2))
   3 - filter("MONTH_NO">=TO_NUMBER(:B1) AND "MONTH_NO"< =TO_NUMBER(:B2))

已选择16行。



sys@JINGYONG> explain plan for select count(*) from t1 where month_no>=:b1
 and month_no<:b2 ;

已解释。

sys@JINGYONG> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 57893822

--------------------------------------------------------------------------------

| Id  | Operation           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------

|   0 | SELECT STATEMENT    |          |     1 |     3 |     2   (0)| 00:00:01 |

|   1 |  SORT AGGREGATE     |          |     1 |     3 |            |          |

|*  2 |   FILTER            |          |       |       |            |          |

|*  3 |    TABLE ACCESS FULL| T1 |   101 |   303 |     2   (0)| 00:00:01 |

--------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_NUMBER(:B1)=TO_NUMBER(:B1) AND "MONTH_NO" explain plan for select count(*) from t1 where month_no>:b1
and month_no<:b2 ;

已解释。

sys@JINGYONG> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 57893822

--------------------------------------------------------------------------------

| Id  | Operation           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------

|   0 | SELECT STATEMENT    |          |     1 |     3 |     2   (0)| 00:00:01 |

|   1 |  SORT AGGREGATE     |          |     1 |     3 |            |          |

|*  2 |   FILTER            |          |       |       |            |          |

|*  3 |    TABLE ACCESS FULL| T1 |   101 |   303 |     2   (0)| 00:00:01 |

--------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_NUMBER(:B1)TO_NUMBER(:B1) AND "MONTH_NO" explain plan for select count(*) from t1 where month_no>:b1
and month_no< =:b2;

已解释。

sys@JINGYONG> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 57893822

--------------------------------------------------------------------------------

| Id  | Operation           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------

|   0 | SELECT STATEMENT    |          |     1 |     3 |     2   (0)| 00:00:01 |

|   1 |  SORT AGGREGATE     |          |     1 |     3 |            |          |

|*  2 |   FILTER            |          |       |       |            |          |

|*  3 |    TABLE ACCESS FULL| T1 |   101 |   303 |     2   (0)| 00:00:01 |

--------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_NUMBER(:B1)TO_NUMBER(:B1) AND "MONTH_NO"< =TO_NUMBER(:B2))

对于上面所有使用绑定变量的查询其评估的基数都是101使用固定选择率

当查询超出列的最低/最高界限测试的结果如下.其评估的基数是100使用固定选择率

sys@JINGYONG> select count(*) from t1 where month_no between 14 and 17;

  COUNT(*)
----------
         0

sys@JINGYONG> select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  8baayn49yujkh, child number 0
-------------------------------------
select count(*) from t1 where month_no between 14 and 17

Plan hash value: 3337892515

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1 |   100 |   300 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("MONTH_NO">=14 AND "MONTH_NO"< =17))


已选择19行。

sys@JINGYONG> select count(*) from t1 where month_no between 18 and 21;

  COUNT(*)
----------
         0

sys@JINGYONG> select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  9agzhp783caa1, child number 0
-------------------------------------
select count(*) from t1 where month_no between 18 and 21

Plan hash value: 3337892515

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1 |   100 |   300 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("MONTH_NO">=18 AND "MONTH_NO"< =21))


已选择19行。

sys@JINGYONG> select count(*) from t1 where month_no between 24 and 27;

  COUNT(*)
----------
         0

sys@JINGYONG> select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  dksv39k0vx5sh, child number 0
-------------------------------------
select count(*) from t1 where month_no between 24 and 27

Plan hash value: 3337892515

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1 |   100 |   300 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("MONTH_NO">=24 AND "MONTH_NO"< =27))


已选择19行。

sys@JINGYONG> select count(*) from t1 where month_no between -11 and 0;

  COUNT(*)
----------
         0

sys@JINGYONG> select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  drdqd21q24hzy, child number 0
-------------------------------------
select count(*) from t1 where month_no between -11 and 0

Plan hash value: 3337892515

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1 |   100 |   300 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("MONTH_NO"< =0 AND "MONTH_NO">=(-11)))


已选择19行。

双谓词选择率
为了能够计算通用的联合谓词选择率,需要用到以下3种基本公式,它们是以选择率的形式给出的,而不是以基数的形式给出的.
(谓词1 and 谓词2)的选择率=谓词1的选择率+谓词2的选择率
(谓词1 or 谓词2)的选择率=谓词1的选择率+谓词2的选择率-(谓词1 and 谓词2)的选择率
(not 谓词1)的选择率=1-谓词1的选择率,绑定变量不能使用这个公式进行计算

下面是一组概率论方面的公式
p(a and b)=p(a)*p(b)
p(a or b)=p(a)+p(b)-p(a and b)=p(a)+p(b)-p(a)*p(b)
p(not a)=1-p(a)
上面的联合谓词选择率与概率论公式是等价的.

首先,我们来细致研究如下的where子句:
select count(*) from T1 where month_no>8 or month_no< =8; 利用简单的选择率计算公式required range/total range可以得出谓词1的选择率为(12-8)/(12-1)=4/11=0.363636 同样的.谓词2的选择率为(8-1)/(12-1)+1/12=7/11+1/12=0.719696 (p1 or p2)的选择率计算公式为p1的选择率+p2的选择率-(p1 and p2)的选择率,因此,联合的选择率为 0.363636+0.719696-(0.363636*0.719696)=0.8216 将这个选择率乘以1200行,进行四舍五入后,得到结果后为986,很明显,这个结果和人的直观思维得到的结果不一样. 当我们将多个谓词应用到某个表时,需要确定测试的列之间是否存在依赖关系.如果存在依赖关系,优化器得到的选择率将是错误的,基数也就是错误的,因此也就无法得到合适的执行计划. 小结: 为了估计一组谓词返回数据的行数,优化器首先计算选择率(返回数据行数的分数),然后将其与输入行数相乘.对于单个列上的单个谓词，优化器将利用不同值的数目或density作为计算谓词选择率的基础.对于单个列上基于的区间谓词,优化器采用分数required range/total available range进行一些端点值调整的方式来计算谓词选择率.对于包含绑定变量的区间谓词来说,优化器使用硬编码常量作为选择率优化器通过使用类似于计算独立事件的联合概率方式的公式来计算联合谓词的选择率.如果列中包含的数据集并不是相互独立的话,在选择率的计算方面将会出现误差,从而导致基数出现误差.

发表评论 取消回复

发表评论取消回复