最近在优化某系统中发现许多SQL语句在书写谓词条件(wheret条件)时完全不根据表结构定义的字段数据类型来,而是随意书写谓词条件,这样造成原来能走正确索引的结果不能使用该索引,其结果就是查询语句的性能很差,这里将我所遇到的两种情况介绍一下.
第一种情况是谓词条件进行了数据类型的转换转换使得CBO无法使用索引:
其SQL语句如下所示,该SQL的功能是统计一年社保中心一年内由于各种伤害或骨折所发生的医疗费用
select a.hospital_id,
c.hospital_name,
count(distinct a.serial_no) rc,
round(sum(b.real_pay), 2) ylfyze,
round(sum(case
when b.fund_id in ('001') then
b.real_pay
else
0
end),
2) tczc,
round(sum(case
when b.fund_id in ('201') then
b.real_pay
else
0
end),
2) zffy,
round(sum(case
when b.fund_id in ('003', '999') then
b.real_pay
else
0
end),
2) yyzf
from mt_biz_fin a, mt_pay_record_fin b, bs_hospital c, bs_disease d
where a.hospital_id = b.hospital_id
and a.serial_no = b.serial_no
and a.hospital_id = c.hospital_id
and a.fin_disease = d.icd
and d.center_id = a.center_id
and a.valid_flag = 1
and b.valid_flag = 1
and a.biz_type = 12
and a.pers_type in (1, 2)
and (d.disease like '%伤%' or d.disease like '%骨折%')
and a.center_id = '430740'
and to_char(a.fin_date, 'yyyymmdd') >= '20140101'
and to_char(a.fin_date, 'yyyymmdd') < = '20141231'
group by a.hospital_id, c.hospital_name
order by a.hospital_id
上述SQL执行情况如下,其执行时间为4分40秒
SQL> set timing on
SQL> set autotrace traceonly
SQL> select c.hospital_id,
2 c.hospital_name,
3 count(distinct a.serial_no) rc,
4 round(sum(b.real_pay), 2) ylfyze,
5 round(sum(case
6 when b.fund_id in ('001') then
7 b.real_pay
8 else
9 0
10 end),
11 2) tczc,
12 round(sum(case
13 when b.fund_id in ('201') then
14 b.real_pay
15 else
16 0
17 end),
18 2) zffy,
19 round(sum(case
20 when b.fund_id in ('003', '999') then
21 b.real_pay
22 else
23 0
24 end),
25 2) yyzf
26 from mt_biz_fin a, mt_pay_record_fin b, bs_hospital c, bs_disease d
27 where a.hospital_id = b.hospital_id
28 and a.serial_no = b.serial_no
29 and a.hospital_id = c.hospital_id
30 and a.fin_disease = d.icd
31 and d.center_id = a.center_id
32 and a.valid_flag = 1
33 and b.valid_flag = 1
34 and a.biz_type = 12
35 and a.pers_type in (1, 2)
36 and (d.disease like '%伤%' or d.disease like '%骨折%')
37 and a.center_id = '430740'
38 and to_char(a.fin_date, 'yyyymmdd') >= '20140101'
39 and to_char(a.fin_date, 'yyyymmdd') < = '20141231'
40 group by c.hospital_id, c.hospital_name
41 order by c.hospital_id
42 ;
Elapsed: 00:04:39.59
Execution Plan
----------------------------------------------------------
Plan hash value: 1467084556
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 148 | 4254 (20)| 00:00:04 |
| 1 | SORT GROUP BY | | 1 | 148 | 4254 (20)| 00:00:04 |
|* 2 | TABLE ACCESS BY INDEX ROWID | MT_PAY_RECORD_FIN | 1 | 31 | 1 (0)| 00:00:01 |
| 3 | NESTED LOOPS | | 1 | 148 | 4252 (20)| 00:00:04 |
| 4 | NESTED LOOPS | | 1 | 117 | 4251 (20)| 00:00:04 |
| 5 | NESTED LOOPS | | 3 | 252 | 4250 (20)| 00:00:04 |
| 6 | INDEX FULL SCAN | IDX_BS_HOSPITAL_NAME | 1227 | 39264 | 2 (0)| 00:00:01 |
|* 7 | TABLE ACCESS BY INDEX ROWID| MT_BIZ_FIN | 1 | 52 | 3 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | PK_MT_BIZ_FIN | 1 | | 3 (0)| 00:00:01 |
|* 9 | TABLE ACCESS BY INDEX ROWID | BS_DISEASE | 1 | 33 | 1 (0)| 00:00:01 |
|* 10 | INDEX RANGE SCAN | INX_BS_DISEASE_01 | 1 | | 1 (0)| 00:00:01 |
|* 11 | INDEX RANGE SCAN | I_MT_PAY_RECORD_FIN_1| 1 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_NUMBER("B"."VALID_FLAG")=1)
7 - filter(TO_NUMBER("A"."VALID_FLAG")=1 AND (TO_NUMBER("A"."PERS_TYPE")=1 OR
TO_NUMBER("A"."PERS_TYPE")=2))
8 - access("A"."HOSPITAL_ID"="C"."HOSPITAL_ID" AND "A"."CENTER_ID"='430740')
filter("A"."CENTER_ID"='430740' AND TO_NUMBER("A"."BIZ_TYPE")=12 AND
TO_CHAR(INTERNAL_FUNCTION("A"."FIN_DATE"),'yyyymmdd')>='20140101' AND
TO_CHAR(INTERNAL_FUNCTION("A"."FIN_DATE"),'yyyymmdd')< ='20141231')
9 - filter("D"."DISEASE" LIKE '%伤%' OR "D"."DISEASE" LIKE '%骨折%')
10 - access("D"."CENTER_ID"='430740' AND "A"."FIN_DISEASE"="D"."ICD")
11 - access("A"."HOSPITAL_ID"="B"."HOSPITAL_ID" AND "A"."SERIAL_NO"="B"."SERIAL_NO")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
161233 consistent gets
83048 physical reads
624 redo size
1197 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
5 rows processed
上述SQL对于表BS_HOSPITAL只查询了hospital_name列,而在BS_HOSPITAL表中存在索引IDX_BS_HOSPITAL_NAME(hospital_name,hospital_id)所以首先对IDX_BS_HOSPITAL_NAME索引全扫描这样就不用再回表查询从索引中就是得到hospital_name列的值作为结果集1。再通过对MT_BIZ_FIN表执行索引(PK_MT_BIZ_FIN)范围扫描,再回表查询返回其记录作为结果集2,再以结果集1作为驱动表进行嵌套循环连接。再与表BS_DISEASE,I_MT_PAY_RECORD_FIN_1执行嵌套循环连接,再执行分组排序。其实在MT_BIZ_FIN表中存在复合索引INDI_MT_BIZ_FIN_F_H(FIN_DATE,HOSPITAL_ID,BIZ_TYPE, TREATMENT_TYPE, CENTER_ID),而查询条件中用到了find_date,hospital_id,biz_type,center_id,只是这里因为谓词条件中对于fin_date条件是to_char(a.fin_date, 'yyyymmdd') >= '20140101' and to_char(a.fin_date, 'yyyymmdd') < = '20141231',而fin_date(费用完成时间)是日期类型,这里将find_date转换成字符型所以没有办法使用索引INDI_MT_BIZ_FIN_F_H。 将to_char(a.fin_date, 'yyyymmdd') >= '20140101' and to_char(a.fin_date, 'yyyymmdd') < = '20141231'条件改写成 a.fin_date between to_date('20140101','yyyymmdd') and to_date('20141231','yyyymmdd') ,改写后其SQL语句如下所示:
select c.hospital_id,
c.hospital_name,
count(distinct a.serial_no) rc,
round(sum(b.real_pay), 2) ylfyze,
round(sum(case
when b.fund_id in ('001') then
b.real_pay
else
0
end),
2) tczc,
round(sum(case
when b.fund_id in ('201') then
b.real_pay
else
0
end),
2) zffy,
round(sum(case
when b.fund_id in ('003', '999') then
b.real_pay
else
0
end),
2) yyzf
from mt_biz_fin a, mt_pay_record_fin b, bs_hospital c, bs_disease d
where a.hospital_id = b.hospital_id
and a.serial_no = b.serial_no
and a.hospital_id = c.hospital_id
and a.fin_disease = d.icd
and d.center_id = a.center_id
and a.valid_flag = 1
and b.valid_flag = 1
and a.biz_type = 12
and a.pers_type in (1, 2)
and (d.disease like '%伤%' or d.disease like '%骨折%')
and a.center_id = '430740'
and a.fin_date between to_date('20140101','yyyymmdd') and to_date('20141231','yyyymmdd')
group by c.hospital_id, c.hospital_name
order by c.hospital_id
来实际执行一次,其执行结果如下所示,现在执行时间稳定在1-2秒之间,能满足客户要求。
SQL> set autotrace traceonly
SQL> select c.hospital_id,
2 c.hospital_name,
3 count(distinct a.serial_no) rc,
4 round(sum(b.real_pay), 2) ylfyze,
5 round(sum(case
6 when b.fund_id in ('001') then
7 b.real_pay
8 else
9 0
10 end),
11 2) tczc,
12 round(sum(case
13 when b.fund_id in ('201') then
14 b.real_pay
15 else
16 0
17 end),
18 2) zffy,
19 round(sum(case
20 when b.fund_id in ('003', '999') then
21 b.real_pay
22 else
23 0
24 end),
25 2) yyzf
26 from mt_biz_fin a, mt_pay_record_fin b, bs_hospital c, bs_disease d
27 where a.hospital_id = b.hospital_id
28 and a.serial_no = b.serial_no
29 and a.hospital_id = c.hospital_id
30 and a.fin_disease = d.icd
31 and d.center_id = a.center_id
32 and a.valid_flag = 1
33 and b.valid_flag = 1
34 and a.biz_type = 12
35 and a.pers_type in (1, 2)
36 and (d.disease like '%伤%' or d.disease like '%骨折%')
37 and a.center_id = '430740'
38 and a.fin_date between to_date('20140101','yyyymmdd') and to_date('20141231','yyyymmdd')
39 group by c.hospital_id, c.hospital_name
40 order by c.hospital_id
41 ;
Elapsed: 00:00:01.02
Execution Plan
----------------------------------------------------------
Plan hash value: 1467084556
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17 | 2516 | 1529 (15)| 00:00:02 |
| 1 | SORT GROUP BY | | 17 | 2516 | 1529 (15)| 00:00:02 |
|* 2 | TABLE ACCESS BY INDEX ROWID | MT_PAY_RECORD_FIN | 1 | 31 | 1 (0)| 00:00:01 |
| 3 | NESTED LOOPS | | 17 | 2516 | 1528 (15)| 00:00:02 |
| 4 | NESTED LOOPS | | 33 | 3861 | 1521 (15)| 00:00:02 |
| 5 | NESTED LOOPS | | 354 | 29736 | 1450 (16)| 00:00:02 |
| 6 | INDEX FULL SCAN | IDX_BS_HOSPITAL_NAME | 1227 | 39264 | 2 (0)| 00:00:01 |
|* 7 | TABLE ACCESS BY INDEX ROWID| MT_BIZ_FIN | 1 | 52 | 1 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | INDI_MT_BIZ_FIN_F_H | 1 | | 1 (0)| 00:00:01 |
|* 9 | TABLE ACCESS BY INDEX ROWID | BS_DISEASE | 1 | 33 | 1 (0)| 00:00:01 |
|* 10 | INDEX RANGE SCAN | INX_BS_DISEASE_01 | 1 | | 1 (0)| 00:00:01 |
|* 11 | INDEX RANGE SCAN | I_MT_PAY_RECORD_FIN_1| 1 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_NUMBER("B"."VALID_FLAG")=1)
7 - filter(TO_NUMBER("A"."VALID_FLAG")=1 AND (TO_NUMBER("A"."PERS_TYPE")=1 OR
TO_NUMBER("A"."PERS_TYPE")=2))
8 - access("A"."HOSPITAL_ID"="C"."HOSPITAL_ID" AND "A"."FIN_DATE">=TO_DATE(' 2014-01-01
00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "A"."CENTER_ID"='430740' AND "A"."FIN_DATE"< =TO_DATE('
2014-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
filter("A"."CENTER_ID"='430740' AND TO_NUMBER("A"."BIZ_TYPE")=12)
9 - filter("D"."DISEASE" LIKE '%伤%' OR "D"."DISEASE" LIKE '%骨折%')
10 - access("D"."CENTER_ID"='430740' AND "A"."FIN_DISEASE"="D"."ICD")
11 - access("A"."HOSPITAL_ID"="B"."HOSPITAL_ID" AND "A"."SERIAL_NO"="B"."SERIAL_NO")
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
71411 consistent gets
0 physical reads
0 redo size
1197 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
5 rows processed
第二种谓词条件的数据类型隐式转换无法使用索引的情况,其原始SQL语句如下所示,查询一个医疗机构的费用支出情况
select a.hospital_id,
count(distinct a.serial_no) rc,
round(sum(b.real_pay), 2) ylfyze,
round(sum(case
when b.fund_id in ('001') then
b.real_pay
else
0
end),
2) tczc,
round(sum(case
when b.fund_id in ('201') then
b.real_pay
else
0
end),
2) zffy,
round(sum(case
when b.fund_id in ('003', '999') then
b.real_pay
else
0
end),
2) yyzf
from mt_biz_fin a, mt_pay_record_fin b
where a.hospital_id = b.hospital_id
and a.serial_no = b.serial_no
and a.valid_flag = '1'
and b.valid_flag = '1'
and a.biz_type = '12'
and a.pers_type in ('1', '2')
and b.hospital_id=4307000231
group by a.hospital_id
该SQL的执行计划如下所示,执行了1分22秒:
SQL> set autotrace traceonly
SQL> select a.hospital_id,
2 count(distinct a.serial_no) rc,
3 round(sum(b.real_pay), 2) ylfyze,
4 round(sum(case
5 when b.fund_id in ('001') then
6 b.real_pay
7 else
8 0
9 end),
10 2) tczc,
11 round(sum(case
12 when b.fund_id in ('201') then
13 b.real_pay
14 else
15 0
16 end),
17 2) zffy,
18 round(sum(case
19 when b.fund_id in ('003', '999') then
20 b.real_pay
21 else
22 0
23 end),
24 2) yyzf
25 from mt_biz_fin a, mt_pay_record_fin b
26 where a.hospital_id = b.hospital_id
27 and a.serial_no = b.serial_no
28 and a.valid_flag = ‘1’
29 and b.valid_flag = ‘1’
30 and a.biz_type = ‘12’
31 and a.pers_type in ('1', '2')
32 and b.hospital_id=4307000231
33 group by a.hospital_id
34 ;
no rows selected
Elapsed: 00:01:22.20
Execution Plan
----------------------------------------------------------
Plan hash value: 3673479381
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 61 | 127K (16)| 00:01:56 |
| 1 | SORT GROUP BY | | 1 | 61 | 127K (16)| 00:01:56 |
|* 2 | TABLE ACCESS BY INDEX ROWID| MT_BIZ_FIN | 1 | 30 | 1 (0)| 00:00:01 |
| 3 | NESTED LOOPS | | 45 | 2745 | 127K (16)| 00:01:56 |
|* 4 | TABLE ACCESS FULL | MT_PAY_RECORD_FIN | 8327 | 252K| 123K (16)| 00:01:53 |
|* 5 | INDEX RANGE SCAN | PK_MT_BIZ_FIN | 1 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"."BIZ_TYPE"='12’ AND "A"."VALID_FLAG"='1’ AND
("A"."PERS_TYPE"='1’ OR "A"."PERS_TYPE"='2’)
4 - filter(TO_NUMBER("B"."HOSPITAL_ID")=4307000231 AND "B"."VALID_FLAG"='1')
5 - access("A"."HOSPITAL_ID"="B"."HOSPITAL_ID" AND "A"."SERIAL_NO"="B"."SERIAL_NO")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
572386 consistent gets
383935 physical reads
0 redo size
638 bytes sent via SQL*Net to client
481 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
0 rows processed
从执行计划中可以看到在访问表MT_PAY_RECORD_FIN时使用的全表扫描,而在表MT_PAY_RECORD_FIN上存在索引PK_MT_PAY_RECORD_FIN(HOSPITAL_ID, SERIAL_NO)为什么没有使用该索引了,查询条件中的谓词条件是b.hospital_id=4307000231而从Predicate Information信息中的4 – filter(TO_NUMBER(“B”.”HOSPITAL_ID”)=4307000231
可知hospital_id在表中是字符型,而在书写查询条件时使用的是数字类型,这里CBO进行数据类型的隐式转换。所以使用不了索引。我们需要写成b.hospital_id=’4307000231′,修改后的SQL如下所示:
select a.hospital_id,
count(distinct a.serial_no) rc,
round(sum(b.real_pay), 2) ylfyze,
round(sum(case
when b.fund_id in ('001') then
b.real_pay
else
0
end),
2) tczc,
round(sum(case
when b.fund_id in ('201') then
b.real_pay
else
0
end),
2) zffy,
round(sum(case
when b.fund_id in ('003', '999') then
b.real_pay
else
0
end),
2) yyzf
from mt_biz_fin a, mt_pay_record_fin b
where a.hospital_id = b.hospital_id
and a.serial_no = b.serial_no
and a.valid_flag = '1'
and b.valid_flag = '1'
and a.biz_type = '12'
and a.pers_type in ('1', '2')
and b.hospital_id='4307000231'
group by a.hospital_id
来真实执行一次,现在能使用索引之后执行时间只要0.1秒
SQL> select a.hospital_id,
2 count(distinct a.serial_no) rc,
3 round(sum(b.real_pay), 2) ylfyze,
4 round(sum(case
5 when b.fund_id in ('001') then
6 b.real_pay
7 else
8 0
9 end),
10 2) tczc,
11 round(sum(case
12 when b.fund_id in ('201') then
13 b.real_pay
14 else
15 0
16 end),
17 2) zffy,
18 round(sum(case
19 when b.fund_id in ('003', '999') then
20 b.real_pay
21 else
22 0
23 end),
24 2) yyzf
25 from mt_biz_fin a, mt_pay_record_fin b
26 where a.hospital_id = b.hospital_id
27 and a.serial_no = b.serial_no
28 and a.valid_flag = '1'
29 and b.valid_flag = '1'
30 and a.biz_type = '12'
31 and a.pers_type in ('1', '2')
32 and b.hospital_id='4307000231'
33 group by a.hospital_id
34 ;
no rows selected
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 3142857175
------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 61 | 115 (1)| 00:00:01 |
| 1 | SORT GROUP BY | | 1 | 61 | 115 (1)| 00:00:01 |
|* 2 | TABLE ACCESS BY INDEX ROWID | MT_PAY_RECORD_FIN | 1 | 31 | 1 (0)| 00:00:01 |
| 3 | NESTED LOOPS | | 139 | 8479 | 115 (1)| 00:00:01 |
|* 4 | TABLE ACCESS BY INDEX ROWID| MT_BIZ_FIN | 139 | 4170 | 87 (2)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | INDI_MT_BIZ_FIN_H_F | 371 | | 19 (6)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | PK_MT_PAY_RECORD_FIN| 1 | | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("B"."VALID_FLAG"='1')
4 - filter("A"."VALID_FLAG"='1' AND ("A"."PERS_TYPE"='1' OR
A"."PERS_TYPE"='2'))
5 - access("A"."HOSPITAL_ID"='4307000231')
filter("A"."BIZ_TYPE"='12')
6 - access("B"."HOSPITAL_ID"='4307000231' AND "A"."SERIAL_NO"="B"."SERIAL_NO")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
203 consistent gets
0 physical reads
0 redo size
638 bytes sent via SQL*Net to client
481 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
0 rows processed
从上面的执行计划可以看到现在访问表MT_PAY_RECORD_FIN能正确使用索引PK_MT_PAY_RECORD_FIN,但这里CBO并不是先访问表MT_PAY_RECORD_FIN,这里执行了谓词传递,从Predicate Information 中的 5 – access(“A”.”HOSPITAL_ID”=’4307000231′)可知是先对索引INDI_MT_BIZ_FIN_H_F执行索引范围,但是在查询条件中并没有写a.hospital_id=’4307000231’这个条件,这就是谓词传递的结果,因为有b.hospital_id=’4307000231′ and a.hospital_id=b.hospital_id,所以CBO推导出a.hospital_id=’4307000231’。
在优化这个系统时发现好多类似这两种情况的SQL,都是因为在书写SQL语句时根本就没有注意字段的类型,不同的开发人员书写的SQL语句,有的人谓词数据类型书写正确,有的人谓词数据类型书写不正确。希望开发人员在书写SQL谓词条件时注意数据类型,一定要书写正确。
再次来访,希望每次都有新鲜感。