最近在优化某系统中发现许多SQL语句在书写谓词条件(wheret条件)时完全不根据表结构定义的字段数据类型来,而是随意书写谓词条件,这样造成原来能走正确索引的结果不能使用该索引,其结果就是查询语句的性能很差,这里将我所遇到的两种情况介绍一下.
第一种情况是谓词条件进行了数据类型的转换转换使得CBO无法使用索引:
其SQL语句如下所示,该SQL的功能是统计一年社保中心一年内由于各种伤害或骨折所发生的医疗费用
select a.hospital_id, c.hospital_name, count(distinct a.serial_no) rc, round(sum(b.real_pay), 2) ylfyze, round(sum(case when b.fund_id in ('001') then b.real_pay else 0 end), 2) tczc, round(sum(case when b.fund_id in ('201') then b.real_pay else 0 end), 2) zffy, round(sum(case when b.fund_id in ('003', '999') then b.real_pay else 0 end), 2) yyzf from mt_biz_fin a, mt_pay_record_fin b, bs_hospital c, bs_disease d where a.hospital_id = b.hospital_id and a.serial_no = b.serial_no and a.hospital_id = c.hospital_id and a.fin_disease = d.icd and d.center_id = a.center_id and a.valid_flag = 1 and b.valid_flag = 1 and a.biz_type = 12 and a.pers_type in (1, 2) and (d.disease like '%伤%' or d.disease like '%骨折%') and a.center_id = '430740' and to_char(a.fin_date, 'yyyymmdd') >= '20140101' and to_char(a.fin_date, 'yyyymmdd') < = '20141231' group by a.hospital_id, c.hospital_name order by a.hospital_id
上述SQL执行情况如下,其执行时间为4分40秒
SQL> set timing on SQL> set autotrace traceonly SQL> select c.hospital_id, 2 c.hospital_name, 3 count(distinct a.serial_no) rc, 4 round(sum(b.real_pay), 2) ylfyze, 5 round(sum(case 6 when b.fund_id in ('001') then 7 b.real_pay 8 else 9 0 10 end), 11 2) tczc, 12 round(sum(case 13 when b.fund_id in ('201') then 14 b.real_pay 15 else 16 0 17 end), 18 2) zffy, 19 round(sum(case 20 when b.fund_id in ('003', '999') then 21 b.real_pay 22 else 23 0 24 end), 25 2) yyzf 26 from mt_biz_fin a, mt_pay_record_fin b, bs_hospital c, bs_disease d 27 where a.hospital_id = b.hospital_id 28 and a.serial_no = b.serial_no 29 and a.hospital_id = c.hospital_id 30 and a.fin_disease = d.icd 31 and d.center_id = a.center_id 32 and a.valid_flag = 1 33 and b.valid_flag = 1 34 and a.biz_type = 12 35 and a.pers_type in (1, 2) 36 and (d.disease like '%伤%' or d.disease like '%骨折%') 37 and a.center_id = '430740' 38 and to_char(a.fin_date, 'yyyymmdd') >= '20140101' 39 and to_char(a.fin_date, 'yyyymmdd') < = '20141231' 40 group by c.hospital_id, c.hospital_name 41 order by c.hospital_id 42 ; Elapsed: 00:04:39.59 Execution Plan ---------------------------------------------------------- Plan hash value: 1467084556 --------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 148 | 4254 (20)| 00:00:04 | | 1 | SORT GROUP BY | | 1 | 148 | 4254 (20)| 00:00:04 | |* 2 | TABLE ACCESS BY INDEX ROWID | MT_PAY_RECORD_FIN | 1 | 31 | 1 (0)| 00:00:01 | | 3 | NESTED LOOPS | | 1 | 148 | 4252 (20)| 00:00:04 | | 4 | NESTED LOOPS | | 1 | 117 | 4251 (20)| 00:00:04 | | 5 | NESTED LOOPS | | 3 | 252 | 4250 (20)| 00:00:04 | | 6 | INDEX FULL SCAN | IDX_BS_HOSPITAL_NAME | 1227 | 39264 | 2 (0)| 00:00:01 | |* 7 | TABLE ACCESS BY INDEX ROWID| MT_BIZ_FIN | 1 | 52 | 3 (0)| 00:00:01 | |* 8 | INDEX RANGE SCAN | PK_MT_BIZ_FIN | 1 | | 3 (0)| 00:00:01 | |* 9 | TABLE ACCESS BY INDEX ROWID | BS_DISEASE | 1 | 33 | 1 (0)| 00:00:01 | |* 10 | INDEX RANGE SCAN | INX_BS_DISEASE_01 | 1 | | 1 (0)| 00:00:01 | |* 11 | INDEX RANGE SCAN | I_MT_PAY_RECORD_FIN_1| 1 | | 1 (0)| 00:00:01 | --------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter(TO_NUMBER("B"."VALID_FLAG")=1) 7 - filter(TO_NUMBER("A"."VALID_FLAG")=1 AND (TO_NUMBER("A"."PERS_TYPE")=1 OR TO_NUMBER("A"."PERS_TYPE")=2)) 8 - access("A"."HOSPITAL_ID"="C"."HOSPITAL_ID" AND "A"."CENTER_ID"='430740') filter("A"."CENTER_ID"='430740' AND TO_NUMBER("A"."BIZ_TYPE")=12 AND TO_CHAR(INTERNAL_FUNCTION("A"."FIN_DATE"),'yyyymmdd')>='20140101' AND TO_CHAR(INTERNAL_FUNCTION("A"."FIN_DATE"),'yyyymmdd')< ='20141231') 9 - filter("D"."DISEASE" LIKE '%伤%' OR "D"."DISEASE" LIKE '%骨折%') 10 - access("D"."CENTER_ID"='430740' AND "A"."FIN_DISEASE"="D"."ICD") 11 - access("A"."HOSPITAL_ID"="B"."HOSPITAL_ID" AND "A"."SERIAL_NO"="B"."SERIAL_NO") Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 161233 consistent gets 83048 physical reads 624 redo size 1197 bytes sent via SQL*Net to client 492 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 5 rows processed
上述SQL对于表BS_HOSPITAL只查询了hospital_name列,而在BS_HOSPITAL表中存在索引IDX_BS_HOSPITAL_NAME(hospital_name,hospital_id)所以首先对IDX_BS_HOSPITAL_NAME索引全扫描这样就不用再回表查询从索引中就是得到hospital_name列的值作为结果集1。再通过对MT_BIZ_FIN表执行索引(PK_MT_BIZ_FIN)范围扫描,再回表查询返回其记录作为结果集2,再以结果集1作为驱动表进行嵌套循环连接。再与表BS_DISEASE,I_MT_PAY_RECORD_FIN_1执行嵌套循环连接,再执行分组排序。其实在MT_BIZ_FIN表中存在复合索引INDI_MT_BIZ_FIN_F_H(FIN_DATE,HOSPITAL_ID,BIZ_TYPE, TREATMENT_TYPE, CENTER_ID),而查询条件中用到了find_date,hospital_id,biz_type,center_id,只是这里因为谓词条件中对于fin_date条件是to_char(a.fin_date, 'yyyymmdd') >= '20140101' and to_char(a.fin_date, 'yyyymmdd') < = '20141231',而fin_date(费用完成时间)是日期类型,这里将find_date转换成字符型所以没有办法使用索引INDI_MT_BIZ_FIN_F_H。 将to_char(a.fin_date, 'yyyymmdd') >= '20140101' and to_char(a.fin_date, 'yyyymmdd') < = '20141231'条件改写成 a.fin_date between to_date('20140101','yyyymmdd') and to_date('20141231','yyyymmdd') ,改写后其SQL语句如下所示:
select c.hospital_id, c.hospital_name, count(distinct a.serial_no) rc, round(sum(b.real_pay), 2) ylfyze, round(sum(case when b.fund_id in ('001') then b.real_pay else 0 end), 2) tczc, round(sum(case when b.fund_id in ('201') then b.real_pay else 0 end), 2) zffy, round(sum(case when b.fund_id in ('003', '999') then b.real_pay else 0 end), 2) yyzf from mt_biz_fin a, mt_pay_record_fin b, bs_hospital c, bs_disease d where a.hospital_id = b.hospital_id and a.serial_no = b.serial_no and a.hospital_id = c.hospital_id and a.fin_disease = d.icd and d.center_id = a.center_id and a.valid_flag = 1 and b.valid_flag = 1 and a.biz_type = 12 and a.pers_type in (1, 2) and (d.disease like '%伤%' or d.disease like '%骨折%') and a.center_id = '430740' and a.fin_date between to_date('20140101','yyyymmdd') and to_date('20141231','yyyymmdd') group by c.hospital_id, c.hospital_name order by c.hospital_id
来实际执行一次,其执行结果如下所示,现在执行时间稳定在1-2秒之间,能满足客户要求。
SQL> set autotrace traceonly SQL> select c.hospital_id, 2 c.hospital_name, 3 count(distinct a.serial_no) rc, 4 round(sum(b.real_pay), 2) ylfyze, 5 round(sum(case 6 when b.fund_id in ('001') then 7 b.real_pay 8 else 9 0 10 end), 11 2) tczc, 12 round(sum(case 13 when b.fund_id in ('201') then 14 b.real_pay 15 else 16 0 17 end), 18 2) zffy, 19 round(sum(case 20 when b.fund_id in ('003', '999') then 21 b.real_pay 22 else 23 0 24 end), 25 2) yyzf 26 from mt_biz_fin a, mt_pay_record_fin b, bs_hospital c, bs_disease d 27 where a.hospital_id = b.hospital_id 28 and a.serial_no = b.serial_no 29 and a.hospital_id = c.hospital_id 30 and a.fin_disease = d.icd 31 and d.center_id = a.center_id 32 and a.valid_flag = 1 33 and b.valid_flag = 1 34 and a.biz_type = 12 35 and a.pers_type in (1, 2) 36 and (d.disease like '%伤%' or d.disease like '%骨折%') 37 and a.center_id = '430740' 38 and a.fin_date between to_date('20140101','yyyymmdd') and to_date('20141231','yyyymmdd') 39 group by c.hospital_id, c.hospital_name 40 order by c.hospital_id 41 ; Elapsed: 00:00:01.02 Execution Plan ---------------------------------------------------------- Plan hash value: 1467084556 --------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 17 | 2516 | 1529 (15)| 00:00:02 | | 1 | SORT GROUP BY | | 17 | 2516 | 1529 (15)| 00:00:02 | |* 2 | TABLE ACCESS BY INDEX ROWID | MT_PAY_RECORD_FIN | 1 | 31 | 1 (0)| 00:00:01 | | 3 | NESTED LOOPS | | 17 | 2516 | 1528 (15)| 00:00:02 | | 4 | NESTED LOOPS | | 33 | 3861 | 1521 (15)| 00:00:02 | | 5 | NESTED LOOPS | | 354 | 29736 | 1450 (16)| 00:00:02 | | 6 | INDEX FULL SCAN | IDX_BS_HOSPITAL_NAME | 1227 | 39264 | 2 (0)| 00:00:01 | |* 7 | TABLE ACCESS BY INDEX ROWID| MT_BIZ_FIN | 1 | 52 | 1 (0)| 00:00:01 | |* 8 | INDEX RANGE SCAN | INDI_MT_BIZ_FIN_F_H | 1 | | 1 (0)| 00:00:01 | |* 9 | TABLE ACCESS BY INDEX ROWID | BS_DISEASE | 1 | 33 | 1 (0)| 00:00:01 | |* 10 | INDEX RANGE SCAN | INX_BS_DISEASE_01 | 1 | | 1 (0)| 00:00:01 | |* 11 | INDEX RANGE SCAN | I_MT_PAY_RECORD_FIN_1| 1 | | 1 (0)| 00:00:01 | --------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter(TO_NUMBER("B"."VALID_FLAG")=1) 7 - filter(TO_NUMBER("A"."VALID_FLAG")=1 AND (TO_NUMBER("A"."PERS_TYPE")=1 OR TO_NUMBER("A"."PERS_TYPE")=2)) 8 - access("A"."HOSPITAL_ID"="C"."HOSPITAL_ID" AND "A"."FIN_DATE">=TO_DATE(' 2014-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "A"."CENTER_ID"='430740' AND "A"."FIN_DATE"< =TO_DATE(' 2014-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) filter("A"."CENTER_ID"='430740' AND TO_NUMBER("A"."BIZ_TYPE")=12) 9 - filter("D"."DISEASE" LIKE '%伤%' OR "D"."DISEASE" LIKE '%骨折%') 10 - access("D"."CENTER_ID"='430740' AND "A"."FIN_DISEASE"="D"."ICD") 11 - access("A"."HOSPITAL_ID"="B"."HOSPITAL_ID" AND "A"."SERIAL_NO"="B"."SERIAL_NO") Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 71411 consistent gets 0 physical reads 0 redo size 1197 bytes sent via SQL*Net to client 492 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 5 rows processed
第二种谓词条件的数据类型隐式转换无法使用索引的情况,其原始SQL语句如下所示,查询一个医疗机构的费用支出情况
select a.hospital_id, count(distinct a.serial_no) rc, round(sum(b.real_pay), 2) ylfyze, round(sum(case when b.fund_id in ('001') then b.real_pay else 0 end), 2) tczc, round(sum(case when b.fund_id in ('201') then b.real_pay else 0 end), 2) zffy, round(sum(case when b.fund_id in ('003', '999') then b.real_pay else 0 end), 2) yyzf from mt_biz_fin a, mt_pay_record_fin b where a.hospital_id = b.hospital_id and a.serial_no = b.serial_no and a.valid_flag = '1' and b.valid_flag = '1' and a.biz_type = '12' and a.pers_type in ('1', '2') and b.hospital_id=4307000231 group by a.hospital_id
该SQL的执行计划如下所示,执行了1分22秒:
SQL> set autotrace traceonly SQL> select a.hospital_id, 2 count(distinct a.serial_no) rc, 3 round(sum(b.real_pay), 2) ylfyze, 4 round(sum(case 5 when b.fund_id in ('001') then 6 b.real_pay 7 else 8 0 9 end), 10 2) tczc, 11 round(sum(case 12 when b.fund_id in ('201') then 13 b.real_pay 14 else 15 0 16 end), 17 2) zffy, 18 round(sum(case 19 when b.fund_id in ('003', '999') then 20 b.real_pay 21 else 22 0 23 end), 24 2) yyzf 25 from mt_biz_fin a, mt_pay_record_fin b 26 where a.hospital_id = b.hospital_id 27 and a.serial_no = b.serial_no 28 and a.valid_flag = ‘1’ 29 and b.valid_flag = ‘1’ 30 and a.biz_type = ‘12’ 31 and a.pers_type in ('1', '2') 32 and b.hospital_id=4307000231 33 group by a.hospital_id 34 ; no rows selected Elapsed: 00:01:22.20 Execution Plan ---------------------------------------------------------- Plan hash value: 3673479381 -------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 61 | 127K (16)| 00:01:56 | | 1 | SORT GROUP BY | | 1 | 61 | 127K (16)| 00:01:56 | |* 2 | TABLE ACCESS BY INDEX ROWID| MT_BIZ_FIN | 1 | 30 | 1 (0)| 00:00:01 | | 3 | NESTED LOOPS | | 45 | 2745 | 127K (16)| 00:01:56 | |* 4 | TABLE ACCESS FULL | MT_PAY_RECORD_FIN | 8327 | 252K| 123K (16)| 00:01:53 | |* 5 | INDEX RANGE SCAN | PK_MT_BIZ_FIN | 1 | | 1 (0)| 00:00:01 | -------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("A"."BIZ_TYPE"='12’ AND "A"."VALID_FLAG"='1’ AND ("A"."PERS_TYPE"='1’ OR "A"."PERS_TYPE"='2’) 4 - filter(TO_NUMBER("B"."HOSPITAL_ID")=4307000231 AND "B"."VALID_FLAG"='1') 5 - access("A"."HOSPITAL_ID"="B"."HOSPITAL_ID" AND "A"."SERIAL_NO"="B"."SERIAL_NO") Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 572386 consistent gets 383935 physical reads 0 redo size 638 bytes sent via SQL*Net to client 481 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 0 rows processed
从执行计划中可以看到在访问表MT_PAY_RECORD_FIN时使用的全表扫描,而在表MT_PAY_RECORD_FIN上存在索引PK_MT_PAY_RECORD_FIN(HOSPITAL_ID, SERIAL_NO)为什么没有使用该索引了,查询条件中的谓词条件是b.hospital_id=4307000231而从Predicate Information信息中的4 – filter(TO_NUMBER(“B”.”HOSPITAL_ID”)=4307000231
可知hospital_id在表中是字符型,而在书写查询条件时使用的是数字类型,这里CBO进行数据类型的隐式转换。所以使用不了索引。我们需要写成b.hospital_id=’4307000231′,修改后的SQL如下所示:
select a.hospital_id, count(distinct a.serial_no) rc, round(sum(b.real_pay), 2) ylfyze, round(sum(case when b.fund_id in ('001') then b.real_pay else 0 end), 2) tczc, round(sum(case when b.fund_id in ('201') then b.real_pay else 0 end), 2) zffy, round(sum(case when b.fund_id in ('003', '999') then b.real_pay else 0 end), 2) yyzf from mt_biz_fin a, mt_pay_record_fin b where a.hospital_id = b.hospital_id and a.serial_no = b.serial_no and a.valid_flag = '1' and b.valid_flag = '1' and a.biz_type = '12' and a.pers_type in ('1', '2') and b.hospital_id='4307000231' group by a.hospital_id
来真实执行一次,现在能使用索引之后执行时间只要0.1秒
SQL> select a.hospital_id, 2 count(distinct a.serial_no) rc, 3 round(sum(b.real_pay), 2) ylfyze, 4 round(sum(case 5 when b.fund_id in ('001') then 6 b.real_pay 7 else 8 0 9 end), 10 2) tczc, 11 round(sum(case 12 when b.fund_id in ('201') then 13 b.real_pay 14 else 15 0 16 end), 17 2) zffy, 18 round(sum(case 19 when b.fund_id in ('003', '999') then 20 b.real_pay 21 else 22 0 23 end), 24 2) yyzf 25 from mt_biz_fin a, mt_pay_record_fin b 26 where a.hospital_id = b.hospital_id 27 and a.serial_no = b.serial_no 28 and a.valid_flag = '1' 29 and b.valid_flag = '1' 30 and a.biz_type = '12' 31 and a.pers_type in ('1', '2') 32 and b.hospital_id='4307000231' 33 group by a.hospital_id 34 ; no rows selected Elapsed: 00:00:00.01 Execution Plan ---------------------------------------------------------- Plan hash value: 3142857175 ------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 61 | 115 (1)| 00:00:01 | | 1 | SORT GROUP BY | | 1 | 61 | 115 (1)| 00:00:01 | |* 2 | TABLE ACCESS BY INDEX ROWID | MT_PAY_RECORD_FIN | 1 | 31 | 1 (0)| 00:00:01 | | 3 | NESTED LOOPS | | 139 | 8479 | 115 (1)| 00:00:01 | |* 4 | TABLE ACCESS BY INDEX ROWID| MT_BIZ_FIN | 139 | 4170 | 87 (2)| 00:00:01 | |* 5 | INDEX RANGE SCAN | INDI_MT_BIZ_FIN_H_F | 371 | | 19 (6)| 00:00:01 | |* 6 | INDEX RANGE SCAN | PK_MT_PAY_RECORD_FIN| 1 | | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("B"."VALID_FLAG"='1') 4 - filter("A"."VALID_FLAG"='1' AND ("A"."PERS_TYPE"='1' OR A"."PERS_TYPE"='2')) 5 - access("A"."HOSPITAL_ID"='4307000231') filter("A"."BIZ_TYPE"='12') 6 - access("B"."HOSPITAL_ID"='4307000231' AND "A"."SERIAL_NO"="B"."SERIAL_NO") Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 203 consistent gets 0 physical reads 0 redo size 638 bytes sent via SQL*Net to client 481 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 0 rows processed
从上面的执行计划可以看到现在访问表MT_PAY_RECORD_FIN能正确使用索引PK_MT_PAY_RECORD_FIN,但这里CBO并不是先访问表MT_PAY_RECORD_FIN,这里执行了谓词传递,从Predicate Information 中的 5 – access(“A”.”HOSPITAL_ID”=’4307000231′)可知是先对索引INDI_MT_BIZ_FIN_H_F执行索引范围,但是在查询条件中并没有写a.hospital_id=’4307000231’这个条件,这就是谓词传递的结果,因为有b.hospital_id=’4307000231′ and a.hospital_id=b.hospital_id,所以CBO推导出a.hospital_id=’4307000231’。
在优化这个系统时发现好多类似这两种情况的SQL,都是因为在书写SQL语句时根本就没有注意字段的类型,不同的开发人员书写的SQL语句,有的人谓词数据类型书写正确,有的人谓词数据类型书写不正确。希望开发人员在书写SQL谓词条件时注意数据类型,一定要书写正确。
再次来访,希望每次都有新鲜感。