本案例旨在帮助理解以下要点:
1:索引跳跃扫描的应用:oracle在索引跳跃扫描时会扫描所有分支(目的是对前导列的所有distinct值做了遍历),然后对 branch block 中的每一个值,都会去扫描 leaf block(不管叶子节点是否是空块都会被扫描),适合于复合索引的前导列不同值个数很少的情况,否则其扫描成本非常高。
2:索引失效与表move之间的关系:表在move操作后,表中数据行的rowid发生了改变,index是通过rowid来fetch数据行的,所以表或者相关分区move后对应的索引会失效,但是如果move的表或者分区没有数据,此时相关表的索引或者local索引是不会失效的。
Sql_id: 3565np7ubh7hf
Prasing_schema_name: TBCS
Sql_text:
SELECT REGION, ELEREQ, CTEATEDATE, TASKSOURCE, INVOID
, INVDATE, INVCREATEDATE, NSRSBH, NSRMC, SIGN
, XHFSBH, XHFMC, XHFDZ, XHFDH, GHFMC
, GHFSBH, GHFQYLX, INVOPERID, INVOPERORGID, KPLX
, YFP_DM, YFP_HM, TSCHBZ, CZDM, CHYY
, KPHJJE, NOTE, ORDERID, ROLLORDERID, SENDTYPE
, MAILORTEL, EWM, FP_DM, FP_HM, STATUS
, RETCODE, RETDESP, ISRED, RELATEREQ, LINKREGION
, KPHM, KPZH, FPYWLX, SN, LASTTIME
, DSPTBM, KPXM, XHFYHZH, CHANNEL, GHF_SJ
, SKY, HJBHSJE, HJSE, FHR, RECTYPE
, GHF_DZ, GHF_GDDH, GHF_YHZH, XHF_DW, SUBREGION
, INVINFO
FROM (
SELECT TO_CHAR(A.REGION) AS REGION, TO_CHAR(A.ELEREQ) AS ELEREQ, TO_CHAR(A.CREATEDATE, 'yyyy-mm-dd hh24:mi:ss') AS CTEATEDATE, TO_CHAR(A.TASKSOURCE) AS TASKSOURCE, TO_CHAR(B.OID) AS INVOID
, TO_CHAR(B.INVDATE, 'yyyy-mm-dd hh24:mi:ss') AS INVDATE, TO_CHAR(B.CTEATEDATE, 'yyyy-mm-dd hh24:mi:ss') AS INVCREATEDATE, B.NSRSBH, B.NSRMC, TO_CHAR(B.SIGN) AS SIGN
, B.XHFSBH, B.XHFMC, B.XHFDZ, B.XHFDH, B.GHFMC
, B.GHFSBH, B.GHFQYLX, B.INVOPERID, B.INVOPERORGID, TO_CHAR(B.KPLX) AS KPLX
, TO_CHAR(B.YFP_DM) AS YFP_DM, TO_CHAR(B.YFP_HM) AS YFP_HM, B.TSCHBZ, B.CZDM, B.CHYY
, TO_CHAR(B.KPHJJE) AS KPHJJE, B.NOTE, TO_CHAR(B.ORDERID) AS ORDERID, TO_CHAR(B.ROLLORDERID) AS ROLLORDERID, TO_CHAR(B.SENDTYPE) AS SENDTYPE
, B.MAILORTEL, B.EWM, B.FP_DM, B.FP_HM, TO_CHAR(B.STATUS) AS STATUS
, TO_CHAR(B.RETCODE) AS RETCODE, B.RETDESP, TO_CHAR(B.ISRED) AS ISRED, TO_CHAR(B.RELATEREQ) AS RELATEREQ, TO_CHAR(B.LINKREGION) AS LINKREGION
, TO_CHAR(B.KPHM) AS KPHM, TO_CHAR(B.KPZH) AS KPZH, TO_CHAR(B.FPYWLX) AS FPYWLX, TO_CHAR(B.SN) AS SN, TO_CHAR(B.LASTTIME, 'yyyy-mm-dd hh24:mi:ss') AS LASTTIME
, B.DSPTBM, B.KPXM, B.XHFYHZH, B.CHANNEL, B.GHF_SJ
, B.SKY, TO_CHAR(B.HJBHSJE) AS HJBHSJE, TO_CHAR(B.HJSE) AS HJSE, B.FHR, B.RECTYPE
, B.GHF_DZ, B.GHF_GDDH, B.GHF_YHZH, B.XHF_DW, B.SUBREGION
, B.INVINFO
FROM AR_INV_ELETAXTASK A, AR_INV_ELETAXINVINFO B
WHERE A.ELEREQ = B.ELEREQ
AND A.REGION = :1
AND B.REGION = :2
AND A.STATUS = :3
ORDER BY A.CREATEDATE
) T
WHERE ROWNUM <= :4
Sql说明:驱动表AR_INV_ELETAXTASK中走了索引跳跃扫描,而索引跳跃扫描需要对所有的distinct前导列进行一次遍历.因此索引跳跃扫描比较适合前导列不同值很少的情况。因此判断该执行计划是否合理的标准,可以从前导列的唯一性进行下手。
----------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 76 (100)| | | |
| 1 | COUNT STOPKEY | | | | | | | |
| 2 | VIEW | | 3 | 23568 | 76 (2)| 00:00:01 | | |
| 3 | SORT ORDER BY STOPKEY | | 3 | 7236 | 76 (2)| 00:00:01 | | |
| 4 | NESTED LOOPS | | 3 | 7236 | 75 (0)| 00:00:01 | | |
| 5 | NESTED LOOPS | | 3 | 7236 | 75 (0)| 00:00:01 | | |
| 6 | PARTITION LIST SINGLE | | 3 | 87 | 6 (0)| 00:00:01 | KEY | KEY |
| 7 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| AR_INV_ELETAXTASK | 3 | 87 | 6 (0)| 00:00:01 | KEY | KEY |
| 8 | INDEX SKIP SCAN | IDX_ELETAXTASK_REQ | 2 | | 3 (0)| 00:00:01 | KEY | KEY |
| 9 | PARTITION RANGE ALL | | 1 | | 22 (0)| 00:00:01 | 1 | 11 |
| 10 | PARTITION LIST SINGLE | | 1 | | 22 (0)| 00:00:01 | KEY | KEY |
| 11 | INDEX RANGE SCAN | IDX_ELETAXINFO_REQ | 1 | | 22 (0)| 00:00:01 | KEY | KEY |
| 12 | TABLE ACCESS BY LOCAL INDEX ROWID | AR_INV_ELETAXINVINFO | 1 | 2383 | 23 (0)| 00:00:01 | 1 | 1 |
----------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=:4)
3 - filter(ROWNUM<=:4)
7 - filter("A"."STATUS"=TO_NUMBER(:3))
8 - access("A"."REGION"=TO_NUMBER(:1))
11 - access("A"."ELEREQ"="B"."ELEREQ")
Sql执行信息说明:单次执行平均消耗逻辑读67000,返回一行。逻辑读相对很多
BEGIN_TIME INSTANCE_NUMBER MODULE PLAN_HASH_VALUE EXEC PER_GET PER_ROWS TIME_MS PER_READ
------------------- --------------- --------------------------------------------- --------------- ---------- ---------- ---------- ---------- ----------
2017-04-25 16:15:02 1 2052812710 1857 67286 1.208 999.18 0
2017-04-25 16:15:02 1 2569625873 0 0 0 0 0
2017-04-25 16:00:56 1 2052812710 1709 67245 1.154 1002.17 0
2017-04-25 16:00:56 1 2569625873 0 0 0 0 0
2017-04-25 15:45:49 1 2052812710 1887 67252 1.259 996.53 0
……
表统计信息:只有20个数据块,表中数据为0行,IDX_ELETAXTASK_REQ索引为ELEREQ+REGION从业务上了解到前导列字段ELEREQ唯一性很好
Table Number Empty Chain Average Global Sample Date
Name of Rows Blocks Blocks Count Row Len Stats Size MM-DD-YYYY
------------------------------ -------------- --------------- ------------ -------- ------- ------ -------------- ----------
AR_INV_ELETAXTASK 0 0,20 0 0 0 YES 0 04-25-2017
Column Distinct Number Number Sample Date
Name Values Density Buckets Nulls Size MM-DD-YYYY
------------------------------ ------------ ----------- ------- ------------ -------------- ----------
SENDTYPE 0 .00000000 0 0 04-25-2017
MAILORTEL 0 .00000000 0 0 04-25-2017
REGION 0 .00000000 0 0 04-25-2017
ELEREQ 0 .00000000 0 0 04-25-2017
STATUS 0 .00000000 0 0 04-25-2017
CREATEDATE 0 .00000000 0 0 04-25-2017
TASKSOURCE 0 .00000000 0 0 04-25-2017
……
------------------------------ ------------------------------ ---- ------------------------
IDX_ELETAXTASK_REQ ELEREQ 1 NUMBER(20,0) NOT NULL
REGION 2 NUMBER(5,0) NOT NULL
表当前时间段大小为30个数据块,不足1M
sys@CRMDB4>@size
Enter value for segment_name: AR_INV_ELETAXTASK
OWNER SEGMENT_NAME PARTITION_NAME TABLESPACE_NAME SIZE_M BLOCKS
--------------- ----------------------------- ------------------------------ ----------------------------- ------------ ------------
TBCS AR_INV_ELETAXTASK P_L_99 D_CUST_02 0 8
TBCS AR_INV_ELETAXTASK P_L_11 D_CUST_01 0 24
索引段当前信息:索引段大小:532M,数据块约68000个,由于之前存在大量的dml操作,同时该索引长时间没有重建导致存在大量的索引空块,类似表的高水位现象。
sys@CRMDB4>@size
Enter value for segment_name: IDX_ELETAXTASK_REQ
OWNER SEGMENT_NAME PARTITION_NAME TABLESPACE_NAME SIZE_M BLOCKS
--------------- ----------------------------- ------------------------------ ----------------------------- ------------ ------------
TBCS IDX_ELETAXTASK_REQ P_L_99 I_CUST_02 0 8
TBCS IDX_ELETAXTASK_REQ P_L_11 I_CUST_01 532 68,096
索引LAST_DDL_TIME:2016-07-16 19:02:53,表示该索引很久未重建过
SQL> select created ,last_ddl_time from dba_objects where object_name='IDX_ELETAXTASK_REQ';
CREATED LAST_DDL_TIME
------------------- -------------------
2016-07-03 14:01:13 2016-07-16 19:02:53
表AR_INV_ELETAXTASK在2017年2月21日做了move操作,降低高水位。
select ddl_sql,ddl_time from dbmgr.stats$ddl_audit where obj_name='AR_INV_ELETAXTASK';
--------------------------------------------------------------------------------------
alter table AR_INV_ELETAXTASK move partiton P_L_99 PARALLEL 4 2017年2月21日
MOVE表和索引失效的关系:
表在move操作后,表中数据行的rowid发生了改变,index是通过rowid来fetch数据行的,所以表或者相关分区move后对应的索引会失效,但是如果move的表或者分区没有数据,此时相关表的索引或者local索引是不会失效的。
表AR_INV_ELETAXTASK在move后,索引IDX_ELETAXTASK_REQ并没有失效(如果失效就不会走),此时运维人员查询索引发现并没有失效,也就没有对相关索引进行重建,但是这些索引由于之前的DML插入和删除产生了较多的空块,这个类似表的高水位线现象,正是由于move表维护索引没有失效,也就没有重建索引,在进行索引跳跃索引、全索引快速扫描时会把索引中的空块也都进行读取,造成io很高,消耗了很多数据库资源,因此对于这种定期高水位线维护的表也一定要进行索引重建,哪怕这个索引并没有失效。
总结建议:对表AR_INV_ELETAXTASK的索引IDX_ELETAXTASK_REQ进行重建。
索引重建后优化效果:单次执行逻辑读约由67000多降低为42
索引跳跃扫描的优化
扫描方式:oracle在索引跳跃扫描时会扫描所有分支(目的是对前导列的所有distinct值做了遍历),然后对 branch block 中的每一个值,都会去扫描 leaf block(不管叶子节点是否是空块都会被扫描),适合于复合索引的前导列不同值个数很少的情况,否则其扫描成本非常高
索引跳跃扫描的使用场景实验
一、构造测试数据
----------------------------------------------------------
create table test_a as select * from dba_objects
select count(*) from test_a
insert into test_a select * from dba_objects
exec dbms_stats.gather_table_stats(user,'TEST_A')
-------------------------------------------------------
二、构造两个不同的组合索引并使sql语句走索引跳跃扫描
a:索引前导列唯一性很好
create index idex_id_id on test_a(object_id,data_object_id);
alter session set statistics_level=all ;
select /*+ index(d idex_id_id) monitor*/*
from test_a d
where d.data_object_id = 87110;
select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
-------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
-------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 5 |00:00:01.15 | 2572 | 2071 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST_A | 1 | 5 | 5 |00:00:01.15 | 2572 | 2071 |
|* 2 | INDEX SKIP SCAN | IDEX_ID_ID | 1 | 5 | 5 |00:00:01.15 | 2567 | 2071 |
-------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("D"."DATA_OBJECT_ID"=87110)
filter("D"."DATA_OBJECT_ID"=87110)
查询转换等价为:
select /*+ index(d idex_id_id2) monitor*/*
from test_a d
where d.data_object_id = 87110 and object_id='1'
union all
select /*+ index(d idex_id_id2) monitor*/*
from test_a d
where d.data_object_id = 87110 and object_id='2'
union all
......
select /*+ index(d idex_id_id2) monitor*/*
from test_a d
where d.data_object_id = 87110 and object_id='9999'
结论:索引跳跃扫描时把前导列的distinct值全部遍历一次,因此在前导列不同值很多的时候,每一个前导列的分支都要进行遍历,造成io会很高,逻辑读也相应很高。以上测试中逻辑读为2572。
b:索引前导列唯一性不好
create index idex_id_id2 on test_a(status,data_object_id);
alter session set statistics_level=all ;
select /*+ index(d idex_id_id) monitor*/*
from test_a d
where d.data_object_id = 87110;
select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 5 |00:00:00.05 | 14 | 6 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST_A | 1 | 5 | 5 |00:00:00.05 | 14 | 6 |
|* 2 | INDEX SKIP SCAN | IDEX_ID_ID2 | 1 | 5 | 5 |00:00:00.05 | 9 | 6 |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("D"."DATA_OBJECT_ID"=87110)
filter("D"."DATA_OBJECT_ID"=87110)
查询转换等价为:
select /*+ index(d idex_id_id2) monitor*/*
from test_a d
where d.data_object_id = 87110 and status='a'
union all
select /*+ index(d idex_id_id2) monitor*/*
from test_a d
where d.data_object_id = 87110 and status='b'
结论:前导列不同值较少,前导列遍历次数很少,逻辑读也相应降低。io相应降低,逻辑读为14.
总结:索引跳跃扫描中oracle 需要对所有的distinct 前导列进行一次遍历,因此跳跃扫描比较适合前导列不同值很少的情况。