Unexpected DG Interruption

In the early hours a customer called to say their Data Guard replication had stopped. Logging in, MRP0 was indeed waiting for log, and the error messages confirmed that some logs had not been shipped. Since only a few were missing, the plan was to copy them over from the primary and worry about root cause later. Then it turned out the clusterware itself was inaccessible. The cluster log read:

2019-03-07 22:50:10.635: 
[crflogd(11206894)]CRS-9520:The storage of Grid Infrastructure Management Repository is 99% full. The storage location is '/g01/app/oracle/product/11.2.0/grid/crf/db/ptdb01'.
2019-03-07 22:50:25.513: 
[crflogd(26542192)]CRS-9520:The storage of Grid Infrastructure Management Repository is 99% full. The storage location is '/g01/app/oracle/product/11.2.0/grid/crf/db/ptdb01'.
2019-03-07 22:50:34.580: 
[crflogd(26476864)]CRS-9520:The storage of Grid Infrastructure Management Repository is 99% full. The storage location is '/g01/app/oracle/product/11.2.0/grid/crf/db/ptdb01'.
2019-03-07 23:02:39.839: 
[crflogd(10879168)]CRS-9520:The storage of Grid Infrastructure Management Repository is 99% full. The storage location is '/g01/app/oracle/product/11.2.0/grid/crf/db/ptdb01'.

After cleaning up the listener logs to free space, the cluster came back to normal and the standby caught up with the archived logs on its own.
ptdb01:/g01/app/oracle/product/11.2.0/grid/log/diag/tnslsnr/ptdb01/listener_scan1/alert$df -g
Filesystem GB blocks Free %Used Iused %Iused Mounted on
/dev/hd4 35.00 27.99 21% 8843 1% /
/dev/hd2 10.00 7.40 27% 55014 4% /usr
/dev/hd9var 10.00 4.41 56% 9055 1% /var
/dev/hd3 25.00 7.49 71% 14698 1% /tmp
/dev/hd1 1.00 0.45 56% 2635 3% /home
/dev/hd11admin 0.50 0.50 1% 5 1% /admin
/proc - - - - - /proc
/dev/hd10opt 5.00 4.64 8% 10846 1% /opt
/dev/livedump 0.50 0.50 1% 4 1% /var/adm/ras/livedump
/dev/lv00 0.50 0.48 4% 18 1% /var/adm/csd
/dev/g01lv 20.00 0.94 96% 62696 18% /g01
/dev/u01lv 30.00 9.38 69% 903801 25% /u01
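
For reference, the standby-side state described above (MRP0 waiting for log, a gap in the shipped logs) can be confirmed with queries like these; a sketch against standard views, not output from the incident:

-- MRP0 state and the sequence it is waiting on
select process, status, thread#, sequence# from v$managed_standby where process like 'MRP%';
-- any detected archive-log gap
select thread#, low_sequence#, high_sequence# from v$archive_gap;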

Handling OGG-01296

The OGG replicat repa had abended.

view report xxxx showed:

                          : [/lib64/libc.so.6(__libc_start_main+0xfd) [0x3a2b61ed1d]]
                          : [/home/ogg/replicat(__gxx_personality_v0+0x2fa) [0x4bbd6a]]

2019-03-03 22:22:44  ERROR   OGG-01296  Error mapping from DAOXUN.GDN to DAOXUN.GDN.

***********************************************************************
*                   ** Run Time Statistics **                         *
***********************************************************************

The discard file showed the failure was caused by a trigger (the ORA-20001 text below is the application saying that return orders must not be modified):

Current time: 2019-03-03 22:22:44
Discarded record from action ABEND on error 20001

OCI Error ORA-20001: 退货不允许修改
ORA-06512: at "DAOXUN.GDN_PG", line 33
ORA-04088: error during execution of trigger 'DAOXUN.GDN_PG' (status = 20001). UPDATE "DAOXUN"."GDN" SET "DELIV_MODE" = :a2,"FSCL_DELIV_MODE" = :a3,"RCV_UNIT_ID" = :a4,"RCV_WAREH_ID" = :a5,"RCV_FSCL_UNIT_ID" = :a6,"END_UNIT_ID" = :a7,"END_WAREH_ID" = :a8,"DELIV_MTHD" = :a9,"DELIV_PSTD" = :a10,"DELIV_ADDR" = :a11,"PICK_REQD" = :a12,"RCK_REQD" = :a13,"VEH_REQD" = :a14,"VWC_NUM" = :a15,"CNSN_REQD" = :a16,"CSB_NUM" = :a17,"FTCHR_ID" = :a18,"FTCH_ST_TIME" = :a19,"FTCH_FIN_TIME" = :a20,"DLVR_ID" = :a21,"DELIV_TIME" = :a22,"CNGNR_ID" = :a23,"CNSN_TIME" = :a24,"PROGRESS" = :a25,"DELIV_PVN" = :a26,"DELIV_CTY" = :a27,"DELIV_DIST" = :a28,"RCVR_NAME" = :a29,"RCVR_TEL_NUM" = :a30 WHERE "UNIT_ID" = :b0 AND "GDN_NUM" = :b1
Aborting transaction on /home/ogg/dirdat/pa beginning at seqno 1 rba 77334039
                         error at seqno 1 rba 77334039
Problem replicating DAOXUN.GDN to DAOXUN.GDN
Mapping problem with compressed update record (target format)...

So the relevant triggers were disabled; the script below generates the DDL:

set linesize 1000
set pages 0
select 'alter trigger '||owner||'.'||trigger_name||' disable;' from dba_triggers t where owner='DAOXUN' and t.status!='DISABLED';

Also, per the related MOS note, check the primary-key index and constraints while diagnosing; the query below does that:

select object_name, status, '' from dba_objects where object_name in (select index_name from dba_indexes where table_name = 'GDN' and table_owner = 'DAOXUN') and owner = 'DAOXUN'
union all
select constraint_name, status, validated from dba_constraints where table_name = 'GDN' and owner = 'DAOXUN';

ORA-39041

The failing command:
impdp goldengate/ directory=EXPDP network_link=xxxx_LINK logfile=daoxun.log parallel=8 schemas=DAOXUN EXCLUDE=TABLE:"IN ('xxxx','yyy')" FLASHBACK_SCN=169440316726

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORA-39001: invalid argument value
ORA-39041: Filter "EXCLUDE" either identifies all object types or no object types.

The cause: run on the command line, the shell strips the double quotes around the EXCLUDE filter before impdp sees it, so Data Pump receives a malformed filter. Moving the parameters into a parfile avoids shell parsing entirely, and the import ran:
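
The parfile itself was not shown in the original note; presumably p1 simply carried the same arguments, e.g. (a sketch, table names elided as before):

directory=EXPDP
network_link=xxxx_LINK
logfile=daoxun.log
parallel=8
schemas=DAOXUN
EXCLUDE=TABLE:"IN ('xxxx','yyy')"
FLASHBACK_SCN=169440316726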

oracle@pos:/home/oracle>impdp goldengate/123456 parfile=p1

Import: Release 11.2.0.4.0 - Production on Wed Feb 27 18:34:38 2019

Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
FLASHBACK automatically enabled to preserve database integrity.
Starting "GOLDENGATE"."SYS_IMPORT_SCHEMA_01": goldengate/******** parfile=p1
Estimate in progress using BLOCKS method...
Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 970.4 GB

Starting from an SAP Optimization


## Background

A customer's SAP month-end close was stuck in the CO module. Monitoring had already pinpointed the SQL, but the customer's DBA had tried tuning it several times with no effect (a use_concat hint for the IN-list SQL). The customer raised an SR; SAP suggested adding an index hint, which also left the execution plan unchanged. The problem had dragged on for two days and the DBA was under real pressure: although this was only one step of the close, the slow SQL meant the plant's cost calculation could not finish. And since the job runs only at month end (this step, in fact, only in February), its historical execution plans were pulled for comparison.

## First Look at the Problem
The SQL itself is simple, a three-table join:

SELECT /*+ USE_CONCAT ORDERED USE_NL(T_00 T_01 T_02)
INDEX("T_00" "AFKO~5" "AFKO^5" "AFKO______5" "AFKO______5__X" "AFKO___5")
*/ T_01."MANDT", T_01."AUFNR", T_02."POSNR", T_01."OBJNR", T_01."AUTYP"
    , T_01."KOKRS", T_01."WERKS", T_02."MATNR", T_00."PLNBEZ", T_02."SAFNR"
    , T_02."VERID", T_02."OBJNP", T_02."PROJN", T_02."PWERK", T_02."SOBKZ"
    , T_02."KZBWS", T_01."LOEKZ", T_01."PKOSA", T_01."AWSLS", T_01."PROCNR"
    , T_01."FLG_MLTPS", T_00."PLNAL", T_00."PLNNR", T_00."MAUFNR", T_00."PRODNET"
    , T_00."COLORDPROC"
FROM "AFKO" T_00
INNER JOIN "AUFK" T_01 ON T_01."MANDT" = :A0
AND T_00."AUFNR" = T_01."AUFNR" 
    LEFT JOIN "AFPO" T_02 ON T_02."MANDT" = :A1
AND T_00."AUFNR" = T_02."AUFNR" 
WHERE T_00."MANDT" = :A2
    AND T_00."MAUFNR" IN (:A3, :A4, :A5, :A6, :A7)
    AND T_01."LOEKZ" = :A8
    AND T_01."PKOSA" = :A9

The execution plan:
[]

The table statistics were reasonably accurate, and the bottleneck in the plan is easy to spot (the part underlined in red in the screenshot): when probing the index, only the first column is used as an access predicate, after which everything under that prefix is fetched and filtered. A lot of rows survive the first-column filter, so this access pattern is very inefficient. The problem is obvious; the fix is not. It looks like a familiar case: with IN/OR predicates that the CBO has not expanded, you get something like a full scan plus filter, and the usual cure is a use_concat hint with a suitable index. Here, though, we have a mutated variant of that problem. So what next?

## Candidate Approaches
There are quite a few directions one could take from here, for example:

  • Check the index clustering factor. Plausible, but that is usually for cases where the optimizer prefers the table over an index;
  • Dig further into table and index statistics, in particular histograms or extended (multi-column) statistics;
  • Rewrite the SQL, expanding the IN into ORs. The statement is not complex, so a rewrite feels awkward, and with SAP as the application a rewrite could never go live, so it is not a viable fix. Rewriting it purely to compare the outline sections of the two plans, however, is a useful diagnostic;
  • Generate a 10053 trace and analyze how the CBO arrived at this plan. Best when there is no obvious lead and the CBO's choice is hard to understand.

## An Obscure Hint
This case turns on a very rarely used hint: num_index_keys. If you inspect the outline section of the execution plan you can see it there. It has come up in the Oracle community: after an upgrade to 12c, one user saw a large number of plans change to filter-style index access (not a bug); see the thread "Access" and "Filter" changes when upgraded to 12c, https://community.oracle.com/thread/4184148. Jonathan Lewis's blog also covers this hint.

The hint tells the optimizer how many index key columns may be used as access predicates in an INLIST ITERATOR operation. The figure below shows the modified hint: the final argument changed from 1 to 2.
[]
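Since the screenshot does not reproduce here, this is the general shape of the hint, using the demo objects (t1 / idx1) from the test case later in this post; names and literals are illustrative, not the actual SAP objects:

-- num_index_keys(@<query_block> <alias>@<query_block> <index> <n_keys>)
-- n_keys = 2 means: use c1 and c2 as access predicates, filter on c3
select /*+ num_index_keys(@sel$1 t1@sel$1 idx1 2) */ *
from   t1
where  t1.c1 = '0001'
and    t1.c2 in ('H', 'F', 'C')
and    t1.c3 in ('18110034450001', '18110034450101');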
The effect: the first two columns are now accessed directly in four index probes, without dragging in every row under the first column, and performance improved dramatically.
[]

## Summary
For table access, a filter-heavy plan is usually bad news; for indexes it is more nuanced. With INLIST ITERATOR index access, if few rows need filtering after the access predicates are applied, filtering may win; if a large volume must be filtered down to a small result (i.e. the filtered predicate is highly selective), it is better to push those columns into the access predicates. The test case below is adapted from the SQL in that Oracle community thread:

### Test Data Setup
Column c1 of t1 holds only two values, '0001' and '0002'. '0001' is extremely selective: c1 = '0001' alone filters down to the only three matching rows.

create table t1 (
    c1 varchar2(4) not null,
    c2 varchar2(1) not null,
    c3 varchar2(15) not null,
    c4 varchar2(4)  not null,
    v1 varchar2(250));

insert into t1  values ('0001','H','18110034450001',' ',' ');
insert into t1  values  ('0001','H','18110034450001',' ',' ');
insert into t1  values  ('0001','H','18110034450001',' ',' ');

insert into t1 select '0002', decode(mod(level,3),0,'F',1,'H',2,'C'), '18110034450001',' ',' ' from dual connect by   level <= 1e5;

commit;

create index idx1 on t1(c1, c2, c3);

exec dbms_stats.gather_table_stats(null,'t1',method_opt => 'for all columns size auto',cascade => true);


alter session set statistics_level = all;
select
    t1.*
FROM    t1
WHERE
    t1.c1 = '0001' 
AND t1.c2 in ('H', 'F', 'C')
AND t1.c3 in (
        '18110034450001',
        '18110034450101',
        '18110034450201',
        '18110034450301',
        '18110034450401',
        '18110034450501'
    )
;

### Comparing the Plans

By default the database probes IDX1 with access predicates on c1 and c2 and filters on c3. Two numbers matter here: Buffers = 11, and Starts = 3 for the INDEX RANGE SCAN on IDX1.

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  gq5h3y8nnpd6t, child number 0
-------------------------------------
select     t1.* FROM    t1 WHERE     t1.c1 = '0001' AND t1.c2 in ('H',
'F', 'C') AND t1.c3 in (     '18110034450001',
'18110034450101',     '18110034450201',     '18110034450301',
      '18110034450401',     '18110034450501'     )

Plan hash value: 411668446

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |      |      1 |        |     6 (100)|      3 |00:00:00.01 |      11 |
|   1 |  INLIST ITERATOR                      |      |      1 |        |            |      3 |00:00:00.01 |      11 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED | T1   |      3 |      3 |     6   (0)|      3 |00:00:00.01 |      11 |
|*  3 |    INDEX RANGE SCAN                   | IDX1 |      3 |      3 |     5   (0)|      3 |00:00:00.01 |       9 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("T1"."C1"='0001' AND (("T1"."C2"='C' OR "T1"."C2"='F' OR "T1"."C2"='H')))

       filter(("T1"."C3"='18110034450001' OR "T1"."C3"='18110034450101' OR "T1"."C3"='18110034450201' OR
          "T1"."C3"='18110034450301' OR "T1"."C3"='18110034450401' OR "T1"."C3"='18110034450501'))


25 rows selected.

Forcing all three columns into access predicates on IDX1 with num_index_keys, the logical reads (Buffers) rise to 17 and the INDEX RANGE SCAN on IDX1 now has Starts = 18.

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  2zmj577kn9j65, child number 0
-------------------------------------
select         /*+  num_index_keys(@sel$1 t1@sel$1 idx1
3) */     t1.* FROM    t1 WHERE     t1.c1 = '0001' AND t1.c2 in ('H',
'F', 'C') AND t1.c3 in (     '18110034450001',
'18110034450101',     '18110034450201',     '18110034450301',
      '18110034450401',     '18110034450501'     )

Plan hash value: 411668446

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |      |      1 |        |    20 (100)|      3 |00:00:00.01 |      17 |
|   1 |  INLIST ITERATOR                      |      |      1 |        |            |      3 |00:00:00.01 |      17 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED | T1   |     18 |      3 |    20   (0)|      3 |00:00:00.01 |      17 |
|*  3 |    INDEX RANGE SCAN                   | IDX1 |     18 |      3 |    19   (0)|      3 |00:00:00.01 |      15 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------


   3 - access("T1"."C1"='0001' AND (("T1"."C2"='C' OR "T1"."C2"='F' OR "T1"."C2"='H')) AND
          (("T1"."C3"='18110034450001' OR "T1"."C3"='18110034450101' OR "T1"."C3"='18110034450201' OR
          "T1"."C3"='18110034450301' OR "T1"."C3"='18110034450401' OR "T1"."C3"='18110034450501')))


26 rows selected.

In short, if index filter access is used as an SQL review checkpoint, a plan with filter predicates on an index can be flagged as high risk, especially for frequently executed SQL and where the access-style and filter-style plans have similar cost, but every flag still needs manual verification against the actual scenario.

An Optimization That Over-Delivered

## Background
One afternoon around four, shortly before Chinese New Year, a customer called: a key system had been hitting high CPU for days, the business kept complaining the database was extremely slow, CPU utilization was repeatedly pegged at 100%, and at times the host could not even be reached over SSH. It was the year-end business peak, so the impact was severe. Their DBAs had been tuning for about two days with little to show, and asked us on site. Before heading over I collected the basics: historically CPU usage on this database was low, but year-end peak load plus report runs pushed it up. The host was AIX, virtualized, with 48 logical CPUs and 96 GB of RAM, running a single-instance database, version 10.2.0.5.

## Emergency Response
On site the situation was not good and the customer was in full firefighting mode. The monitoring system showed the recent CPU history:
[]
Just as I was about to log in, the host became unreachable again. After several minutes the customer's DBA got a session, and sure enough CPU was at 100%. To keep the business alive we decided to kill a batch of sessions that had no open transaction. During the episode, at the host level user time was about 92% and sys about 7%, with almost no I/O wait and no paging, and the CPU-heavy processes all belonged to the oracle user.

begin
  -- Kill non-SYS sessions that have no open transaction.
  -- Note: the script as originally run joined v$session to v$transaction with
  -- "t1.saddr <> t2.ses_addr", which matches almost every session whenever more
  -- than one transaction exists; NOT EXISTS expresses the intent correctly.
  for i in (select 'ALTER SYSTEM KILL SESSION '''||t1.sid||','||t1.serial#||'''' t
              from v$session t1
             where t1.username is not null
               and t1.username <> 'SYS'
               and not exists (select 1 from v$transaction t2
                                where t2.ses_addr = t1.saddr)) loop
    begin
      execute immediate i.t;
    exception when others then
      dbms_output.put_line(substr(sqlerrm,1,200));
    end;
  end loop;
end;
/

After running this repeatedly, CPU utilization grudgingly settled around 90%, just enough to start analyzing the fault.

## AWR Findings
At first the AWR report would not even generate; some routine workarounds later, I had the report for the fault window. The headline numbers: over the ninety-minute snapshot the database's AAS (average active sessions) reached 84, redo volume was high, and hard parses, logons, sorts and the other usual indicators showed nothing severe.
[]

At this point most DBAs would go straight to SQL ordered by CPU Time or SQL ordered by Gets to find the top SQL.
[]

Here is that section: there is no standout top SQL; even the #1 statement accounts for just 1.6%. Now what?
## Is Hard Parsing Masking the Top SQL?
force_matching_signature can tell you whether a genuinely dominant SQL is being hidden because the application does not use bind variables. Our monitoring detects this pattern automatically, so I did not have to write the query myself, and it promptly killed that hope: even with literals collapsed, the top statement took under 3% of database time, and the top 10 together looked to be under 20%. "Find the top SQL and tune it" was not going to work, because there was no top SQL to find.
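
For readers without such monitoring, a hedged sketch of the same check: aggregate the AWR SQL statistics by FORCE_MATCHING_SIGNATURE, which groups statements differing only in literals (the snapshot-range binds are placeholders):

select force_matching_signature,
       count(distinct sql_id)           sql_variants,
       round(sum(cpu_time_delta)/1e6,1) cpu_sec,
       sum(executions_delta)            execs
from   dba_hist_sqlstat
where  snap_id between :begin_snap and :end_snap
and    force_matching_signature > 0
group  by force_matching_signature
order  by cpu_sec desc;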

## Changing the Angle
As discussed in earlier posts, when SQL-level analysis stalls, the next step is often to aggregate at the object level (which usually points at access-path tuning). Searching for Segments by Logical Reads showed that when God closes a door, he sometimes shuts the window too: aggregated by segment, the top object accounted for only 16%. That effectively ruled out the fantasy of fixing the system by tuning two or three statements.
[]

## The Truth Is...
On site I was honestly not that puzzled; the narrative above mostly retraces and extends what the customer's DBAs had tried, for the sake of the write-up. In reality the first thing I had looked at was the following, and it changed the direction entirely. Having seen it, do you read the situation differently?
[]

Exactly: the top 5 wait events are interesting. In a typical CPU-bound system you expect latch and mutex waits to crowd the top 5/10. Yet in this 100%-CPU system the top 5 waits were User I/O class, and the headline event, db file scattered read, averaged only 2 ms, so I/O did not look like the bottleneck, consistent with the near-zero iowait at the host level.

Some I/O was clearly being consumed, but diving into SQL ordered by Reads for top SQL was not the right move either, since the problem to solve was 100% CPU. For completeness, here it is anyway:
[]

Still no dominant top SQL.

## A Few More Traps?
Having seen the top 5 waits, I immediately searched for "log file" and found: log file sync averaging 39 ms, and log file parallel write reaching 20 ms.
[]

This is 10g, so there is no redo synch poll writes and no _use_adaptive_log_file_sync parameter to check. Doesn't this smell like a storage problem? Storage I/O degradation slowing down log sync, AAS shooting up, and the system finally tipping over is a pattern seen again and again. From the database side, the latency of the background event log file parallel write is the usual yardstick for the storage's sequential-write performance; so should the focus shift to storage I/O? In my experience, high log file parallel write latency is confirmed as a genuine storage problem far less often than you would expect. As it happened, the customer's XIV management software was misbehaving and could not show write latency for the previous hour anyway. One more caveat about log file parallel write: by definition it can measure storage performance, but the timing is recorded by the LGWR process itself and is therefore subject to CPU scheduling; when CPU utilization is high (or cpu_count exceeds 128), the figure becomes unreliable. See Oracle Support note 34583.1 for more.
So: keep digging into I/O performance, or take the customer DBAs' suggestion and start tuning the pages-long list of report SQL they had collected? The customer's CTO was waiting for a recommendation. The storage vendor could be called in to rule the array in or out; adding memory, growing the SGA, even keeping a few hot tables in the buffer cache were all on the table. The report SQL could be rewritten and tuned with the developers too, given the business impact.

But before committing to any of it, one question had to be answerable: will this specific action cut CPU by enough to keep the business running? The CTO was certain to ask, and burning a night's work without solving the problem would be awkward for everyone.

## The Decision
Before long it was nearly end of day, and the CTO said: make some changes tonight, I want to see results tomorrow. In that seat, what would you do next, and what CPU reduction would you promise?
My conservative commitment was a 20% reduction, with no parameter changes: just four indexes on three tables. Which three? The first three tables in the figure below.

[]
The four indexes came from aggregating the data-access paths against those three tables. The same-day execution profile of the SQL behind one of them looked like this:
[]

Note how spiky this SQL is: at its peak it alone drives AAS past 40, yet once the peak passes other SQL takes over, and even aggregated by force_matching_signature it failed to make the top 20 over an hour of samples. SQL like this may not appear in any of the AWR top-10 SQL sections, so AWR reports are a poor tool for finding it. Once the top segments are known, you can try searching the report for the table name with a space on each side; for scripted analysis, aggregate dba_hist_active_sess_history and dba_hist_sqlstat, as sketched below.
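
A sketch of that kind of scripted analysis (the time window is illustrative): DBA_HIST_ACTIVE_SESS_HISTORY keeps one sample every 10 seconds, so count(*)*10/60 approximates per-minute AAS, and spiky statements show up as short bursts:

select trunc(sample_time,'MI')  minute,
       sql_id,
       round(count(*)*10/60, 1) aas
from   dba_hist_active_sess_history
where  sample_time >= timestamp '2019-01-30 09:00:00'
and    sample_time <  timestamp '2019-01-30 10:00:00'
group  by trunc(sample_time,'MI'), sql_id
having count(*)*10/60 >= 5
order  by minute, aas desc;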
The fix chosen was SQL optimization, starting with the heaviest physical-read SQL, based on this reading of the incident: a handful of I/O-heavy SQL degraded redo flushing (the health-check report had noted redo was not isolated from the data files); as concurrency climbed, CPU saturated and LGWR could no longer be scheduled promptly (the report had also noted that _high_priority_processes did not include LGWR), which slowed DML further, drove concurrency and CPU still higher, and the system spiralled. I did not change _high_priority_processes, since it requires a database restart, while renice gives a similar effect without one; the plan was to renice LGWR by hand the next day if CPU crossed 80%. Staging the changes this way also gives a cleaner read on what each individual change buys.

## The Result
The next day went unexpectedly well: CPU was nearly halved, only occasionally touching 60%, so the renice was never needed; I could not resist asking in the group chat whether the business had really changed nothing that day. In a fresh AWR report, the log file sync and log file parallel write numbers (below) were merely passable, but a world better than the day before.
[]

System AAS also dropped to around 30.
[]

## Takeaways
Looking back, most of the latent issues here could have been caught in routine health checks and SQL review; at the scene of a live fault, a complex environment amplifies them into derivative failures that are far harder to pick apart. Time invested in routine inspection, SQL review, and keeping your own knowledge base current pays back with interest.

  • 100% CPU risk
    Most systems are exposed to the risk of 100% CPU utilization, and the nastiest part is losing SSH access while it happens. Consider mitigating at the cpu_count level: the default uses every CPU, so consider capping it at logical CPUs minus 2 for a standalone database. For RAC clusters hosting several databases, consider per-instance CPU isolation, per Support note 1362445.1. Resource Manager is clumsy for capping total utilization, but in a mixed AP/TP workload it can prioritize critical business: the classic "reports dragging down OLTP" problem is containable with RM.

  • LGWR log-sync risk
    Under high CPU utilization (typically above 80%) or with a very large CPU count, LGWR can suffer badly from CPU scheduling. Consider adding LGWR to _high_priority_processes; since the change requires a database restart, plan it into a maintenance window. A sketch of the change follows.
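
A sketch of that change (hidden parameter; 'LMS*' is the typical default and matters on RAC, harmless on single instance; validate for your version and test first):

-- takes effect only after a restart
alter system set "_high_priority_processes"='LMS*|LGWR' scope=spfile;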

Specifying an Index Hint Through a View

Specify the hint as index(viewname tablename indexname). If the table was aliased inside the view definition, still use the table name when hinting, not that alias; and if the view itself is aliased in the query, use the query alias in place of the view name. A demonstration:

create table ttt1 as select * from dba_objects;
create index idx11 on ttt1(object_id);
create view vt as select * from ttt1;
create view vt2 as select * from ttt1 t;

TEST@orcl1>select * from vt where object_id<>10;

89592 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1749644783

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 98412 |    19M| 10292   (1)| 00:00:11 |
|*  1 |  TABLE ACCESS FULL| TTT1 | 98412 |    19M| 10292   (1)| 00:00:11 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_ID"<>10)

Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
         12  recursive calls
          0  db block gets
       7251  consistent gets
       1609  physical reads
          0  redo size
    4700257  bytes sent via SQL*Net to client
      66211  bytes received via SQL*Net from client
       5974  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      89592  rows processed

TEST@orcl1>select /*+ index(tt ttt1 idx11) */ * from vt tt where object_id<>10;

89592 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 277384511

---------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       | 98412 |    19M|   304   (4)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TTT1  | 98412 |    19M|   304   (4)| 00:00:01 |
|*  2 |   INDEX FULL SCAN           | IDX11 |  4921 |       |   224   (5)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OBJECT_ID"<>10)

Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
         11  recursive calls
          0  db block gets
      13674  consistent gets
       1484  physical reads
          0  redo size
    4690743  bytes sent via SQL*Net to client
      66211  bytes received via SQL*Net from client
       5974  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      89592  rows processed

TEST@orcl1>select /*+ index(tt t idx11) */ * from vt2 tt where object_id<>10;

89592 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1749644783

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 98412 |    19M| 10292   (1)| 00:00:11 |
|*  1 |  TABLE ACCESS FULL| TTT1 | 98412 |    19M| 10292   (1)| 00:00:11 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_ID"<>10)

Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
         11  recursive calls
          0  db block gets
       7251  consistent gets
       1609  physical reads
          0  redo size
    4700257  bytes sent via SQL*Net to client
      66211  bytes received via SQL*Net from client
       5974  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      89592  rows processed

Handling latch free

To locate latch free waits after the fact:

set linesize 1000
col P1   format 9999999999999999999999999999

select  distinct event, P1, P2, P3 from
dba_hist_active_sess_history  where
event='latch free' and
sample_time between
                to_timestamp('2018.10.22 10:05:48', 'YYYY.MM.DD HH24:MI:SS') and
                to_timestamp('2018.10.22 11:15:00', 'YYYY.MM.DD HH24:MI:SS');
  • Then map the latch number to its name. For the generic latch free event, P1 is the latch address (hence the wide column format above) and P2 is the latch#, 14 in this case:
    SELECT *
    FROM v$latchname
    WHERE latch# = 14;

At the live scene of the fault you can instead query the current holder with the SQL below; after the fact, the original holder is hard to trace.

select c.name, b.pid
from v$latch a, v$latchholder b, v$latchname c
where a.addr=b.laddr(+)
and a.latch#=c.latch#
/

Unique Index Optimization Notes

The script below shows that a unique index saves one logical read per lookup compared with a non-unique index: on a unique index the scan can stop at the first match, while a non-unique index must read on to check for further matches, which typically costs one extra buffer get.

create table t2 as select object_id,object_name from dba_objects where rownum<=10;
create index t_idx2 on t2(object_id);
alter session set statistics_level=all;
set autot on
select /*+ full(t2) */ * from t2 where object_id=58260;
select /*+ index(t2 t_idx2) */ * from t2 where object_id=58260;
drop index t_idx2;
create unique index t_idx12 on t2(object_id);
select /*+ index(t2 t_idx12) */ * from t2 where object_id=58260;
select /*+ index(t2 t_idx12) */ * from t2 where object_id=58260 and rownum=1;

And this script shows that adding rownum=1 to a non-unique index lookup achieves a similar effect by proxy:
create table t as select object_id,object_name,1 a from dba_objects where rownum<=10;
create index t_idx on t(object_id,a,object_name);
alter session set statistics_level=all;
set autot on
select /*+ full(t) */ * from t where object_id=58260;
select /*+ index(t t_idx) */ * from t where object_id=58260;
drop index t_idx;
create unique index t_idx1 on t(object_id);
create index t_idx2 on t(object_id,a,object_name);
select /*+ index(t t_idx1) */ * from t where object_id=58260;
select /*+ index(t t_idx2) */ * from t where object_id=58260 and rownum=1;

Converting a Heap Table to a Partitioned Table: Best Practice


A constant theme in production operations: as data volumes grow, large tables appear that were never partitioned, and SQL that has to scan a substantial share of them degrades badly. Converting an ordinary heap table into a partitioned table is therefore a very common requirement.

## The Real-World Constraints
Converting to a partitioned table in a live environment runs into many constraints:

  • Very large data volumes (hundreds of millions to billions of rows)
  • No primary key on the table
  • Complex triggers
  • Complex foreign-key relationships
  • Large numbers of dependent stored procedures
  • SQL that hints index names directly (index and constraint names should survive the conversion unchanged)
  • Large numbers of materialized views

# Common Conversion Methods
Two approaches dominate in practice:

  • CTAS + rename
  • Online redefinition

CTAS + rename has far more details to get right; we have a script that automates most of them, though its handling of materialized views is still incomplete. Online redefinition is much simpler, and it can be done without stopping the business, which is a real advantage.

# Test Data
Create a non-partitioned table with a foreign-key relationship and a LOB column, and hang a stored procedure, trigger, view, and materialized view off it.


drop user test2 cascade;
create user test2 identified by test;
grant dba to test2;
conn test2/test;

create table unpartTable as select name,model,STYLESHEET,row_number() over(partition by model order by name) num from SYS.METASTYLESHEET;


create table P1 as select name from unpartTable;
create table C1 as select num from unpartTable;

create unique index pkidx on unpartTable(num);
create unique index idx2 on unpartTable(name);
alter table unpartTable add constraint pk primary key(num) using index pkidx;

create unique index pkidx1 on P1(name);
alter table P1 add constraint pk1 primary key(name) using index pkidx1;

create unique index pkidx2 on C1(num);
alter table C1 add constraint pk2 primary key(num) using index pkidx2;

alter table unpartTable add foreign key(name) references P1;
alter table C1 add foreign key(num) references unpartTable;

EXEC DBMS_STATS.gather_table_stats('TEST2', 'UNPARTTABLE', cascade => TRUE);

CREATE OR REPLACE PROCEDURE testP AS
NUM PLS_INTEGER:=1000;
BEGIN
select count(*)  into num from unpartTable;
EXCEPTION
  WHEN OTHERS THEN
    DBMS_OUTPUT.PUT_LINE('xxxx');
    ROLLBACK;
END;
/

CREATE OR REPLACE TRIGGER testT 
   BEFORE DELETE ON unpartTable 
   BEGIN
      Null;
   END;
/


alter table unpartTable add constraint testC check (num is not null);
create view V_unpartTable as select * from unpartTable;


create materialized view log on unpartTable  tablespace users with primary key; 

CREATE MATERIALIZED VIEW mv_unpartTable REFRESH FAST START WITH SYSDATE
NEXT  SYSDATE + 1/48
WITH PRIMARY KEY
AS SELECT * FROM unpartTable;

# The CTAS + Rename Method

Generate the migration script with the following:

set serveroutput on
set linesize 1000
set long 99999
declare
owner1 varchar2(200);
table_name1  varchar2(200);
temp_1 varchar2(8000);
tmpsql varchar2(8000);
-- c1: DDL for every non-LOB index on the source table
cursor c1 is select 'SELECT dbms_metadata.get_ddl(''INDEX'','''||index_name||''','''||owner1||''') FROM dual' txt from (select owner1,index_name from dba_indexes where table_name = table_name1 and owner =owner1 and index_type <> 'LOB');

-- c2: DDL for the source table itself (to be hand-edited into the _P partitioned table)
cursor c2 is select 'SELECT dbms_metadata.get_ddl(''TABLE'','''||table_name1||''','''||owner1||''') FROM dual' txt from dual;

-- c3: DDL for foreign keys on other tables that reference the source table
cursor c3 is select 'SELECT dbms_metadata.get_dependent_ddl(''REF_CONSTRAINT'','''||a.table_name ||''','''||a.owner||''') FROM dual' txt,a.constraint_name ,a.owner,a.table_name from  dba_constraints a, dba_constraints b where a.constraint_type = 'R' and b.constraint_type = 'P' and a.r_constraint_name = b.constraint_name and b.table_name =table_name1 and b.owner = owner1;

-- c4: the source table's own constraints other than foreign keys
cursor c4 is select 'SELECT dbms_metadata.get_ddl(''CONSTRAINT'','''||constraint_name ||''','''||owner||''') FROM dual' txt, constraint_name , owner, table_name from   dba_constraints b where  b.table_name =table_name1 and b.owner = owner1 and constraint_type <> 'R';

begin
owner1 := upper('TEST2');
table_name1 := upper('unpartTable');

dbms_output.put_line('set linesize 1000');
dbms_output.put_line('set long 99999');
dbms_output.put_line('set pages 0');
temp_1 := 'select /*+ parallel(16) */ count(*) from '||owner1 || '.' ||table_name1 ||';';
dbms_output.put_line(temp_1);


for x in (select index_name from dba_indexes where table_name = table_name1 and owner =owner1 and index_type <> 'LOB') loop
dbms_output.put_line('alter index '|| owner1 || '.' ||x.index_name|| ' rename to '||x.index_name||'_bak;');
end loop;


for x in (select constraint_name from dba_constraints where table_name = table_name1 and owner =owner1) loop
dbms_output.put_line('alter table '|| owner1 || '.' ||table_name1|| ' rename constraint '||x.constraint_name||' to '||x.constraint_name||'_bak;');
end loop;

for i in c2 loop
tmpsql := '';
execute immediate replace(i.txt,'"','''') into tmpsql;
tmpsql := replace(tmpsql ,chr(13),'');
tmpsql := replace(tmpsql ,chr(10),'');
tmpsql := replace(tmpsql ,'"','') || ' -------';
dbms_output.put_line(tmpsql);
end loop;


dbms_output.put_line('alter session force parallel dml parallel 8;');
dbms_output.put_line('alter session force parallel query parallel 8;');
dbms_output.put_line('alter table '||owner1 || '.' ||table_name1||'_P nologging;');
dbms_output.put_line('insert into '|| owner1 || '.' ||table_name1|| '_P  select * from '|| owner1 || '.' ||table_name1|| ';');

for x in (select 'drop materialized view ' ||owner1||'.'|| MVIEW_NAME|| ';' txt from dba_mviews where owner = owner1) loop
dbms_output.put_line(x.txt);
end loop;


dbms_output.put_line('alter table '|| owner1 || '.' ||table_name1|| ' rename to '  ||table_name1||'_bak;');
dbms_output.put_line('alter table '|| owner1 || '.' ||table_name1|| '_P rename to '  ||table_name1||';');

for i in c1 loop
tmpsql := '';
execute immediate replace(i.txt,'"','''') into tmpsql;
dbms_output.put_line(replace(tmpsql||'','"','') || ' local  nologging parallel 32;');
end loop;


for x in (select index_name from dba_indexes where table_name = table_name1 and owner =owner1 and index_type <> 'LOB') loop
dbms_output.put_line('alter index '|| owner1 || '.' ||x.index_name|| ' logging noparallel;');
end loop;

dbms_output.put_line('alter table '|| owner1 || '.' ||table_name1|| ' logging noparallel;');

for i in c4 loop
tmpsql := '';
execute immediate replace(i.txt,'"','''') into tmpsql;
tmpsql := replace(tmpsql ,chr(13),'');
tmpsql := replace(tmpsql ,chr(10),'');
tmpsql := replace(tmpsql||';' ,'"','');
dbms_output.put_line(tmpsql);
end loop;

for i in c3 loop
tmpsql := '';
execute immediate replace(i.txt,'"','''') into tmpsql;
dbms_output.put_line('alter table '||i.owner ||'.'||i.table_name || ' drop constraint '||i.constraint_name||';' );
tmpsql := replace(tmpsql||';','"','');
dbms_output.put_line(tmpsql);
end loop;


for x in (select dbms_metadata.get_ddl('MATERIALIZED_VIEW', mview_name, owner) txt from dba_mviews where owner = 'TEST2') loop
dbms_output.put_line(replace(x.txt ||';','"',''));
end loop;


for x in (select  'SELECT dbms_metadata.get_ddl(''TRIGGER'','''||trigger_name ||''','''||owner||''') FROM dual' txt,TRIGGER_NAME,owner from dba_triggers where table_name = table_name1 and TABLE_OWNER = owner1) loop
tmpsql := '';
execute immediate replace(x.txt,'"','''') into tmpsql;
dbms_output.put_line('drop trigger '||x.owner || '.' ||x.trigger_name ||';' );
tmpsql := replace(tmpsql ,chr(13),'');
tmpsql := replace(tmpsql ||';','"','');
dbms_output.put_line(tmpsql);
end loop;

dbms_output.put_line('exec DBMS_STATS.GATHER_TABLE_STATS('''|| owner1 || ''',''' ||table_name1|| ''' ,cascade => true,method_opt => ''for all columns size auto'',degree => 16);');

end;
/



set serveroutput on
set linesize 1000
set long 99999
begin
for x in (select 'alter '||object_type||' ' ||owner||'.' ||object_name ||' compile;' txt from dba_objects where status='INVALID') loop
dbms_output.put_line(x.txt);
end loop;
end;
/

The generated script works through the following steps:

  • rename the original table's indexes and constraints
  • produce the partitioned table (UNPARTTABLE_P) by hand-editing the get_ddl output
  • load the data into the partitioned table with nologging plus parallel insert
  • build the indexes on UNPARTTABLE_P
  • switch the indexes back to logging, noparallel
  • recreate the constraints and triggers on UNPARTTABLE_P
  • rebuild the constraints on referencing tables
  • perform the rename swap
  • recompile invalid objects

    Sample output for the demo table follows; note the CREATE TABLE statement has already been hand-edited into the _P partitioned form, and the dangling fragment beginning with a comma is the constraint section cut out of the original DDL:
    set linesize 1000
    set long 99999
    set pages 0
    select /*+ parallel(16) */ count(*) from TEST1.UNPARTTABLE;
    alter index TEST1.PKIDX rename to PKIDX_bak;
    alter index TEST1.IDX2 rename to IDX2_bak;
    alter table TEST1.UNPARTTABLE rename constraint SYS_C0025560 to SYS_C0025560_bak;
    alter table TEST1.UNPARTTABLE rename constraint SYS_C0025561 to SYS_C0025561_bak;
    alter table TEST1.UNPARTTABLE rename constraint PK to PK_bak;
    alter table TEST1.UNPARTTABLE rename constraint SYS_C0025566 to SYS_C0025566_bak;
    alter table TEST1.UNPARTTABLE rename constraint TESTC to TESTC_bak;
    CREATE TABLE TEST1.UNPARTTABLE_P      ( NAME VARCHAR2(30) NOT NULL ENABLE,  MODEL VARCHAR2(30) NOT NULL ENABLE,     STYLESHEET CLOB,    NUM NUMBER)  LOB (STYLESHEET) STORE AS BASICFILE (  TABLESPACE USERS ENABLE STORAGE IN ROW CHUNK 8192 RETENTION   NOCACHE LOGGING   STORAGE(INITIAL 65536
    NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)) PARTITION BY RANGE (num)(PARTITION p1 VALUES LESS THAN (100),PARTITION p2 VALUES LESS THAN (MAXVALUE));
    
    
    
    ,    CONSTRAINT PK PRIMARY KEY (NUM)  USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS   STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)  TABLESPACE USERS  ENABLE,   CONSTRAINT TESTC CHECK (num is not null) ENABLE,    FOREIGN KEY (NAME)   REFERENCES TEST1.P1 (NAME) ENABLE   ) SEGMENT CREATION IMMEDIATE   PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255  NOCOMPRESS LOGGING  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)  TABLESPACE USERS 
    
    
    alter session force parallel dml parallel 8;
    alter session force parallel query parallel 8;
    alter table TEST1.UNPARTTABLE_P nologging;
    insert into TEST1.UNPARTTABLE_P  select * from TEST1.UNPARTTABLE;
    alter table TEST1.UNPARTTABLE rename to UNPARTTABLE_bak;
    alter table TEST1.UNPARTTABLE_P rename to UNPARTTABLE;
    
      CREATE UNIQUE INDEX TEST1.PKIDX ON TEST1.UNPARTTABLE (NUM)
      PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
      STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
      PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
      BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
      TABLESPACE USERS  local  nologging parallel 32;
    
      CREATE UNIQUE INDEX TEST1.IDX2 ON TEST1.UNPARTTABLE (NAME)
      PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
      STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
      PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
      BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
      TABLESPACE USERS  global  nologging parallel 32;
    
    
    alter index TEST1.PKIDX logging noparallel;
    alter index TEST1.IDX2 logging noparallel;
    alter table TEST1.UNPARTTABLE logging noparallel;
    ALTER TABLE TEST1.UNPARTTABLE MODIFY (NAME NOT NULL ENABLE);
    ALTER TABLE TEST1.UNPARTTABLE MODIFY (MODEL NOT NULL ENABLE);
    ALTER TABLE TEST1.UNPARTTABLE ADD CONSTRAINT PK PRIMARY KEY (NUM)  USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS   STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)  TABLESPACE USERS  ENABLE;
    ALTER TABLE TEST1.UNPARTTABLE ADD CONSTRAINT TESTC CHECK (num is not null) ENABLE;
    alter table TEST1.C1 drop constraint SYS_C0025567;
    
      ALTER TABLE TEST1.C1 ADD FOREIGN KEY (NUM)
          REFERENCES TEST1.UNPARTTABLE (NUM) ENABLE;
    drop trigger TEST1.TESTT;
    
      CREATE OR REPLACE TRIGGER TEST1.TESTT
       BEFORE DELETE ON unpartTable
       BEGIN
          Null;
       END;
      /
    ALTER TRIGGER TEST1.TESTT ENABLE;
    
    exec DBMS_STATS.GATHER_TABLE_STATS('TEST1','UNPARTTABLE' ,cascade => true,method_opt => 'for all columns size auto',degree => 16);
    

Notes on the operation:

  • With materialized views present, the MV logs and the MVs themselves must be dropped first, otherwise the rename fails
  • The table name in the get_ddl output must be edited to the new _P partitioned name, with the foreign keys, indexes and the like removed from it, otherwise the load performs terribly
  • When rebuilding indexes, any that were unique on the source must be created as global; the template defaults them to local
  • Invalid objects must be recompiled
  • Regather statistics once done

# Online Redefinition

This route is much simpler to operate; watch for the following (worked steps below):

  • Tables without a primary key must fall back to rowid (options_flag => DBMS_REDEFINITION.CONS_USE_ROWID)
  • To copy the data in parallel, enable parallelism at session level (alter session force parallel dml parallel 4; alter session force parallel query parallel 4;)
  • Bitmap indexes can only be created local, so the CTAS-generated script above remains useful as a reference
    -- the interim table; its name must match the one passed to START_REDEF_TABLE below
    CREATE TABLE unpartTable_P (
         name VARCHAR2(30),
         model VARCHAR2(30),
         STYLESHEET CLOB,
         num NUMBER)
         PARTITION BY RANGE (num)(PARTITION p1 VALUES LESS THAN (100),PARTITION p2 VALUES LESS THAN (MAXVALUE));
    
    select owner, object_type, status, count(*)
    from dba_objects
      where status='INVALID' 
    group by owner, object_type, status
    order by owner, object_type;
    
    
    EXEC Dbms_Redefinition.can_redef_table('TEST2', 'unpartTable');
    
    EXEC DBMS_REDEFINITION.START_REDEF_TABLE('TEST2', 'unpartTable', 'unpartTable_P');
    
    EXEC DBMS_REDEFINITION.SYNC_INTERIM_TABLE('TEST2', 'unpartTable', 'unpartTable_P');
    
    DECLARE
    num_errors PLS_INTEGER;
    BEGIN
    DBMS_REDEFINITION.COPY_TABLE_DEPENDENTS('TEST2', 'unpartTable', 'unpartTable_P',
    DBMS_REDEFINITION.CONS_ORIG_PARAMS, TRUE, TRUE, TRUE, TRUE, num_errors);
    END;
    /
    
    
    select object_name, base_table_name, ddl_txt from   DBA_REDEFINITION_ERRORS;
    
    exec DBMS_REDEFINITION.FINISH_REDEF_TABLE('TEST2', 'unpartTable', 'unpartTable_P');
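    
    -- After FINISH_REDEF_TABLE, two sanity checks (a sketch against standard
    -- dictionary views) to confirm the swap took effect:
    select partitioned from dba_tables where owner = 'TEST2' and table_name = 'UNPARTTABLE';
    select partition_name, high_value from dba_tab_partitions
     where table_owner = 'TEST2' and table_name = 'UNPARTTABLE';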
    

Limiting a User's DOP with Resource Manager

## Example script: capping one user's parallel degree with Resource Manager

User TEST is capped at parallel degree 2; all other users are unrestricted:
exec DBMS_RESOURCE_MANAGER.clear_pending_area;
exec dbms_resource_manager.create_pending_area;
exec dbms_resource_manager.create_consumer_group(consumer_group=>'TESTP1',comment=>'for test user');
exec DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING(DBMS_RESOURCE_MANAGER.ORACLE_USER,'TEST','TESTP1');
exec dbms_resource_manager.create_plan(plan=>'DOP_plan',parallel_degree_limit_mth=>'PARALLEL_DEGREE_LIMIT_ABSOLUTE',comment=>'dop plan');
exec DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(plan => 'DOP_plan',group_or_subplan => 'TESTP1',comment => 'limit dop only',parallel_degree_limit_p1 => 2);
exec DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(plan => 'DOP_plan',group_or_subplan => 'OTHER_GROUPS',comment => 'others');

exec dbms_resource_manager.validate_pending_area;
exec DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA;

alter system set resource_manager_plan = DOP_plan;

exec dbms_resource_manager_privs.grant_switch_consumer_group(grantee_name => 'TEST', consumer_group => 'TESTP1', grant_option => FALSE);

Verification SQL:

select RESOURCE_CONSUMER_GROUP from v$session where sid=(select distinct sid from v$mystat);

select /*+ parallel(8) */ t.* from test.t1 t, test.t1 ;

select sid,REQ_DEGREE,degree from v$px_session;
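
With the plan active, a parallel query from TEST should show RESOURCE_CONSUMER_GROUP = TESTP1 in v$session, and v$px_session should report REQ_DEGREE = 8 (requested) against DEGREE = 2 (actually granted) for its PX slaves.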