long长事务回滚的模拟与定位
大型事务的回滚 大型事务的回滚产生非常大的代价,不仅锁定需要的资源,并且消耗的CPU和IO,尤其是IO将极为密集。尤其在KILL大型事务之前检查事务究竟有多大可能是必要的,同时我们也需要知道回滚已经进行了多少程度。
一、模拟:
1.删除500多万的数据。
SQL> conn hr/hr;
Connected.
SQL> create table test as select * from dba_objects;
Table created.
SQL> insert into test select * from test;
87055 rows created.
SQL> insert into test select * from test;
174110 rows created.
SQL> insert into test select * from test;
348220 rows created.
SQL> insert into test select * from test;
696440 rows created.
SQL> insert into test select * from test;
1392880 rows created.
SQL> select count(*) from test;
COUNT(*)
----------
2785760
SQL> insert into test select * from test;
2785760 rows created.
SQL> commit;
Commit complete.
SQL> select count(*) from test;
COUNT(*)
----------
5571520
SQL> analyze table test compute statistics;
Table analyzed.
SQL>
SQL> select sid from v$mystat where rownum=1;
SID
----------
30
SQL>
--模拟删除,不commit
SQL> delete test;
5571520 rows deleted.
--另开窗口,查询kill掉会话
SQL> select sid,serial#,sql_id,event,blocking_session from v$session where sid=30;
SID SERIAL# SQL_ID EVENT BLOCKING_SESSION
---------- ---------- ------------- ---------------------------------------------------------------- ----------------
30 165 7qqwcq9td6akt log buffer space 11
SQL> select sql_text from v$sql where sql_id='7qqwcq9td6akt';
SQL_TEXT
----------------------------------------------------------------------
delete test
SQL> alter system kill session '30,165' immediate;
System altered.
--回到原来窗口验证:
SQL> select count(*) from test;
select count(*) from test
*
ERROR at line 1:
ORA-03135: connection lost contact
Process ID: 28346
Session ID: 30 Serial number: 165
二、定位:
查看回滚进度:
可以通过以下两个视图查看回滚的进度,通过单位时间内恢复的undo block来估算恢复时间:
1. 通过x$ktuxe
alter session set nls_date_format='yyyy-mm-dd hh34:mi:ss';
select ADDR,KTUXEUSN,KTUXESLT,KTUXESQN,KTUXESIZ,sysdate from x$ktuxe where KTUXECFL='DEAD' and KTUXESIZ >0;
SQL> alter session set nls_date_format='yyyy-mm-dd hh34:mi:ss';
Session altered.
SQL> select ADDR,KTUXEUSN,KTUXESLT,KTUXESQN,KTUXESIZ,sysdate from x$ktuxe where KTUXECFL='DEAD' and KTUXESIZ >0;
ADDR KTUXEUSN KTUXESLT KTUXESQN KTUXESIZ SYSDATE
---------------- ---------- ---------- ---------- ---------- -------------------
00007F170E8AAC20 5 11 7784 84438 2017-12-09 14:19:22
SQL>
可以通过KTUXESLT ,KTUXESQN这两个字段,然后用以下脚本回滚得出大概需要的时间:
set serveroutput on
declare
l_start number;
l_end number;
begin
select ktuxesiz
into l_start
from x$ktuxe
where KTUXEUSN = 5
and KTUXESLT = 11;
dbms_lock.sleep(60);
select ktuxesiz
into l_end
from x$ktuxe
where KTUXEUSN = 5
and KTUXESLT = 11;
dbms_output.put_line('time est Day:' ||
round(l_end / (l_start - l_end) / 60 / 24, 2));
end;
/
time est Day:.01
PL/SQL procedure successfully completed.
SQL> SQL>
2. 通过v$fast_start_trancsations 状态为recovering表示恢复中;
select USN,SLT,SEQ,STATE,UNDOBLOCKSDONE,UNDOBLOCKSTOTAL,CPUTIME,XID,sysdate from V$FAST_START_TRANSACTIONS;
SQL> select USN,SLT,SEQ,STATE,UNDOBLOCKSDONE,UNDOBLOCKSTOTAL,CPUTIME,XID,sysdate from V$FAST_START_TRANSACTIONS;
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME XID SYSDATE
---------- ---------- ---------- ---------------- -------------- --------------- ---------- ---------------- -------------------
5 11 7784 RECOVERING 123491 179829 120 05000B00681E0000 2017-12-09 14:20:25
通过如下视图观察回滚是串行还是并行回滚的,如下图应是并行恢复的,
V$FAST_START_SERVERS provides information about all the recovery slaves performing parallel transaction recovery.
通过xid字段与v$fast_start_trancsations关联。
select * from v$fast_start_servers where xid in (select XID from V$FAST_START_TRANSACTIONS);
SQL> select * from v$fast_start_servers where xid in (select XID from V$FAST_START_TRANSACTIONS);
STATE UNDOBLOCKSDONE PID XID
----------- -------------- ---------- ----------------
RECOVERING 133950 20 05000B00681E0000
3.查看回滚对象
通过dump undo block方式查看回滚的对象:
首先通过v$fast_start_trancsations的usn字段查询到使用的回滚段。
SQL> select * from v$rollname where usn=5;
USN NAME
---------- ------------------------------
5 _SYSSMU5_898567397$
dump这个undo block,因为dump的文件很大,在查询出对象的object_id后,kill掉这个dump会话。
alter system dump undo block "" XID ;
SQL> alter system dump undo block "_SYSSMU5_898567397$" xid 5 11 7784;
。。。。。。。。。
[oracle@wang trace]$ ls -lrt
total 635992
-rw-r----- 1 oracle oinstall 81 Apr 27 2017 DBdb_ora_9045.trm
-rw-r----- 1 oracle oinstall 59 Apr 27 2017 DBdb_mman_9065.trm
-rw-r----- 1 oracle oinstall 60 Apr 27 2017 DBdb_ora_9084.trm
-rw-r----- 1 oracle oinstall 111 Apr 27 2017 DBdb_ora_9099.trm
.......................................
-rw-r----- 1 oracle oinstall 12306 Dec 9 14:23 DBdb_ora_28113.trm
-rw-r----- 1 oracle oinstall 536798705 Dec 9 14:23 DBdb_ora_28113.trc
[oracle@wang trace]$
[oracle@wang trace]$
[oracle@wang trace]$
[oracle@wang trace]$ grep objn DBdb_ora_28113.trc | head -5
* Rec #0x11 slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)
* Rec #0x10 slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)
* Rec #0xf slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)
* Rec #0xe slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)
* Rec #0xd slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)
[oracle@wang trace]$
[oracle@wang trace]$
可以查询到objn为121192,对应的就是dba_objects的object_id,即mosongtao.rollback_test,正是前边测试的对象。再查询v$session_longops配合username,last_update_time,target,可以大概定位到执行sql_id。
注意:在查询到object_id后手工停掉dump undo block 动作
SQL> alter system dump undo block "_SYSSMU5_898567397$" xid 5 11 7784;
^C^C^C^C^C^C
alter system dump undo block "_SYSSMU5_898567397$" xid 5 11 7784
*
ERROR at line 1:
ORA-01013: user requested cancel of current operation
SQL> SQL> SQL> SQL>
4.查询sql
SQL> select SID,TARGET,SQL_ID,START_TIME,LAST_UPDATE_TIME from v$session_longops where target like '%TEST%' order by LAST_UPDATE_TIME desc;
SID TARGET SQL_ID START_TIME LAST_UPDATE_
---------- ---------------------------------------------------------------- ------------- ------------ ------------
30 HR.TEST 7qqwcq9td6akt 09-DEC-17 09-DEC-17
SQL> select sql_text from v$sql where sql_id='7qqwcq9td6akt';
SQL_TEXT
-------------------------------------------------------------------------
delete test
SQL>
名称栏目:long长事务回滚的模拟与定位
链接地址:http://scpingwu.com/article/gijihs.html
一、模拟:
1.删除500多万的数据。
SQL> conn hr/hr;
Connected.
SQL> create table test as select * from dba_objects;
Table created.
SQL> insert into test select * from test;
87055 rows created.
SQL> insert into test select * from test;
174110 rows created.
SQL> insert into test select * from test;
348220 rows created.
SQL> insert into test select * from test;
696440 rows created.
SQL> insert into test select * from test;
1392880 rows created.
SQL> select count(*) from test;
COUNT(*)
----------
2785760
SQL> insert into test select * from test;
2785760 rows created.
SQL> commit;
Commit complete.
SQL> select count(*) from test;
COUNT(*)
----------
5571520
SQL> analyze table test compute statistics;
Table analyzed.
SQL>
SQL> select sid from v$mystat where rownum=1;
SID
----------
30
SQL>
--模拟删除,不commit
SQL> delete test;
5571520 rows deleted.
--另开窗口,查询kill掉会话
SQL> select sid,serial#,sql_id,event,blocking_session from v$session where sid=30;
SID SERIAL# SQL_ID EVENT BLOCKING_SESSION
---------- ---------- ------------- ---------------------------------------------------------------- ----------------
30 165 7qqwcq9td6akt log buffer space 11
SQL> select sql_text from v$sql where sql_id='7qqwcq9td6akt';
SQL_TEXT
----------------------------------------------------------------------
delete test
SQL> alter system kill session '30,165' immediate;
System altered.
--回到原来窗口验证:
SQL> select count(*) from test;
select count(*) from test
*
ERROR at line 1:
ORA-03135: connection lost contact
Process ID: 28346
Session ID: 30 Serial number: 165
二、定位:
查看回滚进度:
可以通过以下两个视图查看回滚的进度,通过单位时间内恢复的undo block来估算恢复时间:
1. 通过x$ktuxe
alter session set nls_date_format='yyyy-mm-dd hh34:mi:ss';
select ADDR,KTUXEUSN,KTUXESLT,KTUXESQN,KTUXESIZ,sysdate from x$ktuxe where KTUXECFL='DEAD' and KTUXESIZ >0;
SQL> alter session set nls_date_format='yyyy-mm-dd hh34:mi:ss';
Session altered.
SQL> select ADDR,KTUXEUSN,KTUXESLT,KTUXESQN,KTUXESIZ,sysdate from x$ktuxe where KTUXECFL='DEAD' and KTUXESIZ >0;
ADDR KTUXEUSN KTUXESLT KTUXESQN KTUXESIZ SYSDATE
---------------- ---------- ---------- ---------- ---------- -------------------
00007F170E8AAC20 5 11 7784 84438 2017-12-09 14:19:22
SQL>
可以通过KTUXESLT ,KTUXESQN这两个字段,然后用以下脚本回滚得出大概需要的时间:
set serveroutput on
declare
l_start number;
l_end number;
begin
select ktuxesiz
into l_start
from x$ktuxe
where KTUXEUSN = 5
and KTUXESLT = 11;
dbms_lock.sleep(60);
select ktuxesiz
into l_end
from x$ktuxe
where KTUXEUSN = 5
and KTUXESLT = 11;
dbms_output.put_line('time est Day:' ||
round(l_end / (l_start - l_end) / 60 / 24, 2));
end;
/
time est Day:.01
PL/SQL procedure successfully completed.
SQL> SQL>
2. 通过v$fast_start_trancsations 状态为recovering表示恢复中;
select USN,SLT,SEQ,STATE,UNDOBLOCKSDONE,UNDOBLOCKSTOTAL,CPUTIME,XID,sysdate from V$FAST_START_TRANSACTIONS;
SQL> select USN,SLT,SEQ,STATE,UNDOBLOCKSDONE,UNDOBLOCKSTOTAL,CPUTIME,XID,sysdate from V$FAST_START_TRANSACTIONS;
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME XID SYSDATE
---------- ---------- ---------- ---------------- -------------- --------------- ---------- ---------------- -------------------
5 11 7784 RECOVERING 123491 179829 120 05000B00681E0000 2017-12-09 14:20:25
通过如下视图观察回滚是串行还是并行回滚的,如下图应是并行恢复的,
V$FAST_START_SERVERS provides information about all the recovery slaves performing parallel transaction recovery.
通过xid字段与v$fast_start_trancsations关联。
select * from v$fast_start_servers where xid in (select XID from V$FAST_START_TRANSACTIONS);
SQL> select * from v$fast_start_servers where xid in (select XID from V$FAST_START_TRANSACTIONS);
STATE UNDOBLOCKSDONE PID XID
----------- -------------- ---------- ----------------
RECOVERING 133950 20 05000B00681E0000
3.查看回滚对象
通过dump undo block方式查看回滚的对象:
首先通过v$fast_start_trancsations的usn字段查询到使用的回滚段。
SQL> select * from v$rollname where usn=5;
USN NAME
---------- ------------------------------
5 _SYSSMU5_898567397$
dump这个undo block,因为dump的文件很大,在查询出对象的object_id后,kill掉这个dump会话。
alter system dump undo block "" XID ;
SQL> alter system dump undo block "_SYSSMU5_898567397$" xid 5 11 7784;
。。。。。。。。。
[oracle@wang trace]$ ls -lrt
total 635992
-rw-r----- 1 oracle oinstall 81 Apr 27 2017 DBdb_ora_9045.trm
-rw-r----- 1 oracle oinstall 59 Apr 27 2017 DBdb_mman_9065.trm
-rw-r----- 1 oracle oinstall 60 Apr 27 2017 DBdb_ora_9084.trm
-rw-r----- 1 oracle oinstall 111 Apr 27 2017 DBdb_ora_9099.trm
.......................................
-rw-r----- 1 oracle oinstall 12306 Dec 9 14:23 DBdb_ora_28113.trm
-rw-r----- 1 oracle oinstall 536798705 Dec 9 14:23 DBdb_ora_28113.trc
[oracle@wang trace]$
[oracle@wang trace]$
[oracle@wang trace]$
[oracle@wang trace]$ grep objn DBdb_ora_28113.trc | head -5
* Rec #0x11 slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)
* Rec #0x10 slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)
* Rec #0xf slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)
* Rec #0xe slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)
* Rec #0xd slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)
[oracle@wang trace]$
[oracle@wang trace]$
可以查询到objn为121192,对应的就是dba_objects的object_id,即mosongtao.rollback_test,正是前边测试的对象。再查询v$session_longops配合username,last_update_time,target,可以大概定位到执行sql_id。
注意:在查询到object_id后手工停掉dump undo block 动作
SQL> alter system dump undo block "_SYSSMU5_898567397$" xid 5 11 7784;
^C^C^C^C^C^C
alter system dump undo block "_SYSSMU5_898567397$" xid 5 11 7784
*
ERROR at line 1:
ORA-01013: user requested cancel of current operation
SQL> SQL> SQL> SQL>
4.查询sql
SQL> select SID,TARGET,SQL_ID,START_TIME,LAST_UPDATE_TIME from v$session_longops where target like '%TEST%' order by LAST_UPDATE_TIME desc;
SID TARGET SQL_ID START_TIME LAST_UPDATE_
---------- ---------------------------------------------------------------- ------------- ------------ ------------
30 HR.TEST 7qqwcq9td6akt 09-DEC-17 09-DEC-17
SQL> select sql_text from v$sql where sql_id='7qqwcq9td6akt';
SQL_TEXT
-------------------------------------------------------------------------
delete test
SQL>
名称栏目:long长事务回滚的模拟与定位
链接地址:http://scpingwu.com/article/gijihs.html