ORA-08103错误的模拟和诊断

1. 概述

ORA-8103错误发生的概率较高,oerr的错误解释为:
$ oerr ora 8103
08103, 00000, “object no longer exists”
// *Cause: The object has been deleted by another user since the operation
// began, or a prior incomplete recovery restored the database to
// a point in time during the deletion of the object.
// *Action: Delete the object if this is the result of an incomplete
// recovery.

在 Master Note for Handling Oracle Database Corruption Issues (文档 ID 1088018.1)中对ORA-8103错误的描述如下:

The object has been deleted by another user since the operation began.
If the error is reproducible, following may be the reasons:-
a.) The header block has an invalid block type.
b.) The data_object_id (seg/obj) stored in the block is different than the data_object_id stored in the segment header.
See dba_objects.data_object_id and compare it to the decimal value stored in the block (field seg/obj).

文档解释为:ORA-8103错误如果只发生1次,那表明该会话操作前,对象被其它会话删除了。
如果ORA-8103错误重复发生,可能有两个原因:
a.) 段头是坏块
b.) 数据块上存储的data_object_id和段头上存储的data_object_id不一致。
针对这两个原因,以下做模拟测试

2. 环境准备

SQL> create table t1 tablespace users as select * from all_objects;
Table created.

SQL> alter system checkpoint;
System altered.

SQL> select data_object_id from dba_objects where object_name = 'T1' and owner = 'SYS';
DATA_OBJECT_ID
--------------
         87431

SQL> select segment_type, HEADER_FILE, HEADER_BLOCK from dba_segments where owner = 'SYS' and segment_name = 'T1';

SEGMENT_TYPE       HEADER_FILE HEADER_BLOCK
------------------ ----------- ------------
TABLE                        4          170

SQL> set pagesize 999
SQL> select FILE_ID, block_id, blocks from dba_extents where owner = 'SYS' and segment_name = 'T1';

   FILE_ID   BLOCK_ID     BLOCKS
---------- ---------- ----------
         4        168          8
         4        176          8
         4        184          8
         4        192          8
         4        200          8
         4        208          8
         4        216          8
         4        224          8
         4        232          8
         4        240          8
         4        248          8
         4        256          8
         4        264          8
         4        272          8
         4        280          8
         4        288          8
         4        384        128
         4        512        128
         4        640        128
         4        768        128
         4        896        128
         4       1024        128
         4       1152        128
         4       1280        128
         4       1408        128

 25 rows selected.

SQL> column name format a40
SQL> select name, bytes from v$datafile where file# = 4;

NAME                                          BYTES
---------------------------------------- ----------
/data/oracle/oradata/TEST1/users01.dbf     13107200

SQL> show parameter block_size

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_block_size                        integer     8192

编译BBED : 略,参考相关文档

3. 模拟段头块损坏

BBED在10g和11gR2上不支持map段头结构

3.1 测试环境

SQL> select data_object_id from dba_objects where object_name = 'T1' and owner = 'SYS';

DATA_OBJECT_ID
--------------
         51789

SQL> select segment_type, HEADER_FILE, HEADER_BLOCK from dba_segments where owner = 'SYS' and segment_name = 'T1';

SEGMENT_TYPE       HEADER_FILE HEADER_BLOCK
------------------ ----------- ------------
TABLE                        4           59

3.2 修改段头为坏块

$ bbed filename=/u01/oradata/test/users01.dbf blocksize=8192 password=blockedit

BBED> set block 59
BLOCK# 59

BBED> dump
File: /u01/oradata/test/users01.dbf (0)
Block: 59 Offsets: 0 to 511 Dba:0x00000000
------------------------------------------------------------------------
23a20000 3b000001 9c080600 00000104 46620000 00000000 00000000 00000000
00000000 15000000 00030000 9c0a0000 14000000 3d000000 80000000 46030001
00000000 14000000 00000000 bd020000 00000000 00000000 00000000 14000000

BBED> set mode edit
MODE Edit

BBED> m /x 24a2 offset 0
File: /u01/oradata/test/users01.dbf (0)
Block: 59 Offsets: 0 to 511 Dba:0x00000000
------------------------------------------------------------------------
24a20000 3b000001 9c080600 00000104 46620000 00000000 00000000 00000000
00000000 15000000 00030000 9c0a0000 14000000 3d000000 80000000 46030001
00000000 14000000 00000000 bd020000 00000000 00000000 00000000 14000000

BBED> sum apply
Check value for File 0, Block 59:
current = 0x6241, required = 0x6241

BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oradata/test/users01.dbf
BLOCK = 59

Block 59 is corrupt
Corrupt block relative dba: 0x0100003b (file 0, block 59)
Fractured block found during verification
Data in bad block:
type: 36 format: 2 rdba: 0x0100003b
last change scn: 0x0000.0006089c seq: 0x1 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x089c2301
check value in block header: 0x6241
computed block checksum: 0x0

DBVERIFY - Verification complete

Total Blocks Examined : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing (Index): 0
Total Blocks Empty : 0
Total Blocks Marked Corrupt : 1
Total Blocks Influx : 2

SQL> alter system flush buffer_cache;
System altered.

SQL> select count(1) from t1;
select count(1) from t1
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 59)
ORA-01110: data file 4: '/u01/oradata/test/users01.dbf'

如果段头为坏块,则读取直接报坏块。

3.3 修改段头上的data_object_id

t1的data_object_id为51789,转成十六进制为:ca4d

BBED> dump
 File: /u01/oradata/test/users01.dbf (0)
 Block: 59               Offsets:    0 to  511           Dba:0x00000000
------------------------------------------------------------------------
 23a20000 3b000001 9c080600 00000104 46620000 00000000 00000000 00000000 
 00000000 15000000 00030000 9c0a0000 14000000 3d000000 80000000 46030001 
 00000000 14000000 00000000 bd020000 00000000 00000000 00000000 14000000 
 3d000000 80000000 46030001 00000000 14000000 00000000 bd020000 09030001 
 09030001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 01000000 00200000 00000000 34140000 
 00000000 3a000001 01000000 0a030001 3a000001 00000000 00000000 00000000 
 00000000 00000000 15000000 00000000 4dca0000 00000010 39000001 08000000 
                                     |-> offset=272
 41000001 08000000 49000001 08000000 51000001 08000000 59000001 08000000 
 61000001 08000000 69000001 08000000 71000001 08000000 79000001 08000000 
 81000001 08000000 89000001 08000000 91000001 08000000 99000001 08000000 
 a1000001 08000000 a9000001 08000000 b1000001 08000000 09010001 80000000 
 89010001 80000000 09020001 80000000 89020001 80000000 09030001 80000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 

 <32 bytes per line>

在dump里搜索,offset=272的值是 4dca,这个值为段头上的data_object_id

BBED> m /x 4dcb offset 272
File: /u01/oradata/test/users01.dbf (0)
Block: 59 Offsets: 272 to 783 Dba:0x00000000
------------------------------------------------------------------------
4dcb0000 00000010 39000001 08000000 41000001 08000000 49000001 08000000
51000001 08000000 59000001 08000000 61000001 08000000 69000001 08000000
71000001 08000000 79000001 08000000 81000001 08000000 89000001 08000000
91000001 08000000 99000001 08000000 a1000001 08000000 a9000001 08000000

BBED> dump offset 0
 File: /u01/oradata/test/users01.dbf (0)
 Block: 59               Offsets:    0 to  511           Dba:0x00000000
------------------------------------------------------------------------
 23a20000 3b000001 9c080600 00000104 46620000 00000000 00000000 00000000
 00000000 15000000 00030000 9c0a0000 14000000 3d000000 80000000 46030001
 00000000 14000000 00000000 bd020000 00000000 00000000 00000000 14000000
 3d000000 80000000 46030001 00000000 14000000 00000000 bd020000 09030001
 09030001 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 01000000 00200000 00000000 34140000
 00000000 3a000001 01000000 0a030001 3a000001 00000000 00000000 00000000
 00000000 00000000 15000000 00000000 4dcb0000 00000010 39000001 08000000
                                     |->已修改
 41000001 08000000 49000001 08000000 51000001 08000000 59000001 08000000
 61000001 08000000 69000001 08000000 71000001 08000000 79000001 08000000
 81000001 08000000 89000001 08000000 91000001 08000000 99000001 08000000
 a1000001 08000000 a9000001 08000000 b1000001 08000000 09010001 80000000
 89010001 80000000 09020001 80000000 89020001 80000000 09030001 80000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 

BBED> sum apply
Check value for File 0, Block 59:
current = 0x6346, required = 0x6346

BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oradata/test/users01.dbf
BLOCK = 59

DBVERIFY - Verification complete

Total Blocks Examined : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing (Index): 0
Total Blocks Empty : 0
Total Blocks Marked Corrupt : 0
Total Blocks Influx : 0

SQL> alter system flush buffer_cache;

System altered.

SQL> select count(1) from t1;
select count(1) from t1
*
ERROR at line 1:
ORA-08103: object no longer exists

总结:段头上的data_object_id出现问题,导致ORA-08103错误发生

4. 模拟数据块data_object_id与段头不一致

4.1 相关信息

SQL> set linesize 180
SQL> select owner, object_name,
  2         dbms_rowid.rowid_relative_fno(rowid) as fno,
  3         dbms_rowid.rowid_block_number(rowid) as bno,
  4         dbms_rowid.rowid_row_number(rowid) as rno,
  5         to_char(dbms_utility.make_data_block_address(dbms_rowid.rowid_relative_fno(rowid),
  6                 dbms_rowid.rowid_block_number(rowid)), 'xxxxxxxxxx') as dba
  7   from t1 where object_id = 7840;

OWNER   OBJECT_NAME             FNO BNO RNO DBA
------- ----------------------- --- --- --- -----------
SYS     V_$DIAG_EM_TARGET_INFO    4 275  29     1000113

BBED> set block 275
        BLOCK#          275

BBED> map
 File: /data/oracle/oradata/TEST1/users01.dbf (0)
 Block: 275                                   Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 96 bytes                     @20
 struct kdbh, 14 bytes                      @124
 struct kdbt[1], 4 bytes                    @138
 sb2 kdbr[74]                               @142
 ub1 freespace[891]                         @290
 ub1 rowdata[7007]                          @1181
 ub4 tailchk                                @8188    

BBED> p ktbbh
struct ktbbh, 96 bytes                      @20
   ub1 ktbbhtyp                             @20       0x01 (KDDBTDATA)
   union ktbbhsid, 4 bytes                  @24                                     --> 这值存储data_object_id
      ub4 ktbbhsg1                          @24       0x00015587                    --> 0x00015587转成十进制为:87431
      ub4 ktbbhod1                          @24       0x00015587
   struct ktbbhcsc, 8 bytes                 @28
      ub4 kscnbas                           @28       0x0012c1ea
......

4.2 BBED模拟data_object_id错误

把这个值改为:87432,十六进制为:0x00015588

BBED> set mode edit
        MODE            Edit

BBED> m /x 8855 offset 24
BBED> m /x 0100 offset 26
BBED> p ktbbh.ktbbhsid
union ktbbhsid, 4 bytes                     @24
   ub4 ktbbhsg1                             @24       0x00015588
   ub4 ktbbhod1                             @24       0x00015588

BBED> sum apply
Check value for File 0, Block 275:
current = 0xa57b, required = 0xa57b

BBED> verify
DBVERIFY - Verification starting
FILE = /data/oracle/oradata/TEST1/users01.dbf
BLOCK = 275

DBVERIFY - Verification complete

Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED

BBED改完后,先把buffer_cache刷掉后,再测试

SQL> alter system flush buffer_cache;
System altered.

SQL> select * from t1 where object_id = 7840;
select * from t1 where object_id = 7840
              *
ERROR at line 1:
ORA-08103: object no longer exists

SQL> desc t1
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 OWNER                                     NOT NULL VARCHAR2(30)
 OBJECT_NAME                               NOT NULL VARCHAR2(30)
 SUBOBJECT_NAME                                     VARCHAR2(30)
 OBJECT_ID                                 NOT NULL NUMBER
 DATA_OBJECT_ID                                     NUMBER
 OBJECT_TYPE                                        VARCHAR2(19)
 CREATED                                   NOT NULL DATE
 LAST_DDL_TIME                             NOT NULL DATE
 TIMESTAMP                                          VARCHAR2(19)
 STATUS                                             VARCHAR2(7)
 TEMPORARY                                          VARCHAR2(1)
 GENERATED                                          VARCHAR2(1)
 SECONDARY                                          VARCHAR2(1)
 NAMESPACE                                 NOT NULL NUMBER
 EDITION_NAME                                       VARCHAR2(30)

4.3 诊断

相关诊断的trace event

alter system set events '8103 trace name errorstack level 3';
alter system set events '10236 trace name context forever, level 1';            # ORA-10236: library cache manager
alter system set events '10200 trace name context forever, level 1';            # ORA-10200: consistent read buffer status
alter session set tracefile_identifier='ORA8103';

SQL> alter system set events '8103 trace name errorstack level 3';
System altered.
SQL> alter system set events '10236 trace name context forever, level 1';
System altered.
SQL> alter system set events '10200 trace name context forever, level 1';
System altered.
SQL> alter session set tracefile_identifier='ORA8103';
Session altered.
SQL> select * from t1 where object_id = 7840;
select * from t1 where object_id = 7840
              *
ERROR at line 1:
ORA-08103: object no longer exists

在dump目录里找名字为ORA8103的trc文件,当前为:TEST1_ora_24728_ORA8103.trc

*** SESSION ID:(194.2915) 2016-11-16 15:18:03.689
OBJD MISMATCH typ=6, seg.obj=87431, diskobj=87432, dsflg=100001, dsobj=87431, tid=87431, cls=1
KTRVAC: obj=87432, seg. obj=87431, seg rdba=10000aa

*** 2016-11-16 15:18:03.690
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=3, mask=0x0)
----- Error Stack Dump -----
ORA-08103: object no longer exists
----- Current SQL Statement for this session (sql_id=9cwg6xafw1a52) -----
select * from t1 where object_id = 7840

从以上OBJD MISMATCH可以看出:

diskobj=87432(obj=87432)    # 块上的data_object_id=87432
seg. obj=87431              # 段头上的data_object_id=87431
seg rdba=10000aa            # 段头位置

从数据字典里可以判断,段头上的data_object_id正确,错的是数据块上的data_object_id

SQL> select data_object_id from dba_objects where object_name = 'T1' and owner = 'SYS';
DATA_OBJECT_ID
--------------
         87431

很不幸,在这个场景下,trc里没有跟踪到坏块,里面跟踪的块都是正常块。如果找到坏块的地址,只需要用BBED修改即可。
找不到坏块地址只能采用另外的办法处理

后来测试,通过Event 10231 + 10046,设置db_file_multiblock_read_count=1,可以看到扫描的坏块

SQL> alter system set db_file_multiblock_read_count=1 scope=both;

SQL> ALTER SESSION SET EVENTS '10231 TRACE NAME CONTEXT FOREVER, LEVEL 10';
Session altered.

SQL> ALTER SESSION SET events '10046 trace name context forever, level 12';
Session altered.

SQL> select * from t1 where object_id = 7840;
select * from t1 where object_id = 7840
*
ERROR at line 1:
ORA-08103: object no longer exists

相关跟踪文件如下:

......
*** 2016-11-16 16:31:41.143
WAIT #139971881361576: nam='SQL*Net message from client' ela= 7564768 driver id=1650815232 #bytes=1 p3=0 obj#=87431 tim=1479285101143555
CLOSE #139971881361576:c=0,e=21,dep=0,type=0,tim=1479285101143768
=====================
PARSING IN CURSOR #139971881361576 len=39 dep=0 uid=0 oct=3 lid=0 tim=1479285101143992 hv=2646648994 ad='2deb403f8' sqlid='9cwg6xafw1a52'
select * from t1 where object_id = 7840
END OF STMT
PARSE #139971881361576:c=0,e=125,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3617692013,tim=1479285101143989
EXEC #139971881361576:c=999,e=88,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3617692013,tim=1479285101144156
WAIT #139971881361576: nam='SQL*Net message to client' ela= 4 driver id=1650815232 #bytes=1 p3=0 obj#=87431 tim=1479285101144219
WAIT #139971881361576: nam='db file sequential read' ela= 50 file#=4 block#=275 blocks=1 obj#=87431 tim=1479285101145788

*** SESSION ID:(194.2925) 2016-11-16 16:31:41.145
OBJD MISMATCH typ=6, seg.obj=87431, diskobj=87432, dsflg=100101, dsobj=87431, tid=87431, cls=1
FETCH #139971881361576:c=1000,e=1649,p=1,cr=102,cu=0,mis=0,r=0,dep=0,og=1,plh=3617692013,tim=1479285101145914
STAT #139971881361576 id=1 cnt=0 pid=0 pos=1 obj=87431 op='TABLE ACCESS FULL T1 (cr=0 pr=0 pw=0 time=6 us cost=337 size=2212 card=14)'
WAIT #139971881361576: nam='SQL*Net break/reset to client' ela= 4 driver id=1650815232 break?=1 p3=0 obj#=87431 tim=1479285101146028
WAIT #139971881361576: nam='SQL*Net break/reset to client' ela= 133 driver id=1650815232 break?=0 p3=0 obj#=87431 tim=1479285101146178

OBJD MISMATCH报错前一个nam=’db file sequential read’,读取的数据块是file#=4 block#=275,这个块就是逻辑损坏的块。

4.4 通过rowid保留数据

参考:Extract rows from a CORRUPT table creating ROWID from DBA_EXTENTS (文档 ID 422547.1)

SQL> create table t2 tablespace users as select * from t1 where 1 = 2 ;

SQL> desc t1
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 OWNER                                     NOT NULL VARCHAR2(30)
 OBJECT_NAME                               NOT NULL VARCHAR2(30)
 SUBOBJECT_NAME                                     VARCHAR2(30)
 OBJECT_ID                                 NOT NULL NUMBER
 DATA_OBJECT_ID                                     NUMBER
 OBJECT_TYPE                                        VARCHAR2(19)
 CREATED                                   NOT NULL DATE
 LAST_DDL_TIME                             NOT NULL DATE
 TIMESTAMP                                          VARCHAR2(19)
 STATUS                                             VARCHAR2(7)
 TEMPORARY                                          VARCHAR2(1)
 GENERATED                                          VARCHAR2(1)
 SECONDARY                                          VARCHAR2(1)
 NAMESPACE                                 NOT NULL NUMBER
 EDITION_NAME                                       VARCHAR2(30)

set serveroutput on
set concat off

DECLARE
  nrows        number;
  rid          rowid;
  dobj         number;
  ROWSPERBLOCK number;
BEGIN
  ROWSPERBLOCK := 2000;			-- need modify
  nrows        := 0;

  select data_object_id
    into dobj
    from dba_objects
   where owner = '&&table_owner'
     and object_name = '&&table_name';

  for i in (select relative_fno, block_id, block_id + blocks - 1 totblocks
              from dba_extents
             where owner = '&&table_owner'
               and segment_name = '&&table_name'
             order by extent_id) loop
    for br in i.block_id .. i.totblocks loop
      for j in 1 .. ROWSPERBLOCK loop
        begin
          rid := dbms_rowid.ROWID_CREATE(1, dobj, i.relative_fno, br, j - 1);

          insert into t2			-- new table
            (OWNER,OBJECT_NAME,SUBOBJECT_NAME,OBJECT_ID,DATA_OBJECT_ID,OBJECT_TYPE,CREATED,LAST_DDL_TIME,TIMESTAMP,
STATUS,TEMPORARY,GENERATED,SECONDARY,NAMESPACE,EDITION_NAME)
            select /*+ ROWID(A) */
             OWNER,OBJECT_NAME,SUBOBJECT_NAME,OBJECT_ID,DATA_OBJECT_ID,OBJECT_TYPE,CREATED,LAST_DDL_TIME,TIMESTAMP,
STATUS,TEMPORARY,GENERATED,SECONDARY,NAMESPACE,EDITION_NAME
              from "&&table_owner"."&&table_name" A
             where rowid = rid;

          if sql%rowcount = 1 then
            nrows := nrows + 1;
          end if;
          if (mod(nrows, 10000) = 0) then
            commit;
          end if;

        exception
          when others then
            null;
        end;
      end loop;
    end loop;
  end loop;
  COMMIT;
  dbms_output.put_line('Total rows: ' || to_char(nrows));
END;
/

SQL> select count(1) from t2;

  COUNT(1)
----------
     84334

SQL> select count(1) from all_objects;

  COUNT(1)
----------
     84409

BBED> set block 275
        BLOCK#          275

BBED> map
 File: /data/oracle/oradata/TEST1/users01.dbf (0)
 Block: 275                                   Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 96 bytes                     @20
 struct kdbh, 14 bytes                      @124
 struct kdbt[1], 4 bytes                    @138
 sb2 kdbr[74]                               @142                        --> 这个数据块上有74行记录
 ub1 freespace[891]                         @290
 ub1 rowdata[7007]                          @1181
 ub4 tailchk                                @8188

经过测试,以上方法丢失了坏块上的数据。

关于紫砂壶

感悟技术人生
此条目发表在ORA-分类目录,贴了标签。将固定链接加入收藏夹。

5 则回应给 ORA-08103错误的模拟和诊断

  1. 匿名说:

    感谢分享…….

  2. 匿名说:

    感谢你的分享………

  3. 匿名说:

    透彻

  4. 越yan说:

    很多这个错误,为应用程序问题,比如普通表当临时表用,用完truncate,另一个session也这样搞,搞完后,data objid必然变化,并发度越高,经常这种错误越明显。

  5. 紫砂壶说:

    应用错误好解决,麻烦的是非应用错误,已经碰到过很多次了

评论已关闭。