修改props$的global_db_name为空后的恢复过程

1. 背景

今天惊闻某客户将props$表的global_db_name置为空了,我这里测试一下碰到这个情况如何恢复

2. 模拟故障过程

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
PL/SQL Release 11.2.0.4.0 - Production
CORE    11.2.0.4.0      Production
TNS for Linux: Version 11.2.0.4.0 - Production
NLSRTL Version 11.2.0.4.0 - Production

SQL> show parameter db_name

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_name                              string      t12c

SQL> update props$ set value$ = null where name = 'GLOBAL_DB_NAME';

1 row updated.

SQL> commit;

Commit complete.

SQL> select * from props$ where name = 'GLOBAL_DB_NAME';

NAME             VALUE$        COMMENT$ 
---------------- ------------- ------------------------
GLOBAL_DB_NAME                 Global database name

SQL> shutdown abort
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area 1653518336 bytes
Fixed Size                  2253784 bytes
Variable Size             956304424 bytes
Database Buffers          687865856 bytes
Redo Buffers                7094272 bytes
Database mounted.
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [18062], [], [], [], [], [], [], [], [], [], [], []
Process ID: 24186
Session ID: 66 Serial number: 3

好了,故障已经模拟,启动数据库报18062错误。alert日志报错如下:

Mon May 22 13:16:08 2017
QMNC started with pid=41, OS id=24231 
Errors in file /data/oracle/diag/rdbms/t12c/t12c/trace/t12c_ora_24186.trc  (incident=15761):
ORA-00600: internal error code, arguments: [18062], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /data/oracle/diag/rdbms/t12c/t12c/incident/incdir_15761/t12c_ora_24186_i15761.trc
Mon May 22 13:16:11 2017
Thread 1 advanced to log sequence 311 (LGWR switch)
  Current log# 2 seq# 311 mem# 0: /data/oracle/oradata/t12c/redo02.log
Archived Log entry 602 added for thread 1 sequence 310 ID 0xb275da3c dest 1:
******************************************************************
LGWR: Setting 'active' archival for destination LOG_ARCHIVE_DEST_2
******************************************************************
Mon May 22 13:16:13 2017
db_recovery_file_dest_size of 4182 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Mon May 22 13:16:13 2017
Dumping diagnostic data in directory=[cdmp_20170522131613], requested by (instance=1, osid=24186), summary=[incident=15761].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /data/oracle/diag/rdbms/t12c/t12c/trace/t12c_ora_24186.trc:
ORA-00600: internal error code, arguments: [18062], [], [], [], [], [], [], [], [], [], [], []
Errors in file /data/oracle/diag/rdbms/t12c/t12c/trace/t12c_ora_24186.trc:
ORA-00600: internal error code, arguments: [18062], [], [], [], [], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 24186): terminating the instance due to error 600
Mon May 22 13:16:14 2017
Instance terminated by USER, pid = 24186
ORA-1092 signalled during: ALTER DATABASE OPEN...
opiodr aborting process unknown ospid (24186) as a result of ORA-1092
Mon May 22 13:16:14 2017
ORA-1092 : opitsk aborting process

3. 恢复过程

$ sqlplus "/ as sysdba"

SQL*Plus: Release 11.2.0.4.0 Production on Mon May 22 13:17:00 2017

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup mount
ORACLE instance started.

Total System Global Area 1653518336 bytes
Fixed Size                  2253784 bytes
Variable Size             956304424 bytes
Database Buffers          687865856 bytes
Redo Buffers                7094272 bytes
Database mounted.

把数据库启动到mount状态,当前启动数据库的sqlplus会话不退出,检查这个sqlplus的shadow进程pid

$ ps -ef | grep LOCAL | grep -v grep
oracle   24389 24253  0 13:17 ?        00:00:00 oraclet12c (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

利用GBD跟踪shadow进程

$ gdb $ORACLE_HOME/bin/oracle 24389
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-42.el5)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /data/oracle/product/11.2.0/db_1/bin/oracle...(no debugging symbols found)...done.
Attaching to program: /data/oracle/product/11.2.0/db_1/bin/oracle, process 24389
Reading symbols from /data/oracle/product/11.2.0/db_1/lib/libodm11.so...(no debugging symbols found)...done.
Loaded symbols for /data/oracle/product/11.2.0/db_1/lib/libodm11.so
Reading symbols from /data/oracle/product/11.2.0/db_1/lib/libcell11.so...(no debugging symbols found)...done.
Loaded symbols for /data/oracle/product/11.2.0/db_1/lib/libcell11.so
Reading symbols from /data/oracle/product/11.2.0/db_1/lib/libskgxp11.so...(no debugging symbols found)...done.
Loaded symbols for /data/oracle/product/11.2.0/db_1/lib/libskgxp11.so
Reading symbols from /lib64/librt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/librt.so.1
Reading symbols from /data/oracle/product/11.2.0/db_1/lib/libnnz11.so...(no debugging symbols found)...done.
Loaded symbols for /data/oracle/product/11.2.0/db_1/lib/libnnz11.so
Reading symbols from /data/oracle/product/11.2.0/db_1/lib/libclsra11.so...(no debugging symbols found)...done.
Loaded symbols for /data/oracle/product/11.2.0/db_1/lib/libclsra11.so
Reading symbols from /data/oracle/product/11.2.0/db_1/lib/libdbcfg11.so...(no debugging symbols found)...done.
Loaded symbols for /data/oracle/product/11.2.0/db_1/lib/libdbcfg11.so
Reading symbols from /data/oracle/product/11.2.0/db_1/lib/libhasgen11.so...(no debugging symbols found)...done.
Loaded symbols for /data/oracle/product/11.2.0/db_1/lib/libhasgen11.so
Reading symbols from /data/oracle/product/11.2.0/db_1/lib/libskgxn2.so...(no debugging symbols found)...done.
Loaded symbols for /data/oracle/product/11.2.0/db_1/lib/libskgxn2.so
Reading symbols from /data/oracle/product/11.2.0/db_1/lib/libocr11.so...(no debugging symbols found)...done.
Loaded symbols for /data/oracle/product/11.2.0/db_1/lib/libocr11.so
Reading symbols from /data/oracle/product/11.2.0/db_1/lib/libocrb11.so...(no debugging symbols found)...done.
Loaded symbols for /data/oracle/product/11.2.0/db_1/lib/libocrb11.so
Reading symbols from /data/oracle/product/11.2.0/db_1/lib/libocrutl11.so...(no debugging symbols found)...done.
Loaded symbols for /data/oracle/product/11.2.0/db_1/lib/libocrutl11.so
Reading symbols from /usr/lib64/libaio.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libaio.so.1
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libnsl.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnsl.so.1
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /usr/lib64/libnuma.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libnuma.so.1
Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_files.so.2
Reading symbols from /data/oracle/product/11.2.0/db_1/lib/libnque11.so...(no debugging symbols found)...done.
Loaded symbols for /data/oracle/product/11.2.0/db_1/lib/libnque11.so

warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7fff11b0d000
0x000000364880d9b0 in __read_nocancel () from /lib64/libpthread.so.0
(gdb) 
``
对kokiasg函数设置断点

(gdb) break kokiasg
Breakpoint 1 at 0x1503790

继续gdb的debugging

(gdb) continue

这时侯,GDB处理执行等待状态,回到刚刚启动数据库到mount状态的sqlplus会话,执行

SQL> alter database open;

sqlplus会话处于执行等待状态,未返回,查看gdb会话,已经返回,可以输入

(gdb) continue
Continuing.

Breakpoint 1, 0x0000000001503790 in kokiasg ()
(gdb) 

这时侯另起一个sqlplus会话,登录数据库

$ sqlplus "/ as sysdba"

SQL*Plus: Release 11.2.0.4.0 Production on Mon May 22 13:18:36 2017

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> select status from v$instance;

STATUS
------------
OPEN

SQL> update global_name set global_name = 't12c';

1 row updated.

SQL> commit;

Commit complete.

然后在gdb中,输入kill

(gdb) kill
Kill the program being debugged? (y or n) y
(gdb) quit

这时侯,第一个sqlplus会话报错如下:

SQL> startup mount
ORACLE instance started.

Total System Global Area 1653518336 bytes
Fixed Size                  2253784 bytes
Variable Size             956304424 bytes
Database Buffers          687865856 bytes
Redo Buffers                7094272 bytes
Database mounted.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 24389
Session ID: 66 Serial number: 3

重启数据库即可正常启动

$ sqlplus "/ as sysdba"

SQL*Plus: Release 11.2.0.4.0 Production on Mon May 22 13:19:58 2017

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area 1653518336 bytes
Fixed Size                  2253784 bytes
Variable Size             956304424 bytes
Database Buffers          687865856 bytes
Redo Buffers                7094272 bytes
Database mounted.
Database opened.
SQL> 

关于紫砂壶

感悟技术人生
此条目发表在故障处理分类目录,贴了标签。将固定链接加入收藏夹。