Monday, December 30, 2019

Roll-Forward a Standby Database using RMAN Incremental backup

In this post we'll see about Roll-Forwarding a physical standby database using RMAN Incremental backup in a detailed step by step process.


  1. Check the status of standby database
  2. Identify the issue
  3. Identify required SCN for recovery
  4. SCN based Incremental backup for standby database
  5. Catalog the backup
  6. Recover standby database
  7. Restore standby controlfile
  8. Check Incarnation
  9. Switch database to copy
  10. Report Schema
  11. Catalog datafiles
  12. Switch datafile copy
  13. List copy of database
  14. Final check
  15. Recover standby database
  16. Start MRP
  17. Log switch testing


Check the status of standby database

We created a standby database using recent backup of our production database, but somehow some of the archive logs are missed and hence Primary and Standby databases are not in synchronization. So now we'll fix this situation using Roll-forward backup based on SCN of the standby database to recover the standby database from required point in time.

In the below SS, we can see the latest log file received at the standby site is 14295.

select max(sequence#) latest_archive_log from v$log_history;

On the other hand, log's available at the standby site starts from sequence '14307'. Almost 11 archive logs are missed and this prevents the DB synchronization.

Now as a DBA it's our duty to fix the issue and normalize the situation. So here I used SCN based recovery method on the standby database.

Check the status of the archive logs at the standby site using the below query
select THREAD#, SEQUENCE#, APPLIED, REGISTRAR from v$archived_log;

Check the MRP process status as below
select PROCESS, SEQUENCE#, STATUS from v$MANAGED_STANDBY;

Identify the issue

Start the MRP process
recover managed standby database disconnect;

Here it is very clear that it is waiting for the log sequence 14296.

So now we'll start the recovery process in a detailed step by step manner.

Identify required SCN for recovery

Now we need to check the current scn and the minimum scn from the standby database to identify from which point we need to restore the database.

set numwidth 50;
select current_scn from v$database;
select min(checkpoint_change#) from v$datafile_header where file# not in (select file# from v$datafile where enabled='READ ONLY');


Now we need to choose the least SCN value from the above two, and should start the recovery from that scn value.

SCN based  Incremental backup for standby database

Now start the incremental backup process using the identified SCN value at the Primary site as below.

backup incremental from scn 'scn value' database format 'your arbitrary path and desired format' tag 'FORSTANDBY';

Once the backup is started, you can track the progress using the below query
SELECT sid, serial#, context,  sofar, totalwork, round(sofar/totalwork*100,2) "% Complete"
FROM gv$session_longops
WHERE opname LIKE 'RMAN%'
AND opname NOT LIKE '%aggregate%'
AND totalwork != 0
AND sofar <> totalwork;



Once the backup has been completed successfully, start the recover process at the standby site as below.


Catalog the backup

Copy/Move the backup pieces to the standby site and catalogue the backup pieces for the standby database to accept them.

catalog start with 'your arbitrary location at the standby site';


Recover standby database

Now start the recovery process as 
recover database noredo;
 

You can monitor the recovery process using the same query given above.


The restoration process has been completed successfully. Now we need to restore the stanby control file, since the datafiles are recently restored.

Restore standby controlfile

Backup the control file from primary site for the standby database.

backup current controlfile for standby format 'arbitrary path and format';

Before restoring the standby controlfile, take a spool of the datafile names for crosschecking purpose after the controlfile has been recreated.

spool datafile_names.txt
set lines 200
col namr for a50
select file#, name from v$datafile order by file#;

Now down and start the standby database with nomount option to restore the controlfile from the backup we took.

restore standby controlfile from 'arbitrary path';


After controlfile has been recreated, compare the datafile names with the spool file output and crosscheck they remains the same. If you find any difference in datafile path, directly move to section switch datafile copy.

Check Incarnation

Now shutdown, mount the standby databse and check incarnation of both the databases.

list incarnation;

@Primary

@Standby

Now at the Primary site, check any datafile has been created after the SCN we identified

Switch Database to Copy

Here no new files has been created. So now we can proceed with copying the database.

At the standby site initiate the database copy.

switch database to copy;

 In my case, datafile 1 needs recovery.

So now check the current scn and minimum scn of the standby database in order to identify any more datafiles needs to be recovered from primary site.

Since the minimum checkpoint value is zero, no more recovery is needed from primary site. So now we need to check why this issue is occurring and should follow an appropriate solution to fix this issue.

As a first step, clear the log groups.

Here it is clear that the logs are not getting applied, but confirm the logs are getting shipped to standby from primary.

From Primary:

From Standby:


Since the logs are getting transferred and not applied, I checked the alert log of the standby database and found the below.

Here the datafiles are said to be missing. That's not the case, the thing is files are there in the desired location but they are unknown to our database. Now we need to educate the database that the files belongs to him.

Report Schema

To confirm that database doesn't know about those reported datafile run a report on schema at the standby site.


Catalog datafiles

From the above SS, you can see most of the datafiles size are zero, and hence the database couldn't identify them. So now we need to catalog those datafiles, so that database will come to know about them.

catalog start with 'dbf file location';

 Since it couldn't find any files from the above mentioned path's, I simply gave the diskgroup name and now it identifies the unknown files and the same has been cataloged to the database.



Switch datafile copy

Now we need to copy the datafiles since they are newly cataloged with the database. Before that we need to identify which files needs to copied.

set lines 1000 pages 500
col name for a55
col creation_time for a20
col checkpoint_time for a20
select file#, name, creation_time, checkpoint_time, checkpoint_change# from v$datafile_copy;
 

select file#, name, recover, fuzzy, checkpoint_change#, checkpoint_time, creation_change#, creation_time, error from v$datafile_header order by file#;

So above we have some files addressed as file not found. That files alone needs to be copied. So now we'll segregate that file sequences alone for copying.

select file# from v$datafile_header where ERROR='FILE NOT FOUND' order by file# asc;

Now we can switch those datafile's to copy as below.

switch datafile dbf1,dbf2,........,dbfn to copy;

Now run a report on schema to check all the files are known to the database.


List copy of database

In the above SS all the datafiles are having reasonable space, which means now our standby database is aware of all the datafiles belongs to it.

We can confirm this by list the copy of database.

list copy of database;

Since no rows are returned for the above command, we are good to proceed further. If any rows returned, then we need to run a report on the schema, restore, switch those datafiles again as we did before.

Recover Standby database

Now its time to recover the database as we've recovered the database just now.

recover standby database;

From the above SS, it is clear that the standby database is now waiting for the new archive log sequence 14367 which means all the archive gaps are filled/applied at the standby database.

Start MRP

Now it is fine to start the MRP process.

alter database recover managed standby database disconnect;

@Primary:

@Standby:
 

col status for a10
select sequence#, applied, status from v$archived_log order by sequence# asc;


Log switch testing

Now we can perform log switch test to confirm everything in its rightful place and issue is solved.

@Primary:

@Standby:


Also you can use the below query to check the status

select al.thrd "Thread", almax "Last Seq Received", lhmax "Last Seq Applied" from (select thread# thrd, max(sequence#) almax from v$archived_log where resetlogs_change#=(select resetlogs_change# from v$database) group by thread#) al,(select thread# thrd, max(sequence#) lhmax from v$log_history where first_time=(select max(first_time) from v$log_history) group by thread#) lh where al.thrd = lh.thrd;




Ref : Steps to perform for Rolling Forward a Physical Standby Database using RMAN Incremental Backup. (Doc ID 836986.1)

                        ****************************நன்றி****************************




No comments:

Post a Comment

IAS Cache initialization failed

 Today I faced an Issue in R12.2 instance. The solution I followed to overcome the issue is very simple, but they are more than one solution...