Recovery - CSU357 - Shoolini University

Recovery

1. Recovery in DBMS

Recovery in Database Management Systems (DBMS) refers to the process of restoring the database to a correct state in the event of a failure. This ensures data integrity, consistency, and availability.

1.1 Types of Failures

Database systems may encounter various types of failures:

1.2 Recovery Techniques

Several techniques are employed to recover from failures:

1.2.1 Log-Based Recovery in short

Log-based recovery is a common method used to ensure data integrity. The log contains entries for every transaction, including the following information:

-- Creating a log table
CREATE TABLE log (
    transaction_id INT,
    operation VARCHAR(50),
    old_value VARCHAR(255),
    new_value VARCHAR(255)
);

-- Inserting a log entry
INSERT INTO log (transaction_id, operation, old_value, new_value) VALUES (1, 'UPDATE', '100', '200');
1.2.2 Shadow Paging in short

Shadow paging is another recovery method where a shadow copy of the database is maintained. The process involves:

If a failure occurs, the shadow copy is used to restore the database.

-- Example process of shadow paging
BEGIN TRANSACTION;
CREATE SHADOW COPY;
-- Perform operations
COMMIT TRANSACTION;
1.2.3 Checkpointing in short

Checkpointing involves periodically saving the database state, which helps in minimizing the recovery time. The process includes:

In the event of a failure, recovery can start from the last checkpoint, reducing the amount of log data to be processed.

-- Example of checkpointing
CHECKPOINT;
-- Continue with database operations

1.3 Log-Based Recovery in Detail

Log-based recovery is a fundamental technique in DBMS to ensure the durability and atomicity of transactions. This method relies on a log file, which records all changes made to the database. The log contains sufficient information to undo or redo transactions if a failure occurs.

1.3.1 Components of Log-Based Recovery

Log-based recovery involves several key components:

1.3.2 Structure of a Log Record

A log record typically includes the following information:

-- Example log record
INSERT INTO log (transaction_id, lsn, operation, bfim, afim, timestamp)
VALUES (1, 1001, 'UPDATE', '100', '200', '2024-05-10 10:00:00');
1.3.3 Logging Process

When a transaction modifies the database, the following steps occur:

  1. Before Image Recording: The old value of the data is recorded in the log before the modification.
  2. After Image Recording: The new value of the data is recorded in the log after the modification.
  3. Write-Ahead Logging (WAL): Ensures that log records are written to the log file before the actual data is written to the database.
1.3.4 Recovery Process

In the event of a failure, the recovery process involves two main phases: redo and undo.

1.3.4.1 Redo Phase

The redo phase re-applies all changes from committed transactions to ensure all modifications are present in the database. The steps include:

  1. Scanning the log from the beginning or the last checkpoint.
  2. Applying all operations to the database to redo the transactions.
-- Redo operation example
-- Applying changes from log to database
UPDATE table_name SET column_name = '200' WHERE column_name = '100';
1.3.4.2 Undo Phase

The undo phase reverses changes from uncommitted transactions to maintain consistency. The steps include:

  1. Scanning the log backward from the point of failure.
  2. Undoing all operations of uncommitted transactions.
-- Undo operation example
-- Reversing changes from uncommitted transactions
UPDATE table_name SET column_name = '100' WHERE column_name = '200';
1.3.5 Write-Ahead Logging (WAL) Protocol

The WAL protocol is crucial for log-based recovery, ensuring data integrity. It mandates that:

This protocol ensures that no data modifications are lost, as log records provide a reliable means to redo or undo operations.

-- Example of enforcing WAL protocol
BEGIN TRANSACTION;
INSERT INTO log (transaction_id, lsn, operation, bfim, afim, timestamp) VALUES (2, 1002, 'UPDATE', '300', '400', CURRENT_TIMESTAMP);
UPDATE table_name SET column_name = '400' WHERE column_name = '300';
COMMIT TRANSACTION;

1.4 Shadow Paging in Detail

Shadow paging is a recovery technique used to ensure database consistency without the need for logs. It maintains two copies of the database: the current page table and the shadow page table. Modifications are made to the current page table, and the shadow page table remains unchanged until a transaction is committed.

1.4.1 Components of Shadow Paging

Shadow paging involves the following components:

1.4.2 Process of Shadow Paging

The process of shadow paging includes several key steps:

  1. Initialization: At the beginning of a transaction, the shadow page table is created as a copy of the current page table.
  2. Modification: When a transaction modifies a page, a new copy of the page is created, and the current page table is updated to point to this new page. The shadow page table remains unchanged.
  3. Commit: Once the transaction is successfully completed, the shadow page table is updated to match the current page table, and the changes become permanent.
  4. Abort: If the transaction fails, the current page table is discarded, and the shadow page table is used to restore the database to its original state.
1.4.3 Advantages and Disadvantages of Shadow Paging

Understanding the pros and cons of shadow paging helps in evaluating its effectiveness:

Advantages
Disadvantages
1.4.4 Example of Shadow Paging

Consider a scenario where a transaction updates a page in the database:

  1. Initialize the shadow page table as a copy of the current page table.
  2. When a page is updated, create a new copy of the page and update the current page table to point to this new page.
  3. Commit the transaction by updating the shadow page table to match the current page table.
  4. If the transaction fails, discard the current page table and use the shadow page table to restore the database.
-- Example process of shadow paging
BEGIN TRANSACTION;
-- Assume page 1 is updated
UPDATE current_page_table SET page_pointer = new_page_pointer WHERE page_id = 1;
-- If transaction is successful
COMMIT;
-- Update shadow page table to match current page table
UPDATE shadow_page_table SET page_pointer = new_page_pointer WHERE page_id = 1;
1.4.5 Checkpointing in Shadow Paging

Checkpointing in shadow paging involves creating a consistent snapshot of the database at regular intervals. This reduces the recovery time by providing a known good state from which to start.

  1. Flush all dirty pages from the buffer to disk.
  2. Update the shadow page table to reflect the current state of the database.
  3. Write a checkpoint record to indicate the completion of the checkpoint.
-- Checkpointing example
CHECKPOINT;
-- Flush dirty pages to disk
-- Update shadow page table
UPDATE shadow_page_table SET ...;
-- Write checkpoint record
INSERT INTO checkpoint_log (timestamp) VALUES (CURRENT_TIMESTAMP);
1.4.6 Implementation in SQL

Here is an example of implementing shadow paging in SQL:

-- Create shadow page table as a copy of the current page table
CREATE TABLE shadow_page_table AS SELECT * FROM current_page_table;

-- Begin transaction
BEGIN TRANSACTION;

-- Update current page table when a page is modified
UPDATE current_page_table SET page_pointer = new_page_pointer WHERE page_id = 1;

-- Commit transaction
COMMIT;

-- Update shadow page table upon successful commit
UPDATE shadow_page_table SET page_pointer = new_page_pointer WHERE page_id = 1;

-- If transaction fails, discard current page table changes
ROLLBACK;

1.5 Checkpointing in Detail

Checkpointing is a technique used in DBMS to reduce the time required for recovery by periodically saving a consistent state of the database to disk. This helps in minimizing the amount of log data that needs to be processed during recovery, thereby speeding up the process.

1.5.1 Components of Checkpointing

Checkpointing involves the following components:

1.5.2 Process of Checkpointing

The process of checkpointing involves several key steps:

  1. Flushing Dirty Pages: All modified (dirty) pages in memory are written to disk.
  2. Writing the Checkpoint Record: A checkpoint record is written to the log, indicating the completion of the checkpoint.
  3. Updating the Checkpoint Log: The checkpoint log is updated to include the new checkpoint.
-- Example of writing a checkpoint
CHECKPOINT;
-- Flush dirty pages to disk
-- Write checkpoint record
INSERT INTO checkpoint_log (timestamp) VALUES (CURRENT_TIMESTAMP);
1.5.3 Types of Checkpoints

There are several types of checkpoints, each with its own use case:

1.5.3.1 Consistent Checkpoints

Ensures that the database is in a consistent state. All transactions are completed, and no transactions are in progress.

1.5.3.2 Fuzzy Checkpoints

Allows some transactions to be in progress while the checkpoint is taken. Only dirty pages are flushed to disk.

1.5.3.3 Incremental Checkpoints

Only the pages modified since the last checkpoint are written to disk, reducing the amount of data written during each checkpoint.

1.5.4 Advantages and Disadvantages of Checkpointing

Understanding the benefits and drawbacks of checkpointing helps in evaluating its effectiveness:

Advantages
Disadvantages
1.5.5 Implementation in SQL

Here is an example of implementing checkpointing in SQL:

-- Creating a checkpoint log table
CREATE TABLE checkpoint_log (
    checkpoint_id INT AUTO_INCREMENT PRIMARY KEY,
    timestamp TIMESTAMP
);

-- Function to perform checkpointing
DELIMITER //
CREATE PROCEDURE perform_checkpoint()
BEGIN
    -- Flush dirty pages to disk
    FLUSH TABLES;
    
    -- Write checkpoint record
    INSERT INTO checkpoint_log (timestamp) VALUES (CURRENT_TIMESTAMP);
    
    -- Optionally, update other metadata or perform additional tasks
END //
DELIMITER ;

-- Execute checkpoint procedure
CALL perform_checkpoint();
1.5.6 Checkpointing in Recovery

During recovery, checkpoints are used to reduce the amount of log data that needs to be processed:

  1. Identify the last checkpoint from the checkpoint log.
  2. Start the recovery process from this checkpoint, reducing the amount of data to be read and applied.
-- Example of recovery process using checkpoint
-- Identify the last checkpoint
SELECT MAX(timestamp) INTO @last_checkpoint FROM checkpoint_log;

-- Start recovery from the last checkpoint
-- Apply all transactions from this point onwards
SELECT * FROM transaction_log WHERE timestamp >= @last_checkpoint;