In today's digital world, the resilience of systems to failure is critical. One approach to ensuring high availability and resilient data storage is DRBD (Distributed Replicated Block Device) on the CentOS operating system. This article provides a detailed overview of the steps and best practices for implementing a robust, failure-resilient solution with DRBD.

Introduction to DRBD

DRBD is a Linux software tool that enables the creation of replicated block-level storage between servers. It operates on the principle of real-time data mirroring between nodes, ensuring that in the event of a node failure, another node can take over its function without data loss. DRBD can be used in conjunction with other high availability technologies such as Corosync or Pacemaker, providing a robust solution for critical applications.

Prerequisites for Installation

Before initiating DRBD implementation, it is essential for both servers (nodes) to have CentOS installed and to be interconnected via a network. Additionally, setting static IP addresses for both nodes and ensuring sufficient disk space for data replication is recommended.
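As a concrete example, assuming the node names node1 and node2 and the addresses used in the DRBD configuration later in this article, name resolution and the replication port can be prepared as follows (run on both nodes; adjust the addresses to your network):

```shell
# Map the node names used in the DRBD configuration to their
# static IP addresses (values match the example configuration)
echo "192.168.1.1 node1" | sudo tee -a /etc/hosts
echo "192.168.1.2 node2" | sudo tee -a /etc/hosts

# Open the DRBD replication port (7788/tcp) in firewalld
sudo firewall-cmd --permanent --add-port=7788/tcp
sudo firewall-cmd --reload
```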

Installation and Configuration of DRBD

  1. DRBD Installation: Begin by enabling the ELRepo repository, which provides the DRBD 9 packages for CentOS, and install DRBD on both nodes:

    sudo yum install -y https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm
    sudo yum install -y drbd90-utils kmod-drbd90
    
  2. DRBD Configuration: After installation, create a configuration file for DRBD, usually located in /etc/drbd.d/ (for example /etc/drbd.d/r0.res). The configuration file defines the resource to be replicated and its synchronization parameters. A basic configuration may look like this:

    resource r0 {
        protocol C;
        on node1 {
            device /dev/drbd0;
            disk /dev/sdb;
            address 192.168.1.1:7788;
            meta-disk internal;
        }
        on node2 {
            device /dev/drbd0;
            disk /dev/sdc;
            address 192.168.1.2:7788;
            meta-disk internal;
        }
    }
    
  3. Initialization and Start of DRBD: After setting up the configuration files, initialize the metadata for the DRBD device and bring the resource up. Run on both nodes:

    sudo drbdadm create-md r0
    sudo drbdadm up r0

    Alternatively, sudo systemctl start drbd brings up all configured resources. If DRBD will later be managed by Pacemaker, however, the drbd service should not be enabled at boot, since the cluster takes over that role.
    
  4. Data Synchronization: Before using DRBD, ensure that the data between the nodes is fully synchronized. On the node that should hold the authoritative copy of the data (and only on that node), run:

    sudo drbdadm primary --force r0

    This command promotes that node to primary and initiates the initial full synchronization towards the peer.
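The progress of the initial synchronization can be monitored, and once promoted, the DRBD device can be used like any other block device. A brief sketch (ext4 and the mount point are illustrative assumptions, not requirements):

```shell
# On the primary node: watch replication state and sync progress
sudo drbdadm status r0

# Once promoted, the DRBD device behaves like a regular block device.
# Create a filesystem and mount it (illustrative choices):
sudo mkfs.ext4 /dev/drbd0
sudo mkdir -p /mnt/drbd
sudo mount /dev/drbd0 /mnt/drbd
```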

Integration with High Availability Technologies

To ensure automatic service takeover in case of failure, it is advisable to combine DRBD with technologies such as Corosync and Pacemaker.

  1. Corosync and Pacemaker Installation: Install Corosync, Pacemaker, and the pcs administration tool on both nodes, and start the pcsd daemon:

    sudo yum install -y corosync pacemaker pcs
    sudo systemctl enable --now pcsd
  2. Cluster Configuration: Use the pcs tool to initialize and configure the cluster. First, set a password for the hacluster user on both nodes, then authenticate the nodes and create the cluster:

    sudo passwd hacluster
    sudo pcs cluster auth node1 node2 -u hacluster -p <password>
    sudo pcs cluster setup --name my_cluster node1 node2
    sudo pcs cluster start --all
    
  3. DRBD Configuration as a Cluster Resource: Now, add DRBD as a resource to the cluster so that Pacemaker can manage its availability:

    sudo pcs resource create drbd_resource ocf:linbit:drbd \
      drbd_resource=r0 op monitor interval=20s
    sudo pcs resource master drbd_master drbd_resource \
      master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
    
  4. Automatic Service Takeover: Finally, configure the cluster to automatically take over services in case of node failure. This can be achieved by adding additional resources to the cluster that depend on DRBD availability:

    sudo pcs resource create my_service systemd:my_service \
      op monitor interval=30s
    sudo pcs constraint colocation add my_service with master drbd_master INFINITY
    sudo pcs constraint order promote drbd_master then start my_service
    
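If the managed service stores its data on the replicated device, the mount itself is usually also placed under cluster control. A sketch using the standard ocf:heartbeat:Filesystem agent (the device is the one from the DRBD configuration above; the mount point and ext4 filesystem are illustrative assumptions):

```shell
# Cluster-managed mount of the DRBD device (mount point and
# fstype are illustrative; adjust to your environment)
sudo pcs resource create drbd_fs ocf:heartbeat:Filesystem \
  device=/dev/drbd0 directory=/mnt/drbd fstype=ext4

# The filesystem must run where DRBD is primary, after promotion,
# and before the service that uses it starts
sudo pcs constraint colocation add drbd_fs with master drbd_master INFINITY
sudo pcs constraint order promote drbd_master then start drbd_fs
sudo pcs constraint order start drbd_fs then start my_service
```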

Implementing DRBD on CentOS, integrated with Corosync and Pacemaker, provides a robust, failure-resilient solution. Block-level data replication combined with automatic service takeover ensures high availability for critical applications. The key to success lies in careful configuration and thorough testing of the cluster, so that all components cooperate correctly and can respond to potential outages.
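As the conclusion stresses, testing matters. A basic failover test with the pcs syntax used throughout this article (node names as configured above) is to put the active node into standby and watch the resources move to the peer:

```shell
# Drain the currently active node; Pacemaker should promote DRBD
# on the peer and restart the dependent resources there
sudo pcs cluster standby node1
sudo pcs status

# Return the node to service afterwards
sudo pcs cluster unstandby node1
```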