Building a Highly Available Multi-Node Cluster with Pacemaker & Corosync

Overview

Building a highly available multi-node PostgreSQL cluster, using freely available software including Pacemaker, Corosync, CMAN and PostgreSQL on CentOS

Infrastructure

Vagrant Setup

To assist in following along with this tutorial, you can use the following Vagrantfile to spin up a cluster environment using CentOS 6.6

For those unfamiliar with Vagrant, please download and install the latest version from vagrantup.com.

Once installed, create a root directory that will house the files for this project.

mkdir pgdb_cluster

Navigate into the newly created root directory

cd pgdb_cluster

Download the Vagrantfile noted above into your root directory

wget http://kb.techtaco.org/linux/postgresql/attachments/Vagrantfile

Now, to create the three virtual machines needed for this development environment, we simply run vagrant up. This reads the Vagrantfile located in the project root directory, downloads the needed .box (virtual machine) image, and creates the required clones with the specified modifications.

vagrant up

Once the virtual machines are provisioned and started you can access them via the vagrant ssh command. Replace pgdb1 with the other machine names to access them as well; note that you must be in the root directory of the project.

vagrant ssh pgdb1
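
Before connecting, you can confirm that all three machines were created and are running. From the project root directory, vagrant status lists each machine and its state:

vagrant status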

HA Cluster Installation

The required packages are available in the base/updates repositories for CentOS 6.x.

From my reading and research it is also possible to use Heartbeat 3.x with Pacemaker to achieve similar results. I've decided to go with Corosync as it's backed by Red Hat and SUSE and it looks to have more active development. Not to mention that the Pacemaker project recommends using Corosync.

Cluster Installation

Warning! As of RedHat/CentOS 6.4, crmsh is no longer included in the default repositories. If you want to use crm instead of pcs, you can include the openSUSE repositories HERE. More information on crmsh can be found HERE

In this tutorial we will add the openSUSE repository to our nodes, though I recommend building or copying these packages into a local repository for more controlled management.

Configure the openSUSE repository. This needs to be done on ALL nodes in the cluster.

sudo wget -4 http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/RedHat_RHEL-6/network:ha-clustering:Stable.repo -O /etc/yum.repos.d/network_ha-clustering_Stable.repo

Limit the packages to be installed from the openSUSE repository; we only want the crm shell package and its required dependencies. This needs to be done on ALL nodes in the cluster.

sudo runuser -l root -c 'echo "includepkgs=crmsh pssh python-pssh" >> /etc/yum.repos.d/network_ha-clustering_Stable.repo'

Now that we have the required repositories configured we need to install the needed packages. This needs to be done on ALL nodes in the cluster.

You will see multiple dependencies being pulled in

sudo yum install pacemaker pcs corosync fence-agents crmsh cman ccs
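
If you want to double-check what landed on each node, querying the RPM database for the cluster packages is a quick sanity check (the exact versions will depend on your repositories):

rpm -qa | grep -E 'pacemaker|corosync|crmsh|cman|ccs|fence-agents'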

Cluster Configuration

The first step is to configure the underlying Cman/Corosync cluster ring communication between the nodes and setup Pacemaker to use Corosync as its communication mechanism.

For secure communication Corosync requires a pre-shared authkey. This shared key must be added to all nodes in the cluster.

To generate the authkey, Corosync provides the corosync-keygen utility. Invoke this command as the root user; the key will be generated at /etc/corosync/authkey. You only need to perform this action on one of the nodes in the cluster as we'll copy it to the other nodes.

sudo /usr/sbin/corosync-keygen

Hint! Grab a cup of coffee; this process takes a while to complete as it pulls from the more secure /dev/random. You don’t have to press anything on the keyboard, it will still generate the authkey.

Once the key has been generated, copy it to the other nodes in the cluster

sudo scp /etc/corosync/authkey root@pgdb2:/etc/corosync/
sudo scp /etc/corosync/authkey root@pgdb3:/etc/corosync/
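
The authkey must be identical on every node for the ring to form. A quick checksum on each node is an easy way to confirm the copies match:

sudo md5sum /etc/corosync/authkey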

Multiple examples and documents on the web reference using the Pacemaker Corosync plugin by adding an /etc/corosync/service.d/pcmk configuration file on each node. This approach is becoming deprecated, and warnings will show in the logs if you enable or use the Corosync Pacemaker plugin. There is a small but important distinction that I stumbled upon: the Pacemaker plugin has never been supported on RHEL systems.

The real issue is that at some point the plugin will no longer be supplied with the packages on RHEL systems. Prior to 6.4 (though this looks to change in 6.5 and above), the Pacemaker plugin only had tech-preview status, with the CMAN plugin being the supported option instead.

Reference this wiki article

Disable quorum in order to allow Cman/Corosync to complete startup in a standalone state.

This needs to be done on ALL nodes in the cluster.

sudo sed -i.sed "s/.*CMAN_QUORUM_TIMEOUT=.*/CMAN_QUORUM_TIMEOUT=0/g" /etc/sysconfig/cman

Define the cluster, where pg_cluster is the cluster name. This will generate the cluster.conf configuration file.

This only needs to be run on one node; we'll copy the configuration to the other nodes later. For simplicity we will run this on pgdb1.

sudo ccs -f /etc/cluster/cluster.conf --createcluster pg_cluster

Create the cluster redundant ring(s). The name used for each node in the cluster should correspond to the node's network hostname (uname -n).

This only needs to be run on one node; we'll copy the configuration to the other nodes later. For simplicity we will run this on pgdb1.

sudo ccs -f /etc/cluster/cluster.conf --addnode pgdb1.example.com
sudo ccs -f /etc/cluster/cluster.conf --addnode pgdb2.example.com
sudo ccs -f /etc/cluster/cluster.conf --addnode pgdb3.example.com

Configure the fence_pcmk agent (supplied with Pacemaker) to redirect any fencing requests from CMAN components (such as dlm_controld) to Pacemaker. Pacemaker’s fencing subsystem lets other parts of the stack know that a node has been successfully fenced, thus avoiding the need for it to be fenced again when other subsystems notice the node is gone.

This only needs to be run on one node; we'll copy the configuration to the other nodes later. For simplicity we will run this on pgdb1.

sudo ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect pgdb1.example.com
sudo ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect pgdb2.example.com
sudo ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect pgdb3.example.com
sudo ccs -f /etc/cluster/cluster.conf --addfencedev pcmk agent=fence_pcmk
sudo ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk pgdb1.example.com pcmk-redirect port=pgdb1.example.com
sudo ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk pgdb2.example.com pcmk-redirect port=pgdb2.example.com
sudo ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk pgdb3.example.com pcmk-redirect port=pgdb3.example.com

Enable secure communication between nodes in the Corosync cluster using the Corosync authkey generated above.

This only needs to be run on one node; we'll copy the configuration to the other nodes later. For simplicity we will run this on pgdb1.

sudo ccs -f /etc/cluster/cluster.conf --setcman keyfile="/etc/corosync/authkey" transport="udpu"
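
At this point it can be useful to review the generated configuration. cluster.conf is plain XML and should now contain the three nodes, the pcmk fence device and instances, and the cman keyfile/transport settings:

sudo cat /etc/cluster/cluster.conf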

Before we start the cman service and copy the configuration to the other nodes in the cluster, let's verify that the generated configuration values are valid.

This should be run on the same node as the previous commands. For simplicity we will run this on pgdb1.

sudo ccs_config_validate -f /etc/cluster/cluster.conf

Start the Cman/Corosync cluster services

For now this only needs to be run on pgdb1; the services will be started on the other nodes after the configuration has been copied to them.

sudo /etc/init.d/cman start
Starting cluster:
   Checking if cluster has been disabled at boot...        [  OK  ]
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman...                                        [  OK  ]
   Waiting for quorum...                                   [  OK  ]
   Starting fenced...                                      [  OK  ]
   Starting dlm_controld...                                [  OK  ]
   Tuning DLM kernel config...                             [  OK  ]
   Starting gfs_controld...                                [  OK  ]
   Unfencing self...                                       [  OK  ]
   Joining fence domain...                                 [  OK  ]

Note that starting Cman also starts the Corosync service. This can be easily verified via the Corosync init script

sudo /etc/init.d/corosync status
corosync (pid  18376) is running...

Start the Pacemaker cluster service

For now this only needs to be run on pgdb1; Pacemaker will be started on the other nodes after they have joined the Cman/Corosync cluster.

sudo /etc/init.d/pacemaker start
Starting Pacemaker Cluster Manager                         [  OK  ]

Before continuing verify that all services have correctly started and are running.

sudo /etc/init.d/cman status
cluster is running.
sudo /etc/init.d/corosync status
corosync (pid  615) is running...
sudo /etc/init.d/pacemaker status
pacemakerd (pid  868) is running...

After the initial node has been successfully configured and its services have started, copy the cluster.conf to the other nodes in the cluster

sudo scp /etc/cluster/cluster.conf pgdb2.example.com:/etc/cluster/cluster.conf
sudo scp /etc/cluster/cluster.conf pgdb3.example.com:/etc/cluster/cluster.conf

Start the Cman/Corosync services on additional nodes in the cluster.

sudo /etc/init.d/cman start
Starting cluster:
   Checking if cluster has been disabled at boot...        [  OK  ]
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman...                                        [  OK  ]
   Waiting for quorum...                                   [  OK  ]
   Starting fenced...                                      [  OK  ]
   Starting dlm_controld...                                [  OK  ]
   Tuning DLM kernel config...                             [  OK  ]
   Starting gfs_controld...                                [  OK  ]
   Unfencing self...                                       [  OK  ]
   Joining fence domain...                                 [  OK  ]

Before continuing and starting Pacemaker on additional nodes in the cluster verify that ALL nodes in the cluster are communicating via the Cman/Corosync cluster ring.

View the Cman/Corosync cluster ring status. This should be run on ALL nodes to verify that they are correctly communicating

sudo cman_tool nodes -a
Node  Sts   Inc   Joined               Name
   1   M      4   2014-04-09 08:30:22  pgdb1.example.com
       Addresses: 10.4.10.60
   2   M      8   2014-04-09 08:44:01  pgdb2.example.com
       Addresses: 10.4.10.61
   3   M     12   2014-04-09 08:44:08  pgdb3.example.com
       Addresses: 10.4.10.62
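
The Corosync ring state can also be checked directly on each node with corosync-cfgtool, which prints the local node ID and the status of each configured ring:

sudo corosync-cfgtool -s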

Start the Pacemaker service on additional nodes in the cluster

sudo /etc/init.d/pacemaker start
Starting Pacemaker Cluster Manager                         [  OK  ]

Before continuing verify that all services have correctly started and are running on the additional nodes in the cluster.

sudo /etc/init.d/cman status
cluster is running.
sudo /etc/init.d/corosync status
corosync (pid  615) is running...
sudo /etc/init.d/pacemaker status
pacemakerd (pid  868) is running...

Verify that ALL nodes have joined the Pacemaker cluster.

View Pacemaker HA cluster status

sudo pcs status
Cluster name: pg_cluster
Last updated: Thu Apr 10 07:39:08 2014
Last change: Thu Apr 10 06:49:19 2014 via cibadmin on pgdb1.example.com
Stack: cman
Current DC: pgdb1.example.com - partition with quorum
Version: 1.1.10-14.el6_5.2-368c726
3 Nodes configured
0 Resources configured


Online: [ pgdb1.example.com pgdb2.example.com pgdb3.example.com ]

Full list of resources:

Pacemaker Cluster Configuration

At this point we have configured the basic cluster communication ring. All nodes are now communicating and reporting their heartbeat status to each of the nodes via Corosync.

Verify the Pacemaker cluster configuration. Here you'll notice the cluster is complaining that STONITH (Shoot The Other Node In The Head) is not configured.

sudo pcs cluster verify -V
   error: unpack_resources:     Resource start-up disabled since no STONITH resources have been defined
   error: unpack_resources:     Either configure some or disable STONITH with the stonith-enabled option
   error: unpack_resources:     NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid

For now we are going to disable this and come back to it later in the tutorial. This only needs to be run on one node of the cluster as the configuration is synchronized between the nodes.

sudo pcs property set stonith-enabled=false

Verify the Pacemaker stonith property was correctly configured.

````bash
sudo pcs config
````
````
Cluster Name: pg_cluster
Corosync Nodes:

Pacemaker Nodes:
 pgdb1.example.com pgdb2.example.com pgdb3.example.com

Resources:

Stonith Devices:
Fencing Levels:

Location Constraints:
Ordering Constraints:
Colocation Constraints:

Cluster Properties:
 cluster-infrastructure: cman
 dc-version: 1.1.10-14.el6_5.2-368c726
 stonith-enabled: false
````

Now verifying the Pacemaker cluster configuration again returns no errors.

````
sudo pcs cluster verify -V
````

Pacemaker IP Resources
----------------------

With a basic cluster configuration setup resources can be created for the cluster to manage.  The first resource to add is a cluster IP or "VIP" so that applications will be able to continuously communicate with the cluster regardless of where the cluster services are running.

Resources only need to be created on **one** node in the cluster; Pacemaker/Corosync will replicate the cluster information base (CIB) to all nodes in the cluster.

Create the IP resources ("VIPs") using the **ocf:heartbeat:IPaddr2** resource agent 'script'.

Create the Replication "VIP": This will be used by additional PostgreSQL replicas to receive updates from the Master

````bash
sudo pcs resource create pgdbrepvip ocf:heartbeat:IPaddr2 ip=10.10.10.104 cidr_netmask=24 iflabel="pgdbrepvip" op monitor interval=1s meta target-role="Started"
````

Create the Client Access "VIP": This will be used by client applications to connect to the active Master database

````bash
sudo pcs resource create pgdbclivip ocf:heartbeat:IPaddr2 ip=10.10.10.105 cidr_netmask=24 iflabel="pgdbclivip" op monitor interval=1s meta target-role="Started"
````
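
Besides checking pcs status, you can confirm that an address is actually plumbed on the node currently running a VIP resource. The iflabel set above makes it easy to spot in the interface listing (the interface name will depend on your environment):

````bash
ip addr show | grep -E 'pgdbrepvip|pgdbclivip'
````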

Verify the Pacemaker cluster resource has been correctly added to the cluster information base (CIB).

````bash
sudo pcs config
````
````
Cluster Name: pg_cluster
Corosync Nodes:

Pacemaker Nodes:
 pgdb1.example.com pgdb2.example.com pgdb3.example.com

Resources:
 Resource: pgdbrepvip (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=10.10.10.104 cidr_netmask=24 iflabel=pgdbrepvip
  Meta Attrs: target-role=Started
  Operations: monitor interval=1s (pgdbrepvip-monitor-interval-1s)
 Resource: pgdbclivip (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=10.10.10.105 cidr_netmask=24 iflabel=pgdbclivip
  Meta Attrs: target-role=Started
  Operations: monitor interval=1s (pgdbclivip-monitor-interval-1s)

Stonith Devices:
Fencing Levels:

Location Constraints:
Ordering Constraints:
Colocation Constraints:

Cluster Properties:
 cluster-infrastructure: cman
 dc-version: 1.1.10-14.el6_5.2-368c726
 stonith-enabled: false
````

View the running status of the cluster.  Here we can see that both the IP resources "VIPs" are running. 

````
sudo pcs status
````
````
Cluster name: pg_cluster
Last updated: Thu Apr 10 08:04:14 2014
Last change: Thu Apr 10 07:53:03 2014 via cibadmin on pgdb1.example.com
Stack: cman
Current DC: pgdb1.example.com - partition with quorum
Version: 1.1.10-14.el6_5.2-368c726
3 Nodes configured
2 Resources configured


Online: [ pgdb1.example.com pgdb2.example.com pgdb3.example.com ]

Full list of resources:

 pgdbrepvip       (ocf::heartbeat:IPaddr2):       Started pgdb1.example.com
 pgdbclivip       (ocf::heartbeat:IPaddr2):       Started pgdb2.example.com
````

The pgdbclivip IP resource ("VIP") was started on pgdb2; for simplicity we will move it to pgdb1.

````
sudo pcs resource move pgdbclivip pgdb1.example.com
````

Viewing the running status of the cluster again we can see both resources are now running on pgdb1

````
sudo pcs status
````
````
Cluster name: pg_cluster
Last updated: Thu Apr 10 08:11:48 2014
Last change: Thu Apr 10 08:06:56 2014 via crm_resource on pgdb1.example.com
Stack: cman
Current DC: pgdb1.example.com - partition with quorum
Version: 1.1.10-14.el6_5.2-368c726
3 Nodes configured
2 Resources configured


Online: [ pgdb1.example.com pgdb2.example.com pgdb3.example.com ]

Full list of resources:

 pgdbrepvip       (ocf::heartbeat:IPaddr2):       Started pgdb1.example.com
 pgdbclivip       (ocf::heartbeat:IPaddr2):       Started pgdb1.example.com
````

PostgreSQL Database Configuration
---------------------------------

Before adding a Pacemaker pgsql resource to manage the PostgreSQL services, it's recommended to set up the PostgreSQL cluster (the PostgreSQL internal cluster) with some basic streaming replication.

The version of PostgreSQL in the provided repositories for CentOS 6.5 is 8.4.20, which does not provide the needed streaming replication support.  To work around this we will add the PGDG (PostgreSQL Global Development Group) repository.

As of this writing we are using PostgreSQL version 9.3.5

Configure the needed repository.  

This needs to be done on **ALL** nodes in the cluster.

```bash
sudo wget http://yum.postgresql.org/9.3/redhat/rhel-6-x86_64/pgdg-centos93-9.3-1.noarch.rpm -O /tmp/pgdg-centos93-9.3-1.noarch.rpm

sudo rpm -Uvh /tmp/pgdg-centos93-9.3-1.noarch.rpm
```

With the correct repository configured install the recommended packages.

sudo yum install postgresql93-server postgresql93-contrib postgresql93-devel

Initialize the PostgreSQL database via initdb. We only need to perform this on the Master node as we'll be transferring the database to the remaining nodes. I'll be referring to these nodes as PostgreSQL replicas (pgdb2, pgdb3).

We will use pgdb1 as the Master from here on out.

sudo /etc/init.d/postgresql-9.3 initdb

Once the initialization is successful you'll see the PostgreSQL data directory populated. On CentOS this is located in /var/lib/pgsql/{version (9.3)}/data

sudo ls /var/lib/pgsql/9.3/data/

When the database was initialized via initdb it configured permissions in the pg_hba.conf. This uses the ident scheme to determine if a user is allowed to connect to the database.

ident: An authentication scheme that relies on the currently logged-in system user. If you’ve switched to the postgres user (e.g. via su) and then try to log in as another database user, ident will fail (as it’s not the currently logged-in user).

This can be a sore spot if you're not aware of how it was configured, and it will produce an error when trying to create a database as a user that is not currently logged into the system.

createdb: could not connect to database postgres: FATAL: Ident authentication failed for user "myUser"

To avoid this headache, modify the pg_hba.conf file to move from the ident scheme to the md5 scheme.

This needs to be modified on the Master pgdb1

sudo sed -i 's/\ ident/\ md5/g' /var/lib/pgsql/9.3/data/pg_hba.conf
Result:
# IPv4 local connections:
host    all             all             127.0.0.1/32            md5
# IPv6 local connections:
host    all             all             ::1/128                 md5

Modify the pg_hba.conf to allow the replicas to connect to the Master. In this tutorial we are adding a basic connection line; it is recommended that you tune this based on your infrastructure for proper security.

This needs to be modified on the Master pgdb1

sudo runuser -l postgres -c "cat << EOF >> /var/lib/pgsql/9.3/data/pg_hba.conf

# Allowing Replicas to connect in for streaming replication
host    replication    replicator    all                     trust
EOF"

Configure the address that PostgreSQL will listen on. This needs to be set to * so that PostgreSQL binds to the wildcard address and accepts connections on any address present on the node. This is required to allow PostgreSQL to accept connections on the VIP address in the event of a node failover.

Edit postgresql.conf with your favorite text editor and change the listen_addresses parameter, or simply append the parameter to the end of the configuration file as shown below.

This needs to be modified on the Master pgdb1

sudo runuser -l postgres -c "echo \"listen_addresses = '*'\" >> /var/lib/pgsql/9.3/data/postgresql.conf"

PostgreSQL has the concept of archiving its WAL logs. It's recommended to create a separate archive directory; this will be used to store and recover archived WAL logs. In this tutorial we will create this in the current PostgreSQL {version} directory, but you can create it anywhere.

This needs to be done on ALL nodes in the cluster.

sudo runuser -l postgres -c 'mkdir /var/lib/pgsql/9.3/archive'

Configure the ability for PostgreSQL to sync archive logs between the nodes for recovery and backup. To sync the WAL archives between each of the PostgreSQL nodes, we'll set up a custom script that will be called by the archive_command in postgresql.conf.

This script utilizes rsync to keep the archive directory on each of the nodes in sync with the PostgreSQL Master.

In order to facilitate syncing between the nodes we'll be using ssh keys to allow the nodes to send updates automatically. We will be creating an ssh key named pgarchivesync, with no passphrase, for the postgres user. This needs to be run on ALL nodes

sudo runuser -l postgres -c "ssh-keygen -t rsa -f /var/lib/pgsql/.ssh/pgarchivesync -N ''"

In order to simplify the setup of the authorized_keys files for the postgres user we need to set a password. This needs to be done on ALL nodes

echo password | sudo passwd postgres --stdin

With the keys generated on all of the nodes, the public key for the postgres user on each node needs to be added to the authorized_keys file on each of the other nodes that we want to keep in sync.

# Build a list of the other cluster nodes, excluding this host
me=`hostname`
cluster_nodes="pgdb1 pgdb2 pgdb3"
nodes=(${cluster_nodes[@]//${me}})
nodes=`echo ${nodes[@]}`

# Push the postgres user's pgarchivesync public key to each of the other nodes
for node in ${nodes}; do sudo ssh-copy-id -i /var/lib/pgsql/.ssh/pgarchivesync.pub postgres@${node}; done

Accept the ssh host keys from each of the other nodes in the cluster.

me=`hostname`
cluster_nodes="pgdb1 pgdb2 pgdb3"
nodes=(${cluster_nodes[@]//${me}})
nodes=`echo ${nodes[@]}`

for node in ${nodes}; do sudo runuser -l postgres -c "ssh -o StrictHostKeyChecking=no -i /var/lib/pgsql/.ssh/pgarchivesync postgres@${node} exit"; done

To add a bit more security (which admittedly hasn't been the focus so far in this tutorial), we will lock down what can be run using the pgarchivesync ssh key. This needs to be run on ALL nodes

sudo sed -i 's/^/command="\/usr\/bin\/rsync --server -avzp --delete \/var\/lib\/pgsql\/9.3\/archive",no-pty,no-agent-forwarding,no-port-forwarding /' /var/lib/pgsql/.ssh/authorized_keys

With the script's ssh key requirements taken care of, we need to place the script on ALL of the nodes in the cluster. This will allow the archive syncing to happen from any node that is promoted to Master.

sudo runuser -l root -c 'cat << EOF >> /usr/local/sbin/pgarchivesync.sh
#!/bin/bash

archivedir="/var/lib/pgsql/9.3/archive"
synckey="/var/lib/pgsql/.ssh/pgarchivesync"

# Exit code to Postgres
FAILURE=0

# Copy the file locally to the archive directory
/bin/gzip < \$1 > \$archivedir/\$2.gz
rc=\$?
if [ \$rc != 0 ]; then
  FAILURE=1
  exit 1
fi

me=\`hostname\`
cluster_nodes="pgdb1 pgdb2 pgdb3"
#cluster_nodes=\`sudo cman_tool nodes -F name | sed "s/.example.com//g"\`
nodes=(\${cluster_nodes[@]//\${me}})
nodes=\`echo \${nodes[@]}\`

verifynodes=\`echo \${nodes[@]}\`

# Sync the archive dir with the currently correct replicas
for node in \${nodes}; do
  /usr/bin/nc -z -w2 \${node} 22 > /dev/null 2>&1
  rc=\$?
  if [ \$rc != 0 ]; then
    /usr/bin/logger "PGSQL Archive Sync Failure: \${node} is not accessible for archive syncing, skipping this node"
    if [[ \${verifynodes[*]} =~ \${node} ]]; then
      FAILURE=1
    fi
  else
    /usr/bin/rsync -avzp --delete -e "ssh -i \$synckey" \$archivedir/ postgres@\$node:\$archivedir
    rc=\$?
    if [ \$rc != 0 ]; then
      /usr/bin/logger "PGSQL Archive Sync Failure: \${node} RSYNC failure"
      if [[ \${verifynodes[*]} =~ \${node} ]]; then
        FAILURE=1
      fi
    fi
  fi
done

exit \$FAILURE
EOF'

Make the command executable so that the postgres user can run the archive sync script.

sudo chmod +x /usr/local/sbin/pgarchivesync.sh

Configure PostgreSQL streaming replication in the postgresql.conf file. These settings are very basic and will need to be tuned based on your infrastructure.

This needs to be modified on the Master pgdb1

sudo runuser -l postgres -c "cat << EOF >> /var/lib/pgsql/9.3/data/postgresql.conf
wal_level = hot_standby
archive_mode = on
archive_command = '/usr/local/sbin/pgarchivesync.sh %p %f'
max_wal_senders = 3
wal_keep_segments = 100
hot_standby = on
EOF"

Start the PostgreSQL service on the Master and check for errors in /var/lib/pgsql/9.3/pg_log/*.

sudo /etc/init.d/postgresql-9.3 start
Starting postgresql-9.3 service:                           [  OK  ]

Once the PostgreSQL service is started, to assist with the replication process and some basic security, a separate replication user account should be created.

sudo runuser -l postgres -c "psql -c \"CREATE USER replicator REPLICATION LOGIN ENCRYPTED PASSWORD 'replaceme';\""

With a functioning Master database up and running, the replicas (slaves) need to be initialized and configured to synchronize from the Master

To clone the PostgreSQL database cluster from the Master node to the replicas (pgdb2, pgdb3) we use pg_basebackup, which is included in modern versions of PostgreSQL and makes the process about a billion times simpler.

You can also use pg_start_backup, rsync and pg_stop_backup to perform a more manual cloning.

On the replica nodes (pgdb2, pgdb3) run the pg_basebackup command

sudo runuser -l postgres -c 'pg_basebackup -D /var/lib/pgsql/9.3/data -l `date +"%m-%d-%y"`_initial_cloning -P -h pgdb1.example.com -p 5432 -U replicator -W -X stream'

To avoid any confusion when troubleshooting, remove the log files that were transferred during the pg_basebackup process. This needs to be done on both replicas

sudo runuser -l root -c 'rm /var/lib/pgsql/9.3/data/pg_log/*'

In order for the replicas to connect to the Master for streaming replication a recovery.conf file must exist in the PostgreSQL data directory.

Create a recovery.conf file on both replicas (pgdb2, pgdb3)

sudo runuser -l postgres -c "cat << EOF >> /var/lib/pgsql/9.3/data/recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=10.10.10.104 port=5432 user=replicator application_name=`hostname`'
restore_command = 'gunzip < /var/lib/pgsql/9.3/archive/%f.gz > \"%p\"'
EOF"

Start the PostgreSQL service on both of the replicas (pgdb2, pgdb3)

sudo /etc/init.d/postgresql-9.3 start

On the Master, verify and view the active replica connections and their status. You'll notice the sync_state is async for both nodes; this is because we have not yet set the synchronous_standby_names parameter on the Master to let it know which nodes it should attempt to perform synchronous replication with. You'll also notice the state is streaming, which means changes are continually sent to the replicas/slaves without waiting for WAL segments to fill and then be shipped.

sudo runuser -l postgres -c "psql -c \"SELECT application_name, client_addr, client_hostname, sync_state, state, sync_priority, replay_location FROM pg_stat_replication;\""
    application_name    | client_addr | client_hostname | sync_state |   state   | sync_priority | replay_location
------------------------+-------------+-----------------+------------+-----------+---------------+-----------------
 pgdb2.example.com | 10.4.10.61  |                 | async      | streaming |             0 | 0/40000C8
 pgdb3.example.com | 10.4.10.62  |                 | async      | streaming |             0 | 0/40000C8
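
On each of the replicas you can also confirm the node is running in recovery (standby) mode; pg_is_in_recovery() returns t on a standby and f on the Master:

sudo runuser -l postgres -c "psql -c 'SELECT pg_is_in_recovery();'"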

Pacemaker PostgreSQL Resource

This will be a master/slave resource, as opposed to the primitive resources that we created above for the IP resources (VIPs). The resource-agents package that was installed above includes a pgsql resource agent designed to work with PostgreSQL 9.1+ and streaming replication.

Before we create the master/slave resource we need to stop the PostgreSQL service on ALL the nodes (pgdb1 pgdb2 pgdb3). This is because the Pacemaker PostgreSQL resource will be controlling the state of the PostgreSQL service.

sudo /etc/init.d/postgresql-9.3 stop

Additionally, the run level for the PostgreSQL service needs to be set so that it does NOT start on boot. This is to ensure that the Pacemaker PostgreSQL resource has full control of the PostgreSQL service.

sudo /sbin/chkconfig postgresql-9.3 off

Lastly, to avoid issues with non-Master nodes (pgdb2, pgdb3) becoming the Master as dictated by the Pacemaker cluster, we will place the additional nodes into standby mode. This also helps prevent nodes slipping to a different timeline.

sudo pcs cluster standby pgdb2.example.com; sudo pcs cluster standby pgdb3.example.com

Verify that pgdb2 and pgdb3 have been placed into standby mode

sudo pcs status
Cluster name: pg_clu
Last updated: Tue Nov 11 11:31:33 2014
Last change: Tue Nov 11 11:31:32 2014
Stack: cman
Current DC: pgdb1.example.com - partition with quorum
Version: 1.1.11-97629de
3 Nodes configured
2 Resources configured


Node pgdb2.example.com: standby
Node pgdb3.example.com: standby
Online: [ pgdb1.example.com ]

Full list of resources:

 pgdbrepvip     (ocf::heartbeat:IPaddr2):       Started pgdb1.example.com
 pgdbclivip     (ocf::heartbeat:IPaddr2):       Started pgdb1.example.com

To create a stateful/multi-state resource, a primitive resource must first be created, and from that primitive resource you can create a master/slave resource. This resource only needs to be created on one node in the cluster; Pacemaker/Corosync will replicate the cluster information base (CIB) to all nodes. For simplicity, create this resource on the Master pgdb1.

The parameters are all settings defined in the pgsql resource agent script provided by the resource-agents package. In preparation for moving this to a multi-state resource, multiple monitoring operations are included for the different Pacemaker cluster roles.

sudo /usr/sbin/pcs resource create postgresql ocf:heartbeat:pgsql \
pgctl="/usr/pgsql-9.3/bin/pg_ctl" \
pgdata="/var/lib/pgsql/9.3/data" \
psql="/usr/pgsql-9.3/bin/psql" \
config="/var/lib/pgsql/9.3/data/postgresql.conf" \
stop_escalate="5" \
rep_mode="sync" \
node_list="pgdb1.example.com pgdb2.example.com pgdb3.example.com" \
restore_command="gunzip < /var/lib/pgsql/9.3/archive/%f.gz > \"%p\"" \
master_ip="10.10.10.104" \
repuser="replicator" \
restart_on_promote="true" \
tmpdir="/var/lib/pgsql/9.3/tmpdir" \
xlog_check_count="3" \
crm_attr_timeout="5" \
check_wal_receiver="true" \
op start timeout="60s" interval="0s" on-fail="restart" \
op monitor timeout="30" interval="2s" \
op monitor timeout="30" interval="1s" role="Master" \
op promote timeout="60s" interval="0s" on-fail="restart" \
op demote timeout="60s" interval="0s" on-fail="stop" \
op stop timeout="60s" interval="0s" on-fail="block" \
op notify timeout="60s" interval="0s"

Create the multi-state resource from the primitive resource created above

sudo /usr/sbin/pcs resource master mspostgresql postgresql \
notify="true" \
target-role="Started"

Verify the creation and status of the PostgreSQL cluster resources

Note! You will notice the resource fail after adding it as above; this is because the primitive PostgreSQL resource became active before the master/slave resource existed, and the rep_mode parameter requires a Master/Slave configuration: pgsql(postgresql)[28784]: ERROR: Replication(rep_mode=async or sync) requires Master/Slave configuration.

sudo pcs status

To clear the failures and allow the PostgreSQL service to be started under Pacemaker's control, the fail count must be cleared and the resource re-probed

The command below sets the fail counts to 0 for the resource postgresql on the node pgdb1.example.com

sudo crm resource failcount postgresql set pgdb1.example.com 0; sudo crm_resource -P
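
Depending on the version of pcs installed, the same can usually be accomplished with pcs alone; pcs resource cleanup clears the fail counts and failure history for a resource and triggers a re-probe:

sudo pcs resource cleanup postgresql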

Checking the cluster status you can see that pgdb1 has now started the postgresql resource as a Slave. If you tail /var/log/messages you'll notice that the cluster is processing the pgdb1 node and is in the process of promoting it to Master.

sudo pcs status
Cluster name: pg_cluster
Last updated: Fri Nov 14 07:38:56 2014
Last change: Fri Nov 14 07:38:55 2014
Stack: cman
Current DC: pgdb1.example.com - partition with quorum
Version: 1.1.11-97629de
3 Nodes configured
5 Resources configured


Node pgdb2.example.com: standby
Node pgdb3.example.com: standby
Online: [ pgdb1.example.com ]

Full list of resources:

 pgdbrepvip     (ocf::heartbeat:IPaddr2):       Started pgdb1.example.com
 pgdbclivip     (ocf::heartbeat:IPaddr2):       Started pgdb1.example.com
 Master/Slave Set: mspostgresql [postgresql]
     Slaves: [ pgdb1.example.com ]
     Stopped: [ pgdb2.example.com pgdb3.example.com ]

After a brief period of time you should now see that pgdb1 has started as a Master. This resource now starts and manages the PostgreSQL service

sudo pcs status
Cluster name: pg_cluster
Last updated: Tue Nov 11 13:23:16 2014
Last change: Tue Nov 11 13:21:05 2014
Stack: cman
Current DC: pgdb1.example.com - partition with quorum
Version: 1.1.11-97629de
3 Nodes configured
5 Resources configured


Node pgdb2.example.com: standby
Node pgdb3.example.com: standby
Online: [ pgdb1.example.com ]

Full list of resources:

 pgdbrepvip     (ocf::heartbeat:IPaddr2):       Started pgdb1.example.com
 pgdbclivip     (ocf::heartbeat:IPaddr2):       Started pgdb1.example.com
 Master/Slave Set: mspostgresql [postgresql]
     Masters: [ pgdb1.example.com ]
     Stopped: [ pgdb2.example.com pgdb3.example.com ]

The additional nodes now need to be placed into online mode within the Pacemaker cluster. This will start the postgresql resource and in turn start the postgresql service on each of the replica nodes.

Place pgdb2 into online mode within the cluster. Execute the following command on any node in the cluster.

sudo pcs cluster unstandby pgdb2.example.com

Place pgdb3 into online mode within the cluster. Execute the following command on any node in the cluster.

sudo pcs cluster unstandby pgdb3.example.com

Checking the cluster status you now will see that both pgdb2 and pgdb3 have started the postgresql resource as Slaves.

sudo pcs status
Cluster name: pg_cluster
Last updated: Wed Nov 12 06:39:22 2014
Last change: Wed Nov 12 06:30:37 2014
Stack: cman
Current DC: pgdb1.example.com - partition with quorum
Version: 1.1.11-97629de
3 Nodes configured
5 Resources configured


Online: [ pgdb1.example.com pgdb2.example.com pgdb3.example.com ]

Full list of resources:

 pgdbrepvip     (ocf::heartbeat:IPaddr2):       Started pgdb1.example.com
 pgdbclivip     (ocf::heartbeat:IPaddr2):       Started pgdb1.example.com
 Master/Slave Set: mspostgresql [postgresql]
     Masters: [ pgdb1.example.com ]
     Slaves: [ pgdb2.example.com pgdb3.example.com ]

With all the nodes started in the Pacemaker cluster and running the needed resources we need to create resource constraints.

The first constraint needed is a colocation of the VIPs (pgdbrepvip, pgdbclivip) with the node running the PostgreSQL resource as Master. This ensures that pgdbrepvip and pgdbclivip are always started on the same node that is running the postgresql resource in Master mode, so that if one or more of the resources fails on the Master node, all of the resources are migrated to the new Master node together.

sudo pcs constraint colocation set pgdbrepvip role=Started set mspostgresql role=Master set pgdbclivip role=Started setoptions score=INFINITY

Then we'll set the order that the cluster resources should be started or promoted in. We'll start the pgdbrepvip resource, then the postgresql resource, and lastly the pgdbclivip resource. It may seem counter-intuitive to start the postgresql service before starting the client VIP (pgdbclivip) that PostgreSQL will be reached on. This works because of the listen_addresses = '*' parameter we set above, which tells PostgreSQL to listen on all addresses regardless of whether they come up after the postgresql service starts. This gives us the benefit of not allowing applications that use the VIP to access the database until it is up.

Note: The current version of pcs doesn't support configuring the ordering set as needed. This was brought up in IRC #linux-cluster and fiest__ is working on a fix to add this ability.

For now we will use the crm command to properly configure the ordering constraint; more detail on the crm command is given below.

sudo crm configure order order-pgdbrepvip-mspostgresql-pgdbclivip-mandatory inf: pgdbrepvip:start mspostgresql:promote pgdbclivip:start

The last setting we'll configure is resource stickiness, which controls how much a cluster resource prefers to stay running where it is. You may like to think of it as the "cost" of any downtime. By default, Pacemaker assumes there is zero cost associated with moving resources and will move them to achieve "optimal" resource placement. We can specify a different stickiness for every resource, but it is often sufficient to change the default as seen below.

After setting the default resource stickiness, when a node is placed into standby mode and then brought back online, resources will remain on the node that they migrated to

sudo pcs property set default-resource-stickiness=100

Something you might have noticed is that there is a location constraint listed in the configuration that will interfere with the execution of the other constraints. This constraint (cli-prefer-pgdbclivip) was created automatically by the pcs resource move command we ran earlier, so it should be removed now that the colocation and ordering constraints are in place.

View the current constraints

sudo pcs constraint list --full
Location Constraints:
  Resource: pgdbclivip
    Enabled on: pgdb1.example.com (score:INFINITY) (role: Started) (id:cli-prefer-pgdbclivip)
Ordering Constraints:
  Resource Sets:
    set pgdbrepvip action=start (id:order-pgdbrepvip-mspostgresql-pgdbclivip-mandatory-0) set mspostgresql action=promote (id:order-pgdbrepvip-mspostgresql-pgdbclivip-mandatory-1) set pgdbclivip action=start (id:order-pgdbrepvip-mspostgresql-pgdbclivip-mandatory-2) setoptions score=INFINITY (id:order-pgdbrepvip-mspostgresql-pgdbclivip-mandatory)
Colocation Constraints:
  Resource Sets:
    set pgdbrepvip role=Started (id:pcs_rsc_set_pgdbrepvip-1) set mspostgresql role=Master (id:pcs_rsc_set_mspostgresql-1) set pgdbclivip role=Started (id:pcs_rsc_set_pgdbclivip-1) setoptions score=INFINITY (id:pcs_rsc_colocation_pgdbrepvip_set_mspostgresql_set_pgdbclivip)

Remove the location constraint; take note of the constraint ID from the output above

sudo pcs constraint remove cli-prefer-pgdbclivip

Fencing and STONITH

Note! This section is still a work in progress

Additional Tools

CRM

Packages pulled down from the OpenSUSE repository at the beginning of this tutorial provide the crm command. This is an alternative to the pcs command used to control/configure the Pacemaker cluster.

Unlike the pcs command, the crm command provides a live interactive shell with tab completion to assist in cluster control/configuration.

sudo crm
crm(live)# cluster status
Services:
corosync         unknown
pacemaker        unknown

Printing ring status.
Local node ID 3
RING ID 0
    id  = 10.10.10.103
    status  = ring 0 active with no faults
crm(live)# status
Last updated: Fri Nov 14 15:48:39 2014
Last change: Fri Nov 14 13:07:04 2014
Stack: cman
Current DC: pgdb1.example.com - partition with quorum
Version: 1.1.11-97629de
3 Nodes configured
5 Resources configured


Online: [ pgdb1.example.com pgdb2.example.com pgdb3.example.com ]

 pgdbrepvip (ocf::heartbeat:IPaddr2):   Started pgdb1.example.com 
 pgdbclivip (ocf::heartbeat:IPaddr2):   Started pgdb1.example.com 
 Master/Slave Set: mspostgresql [postgresql]
     Masters: [ pgdb1.example.com ]
     Slaves: [ pgdb2.example.com pgdb3.example.com ]

CRM_MON

So far in this guide we have used the pcs command for cluster configuration and to get a quick overview of the cluster status. There is, however, a tool provided with the pacemaker-cli package that provides near real-time details and 'monitoring' of the cluster.

The crm_mon command is very extensive and can even provide output in a nagios/icinga format.

The options that I have found most useful are:

  • -A, --show-node-attributes Display node attributes
  • -r, --inactive Display inactive resources
  • -f, --failcounts Display resource fail counts
  • -i, --interval=value Update frequency in seconds

This will display the full resource list, node attributes and fail counts, refreshing every second:

sudo crm_mon -Arf -i1
Last updated: Fri Nov 14 15:09:16 2014
Last change: Fri Nov 14 13:07:04 2014
Stack: cman
Current DC: pgdb1.example.com - partition with quorum
Version: 1.1.11-97629de
3 Nodes configured
5 Resources configured


Online: [ pgdb1.example.com pgdb2.example.com pgdb3.example.com ]

Full list of resources:

pgdbrepvip  (ocf::heartbeat:IPaddr2):   Started pgdb1.example.com
pgdbclivip  (ocf::heartbeat:IPaddr2):   Started pgdb1.example.com
 Master/Slave Set: mspostgresql [postgresql]
     Masters: [ pgdb1.example.com ]
     Slaves: [ pgdb2.example.com pgdb3.example.com ]

Node Attributes:
* Node pgdb1.example.com:
    + master-postgresql                 : 1000
    + postgresql-data-status            : LATEST    
    + postgresql-master-baseline        : 0000000010000090
    + postgresql-receiver-status        : ERROR
    + postgresql-status                 : PRI
* Node pgdb2.example.com:
    + master-postgresql                 : 100
    + postgresql-data-status            : STREAMING|SYNC
    + postgresql-receiver-status        : normal    
    + postgresql-status                 : HS:sync   
    + postgresql-xlog-loc               : 00000000100000F8
* Node pgdb3.example.com:
    + master-postgresql                 : -INFINITY 
    + postgresql-data-status            : STREAMING|POTENTIAL
    + postgresql-receiver-status        : normal    
    + postgresql-status                 : HS:potential

Migration summary:
* Node pgdb1.example.com: 
* Node pgdb2.example.com: 
* Node pgdb3.example.com:

Node Failover

Below is an extensive list of possible failover scenarios that might be experienced within the cluster. I won't be covering each of these in detail, but you should test each of these scenarios before going live to verify the cluster is correctly configured and promotion is happening as expected (a couple of crude ways to simulate failures are sketched after the list of scenarios below).

For each of the exercises below, open a separate terminal window running the crm_mon command to view the failover in near real time.

sudo crm_mon -Arf -i1

Fail the asynchronous node (pgdb3)

Place pgdb3 into standby mode within the cluster.

sudo pcs cluster standby pgdb3.example.com
You should see pgdb3 placed into standby mode and its postgresql-data-status reported as disconnected
Last updated: Mon Nov 17 06:12:16 2014
Last change: Mon Nov 17 06:12:12 2014
Stack: cman
Current DC: pgdb1.example.com - partition with quorum
Version: 1.1.11-97629de
3 Nodes configured
5 Resources configured


Node pgdb3.example.com: standby
Online: [ pgdb1.example.com pgdb2.example.com ]

Full list of resources:

pgdbrepvip      (ocf::heartbeat:IPaddr2):       Started pgdb1.example.com
pgdbclivip      (ocf::heartbeat:IPaddr2):       Started pgdb1.example.com
 Master/Slave Set: mspostgresql [postgresql]
     Masters: [ pgdb1.example.com ]
     Slaves: [ pgdb2.example.com ]
     Stopped: [ pgdb3.example.com ]

Node Attributes:
* Node pgdb1.example.com:
    + master-postgresql                 : 1000
    + postgresql-data-status            : LATEST
    + postgresql-master-baseline        : 00000000100000F8
    + postgresql-receiver-status        : ERROR
    + postgresql-status                 : PRI
* Node pgdb2.example.com:
    + master-postgresql                 : 100
    + postgresql-data-status            : STREAMING|SYNC
    + postgresql-receiver-status        : normal
    + postgresql-status                 : HS:sync
* Node pgdb3.example.com:
    + master-postgresql                 : -INFINITY
    + postgresql-data-status            : DISCONNECT
    + postgresql-status                 : STOP

Migration summary:
* Node pgdb1.example.com:
* Node pgdb2.example.com:
* Node pgdb3.example.com:

Now reset the cluster by placing pgdb3 back into online mode

sudo pcs cluster unstandby pgdb3.example.com

Fail the synchronous node (pgdb2)

Place pgdb2 into standby mode within the cluster

sudo pcs cluster standby pgdb2.example.com
You should see pgdb2 placed into standby mode and its postgresql-data-status reported as disconnected. Notice that pgdb3 is now the synchronous node, with its postgresql-data-status set to *|SYNC
Last updated: Mon Nov 17 06:21:23 2014
Last change: Mon Nov 17 06:21:20 2014
Stack: cman
Current DC: pgdb1.example.com - partition with quorum
Version: 1.1.11-97629de
3 Nodes configured
5 Resources configured


Node pgdb2.example.com: standby
Online: [ pgdb1.example.com pgdb3.example.com ]

Full list of resources:

pgdbrepvip      (ocf::heartbeat:IPaddr2):       Started pgdb1.example.com
pgdbclivip      (ocf::heartbeat:IPaddr2):       Started pgdb1.example.com
 Master/Slave Set: mspostgresql [postgresql]
     Masters: [ pgdb1.example.com ]
     Slaves: [ pgdb3.example.com ]
     Stopped: [ pgdb2.example.com ]

Node Attributes:
* Node pgdb1.example.com:
    + master-postgresql                 : 1000
    + postgresql-data-status            : LATEST
    + postgresql-master-baseline        : 00000000100000F8
    + postgresql-receiver-status        : ERROR
    + postgresql-status                 : PRI
* Node pgdb2.example.com:
    + master-postgresql                 : -INFINITY
    + postgresql-data-status            : DISCONNECT
    + postgresql-status                 : STOP
* Node pgdb3.example.com:
    + master-postgresql                 : 100
    + postgresql-data-status            : STREAMING|SYNC
    + postgresql-receiver-status        : normal
    + postgresql-status                 : HS:sync

Migration summary:
* Node pgdb1.example.com:
* Node pgdb2.example.com:
* Node pgdb3.example.com:
Now reset the cluster by placing pgdb2 back into online mode
sudo pcs cluster unstandby pgdb2.example.com
Notice that pgdb2 is now the asynchronous node with its postgresql-data-status set to *|POTENTIAL
Last updated: Mon Nov 17 06:25:23 2014
Last change: Mon Nov 17 06:25:22 2014
Stack: cman
Current DC: pgdb1.example.com - partition with quorum
Version: 1.1.11-97629de
3 Nodes configured
5 Resources configured


Online: [ pgdb1.example.com pgdb2.example.com pgdb3.example.com ]

Full list of resources:

pgdbrepvip      (ocf::heartbeat:IPaddr2):       Started pgdb1.example.com
pgdbclivip      (ocf::heartbeat:IPaddr2):       Started pgdb1.example.com
 Master/Slave Set: mspostgresql [postgresql]
     Masters: [ pgdb1.example.com ]
     Slaves: [ pgdb2.example.com pgdb3.example.com ]

Node Attributes:
* Node pgdb1.example.com:
    + master-postgresql                 : 1000
    + postgresql-data-status            : LATEST
    + postgresql-master-baseline        : 00000000100000F8
    + postgresql-receiver-status        : ERROR
    + postgresql-status                 : PRI
* Node pgdb2.example.com:
    + master-postgresql                 : -INFINITY
    + postgresql-data-status            : STREAMING|POTENTIAL
    + postgresql-receiver-status        : normal
    + postgresql-status                 : HS:potential
* Node pgdb3.example.com:
    + master-postgresql                 : 100
    + postgresql-data-status            : STREAMING|SYNC
    + postgresql-receiver-status        : normal
    + postgresql-status                 : HS:sync

Migration summary:
* Node pgdb1.example.com:
* Node pgdb2.example.com:
* Node pgdb3.example.com:

Fail the 'Master' node (pgdb1)

Place pgdb1 into standby mode within the cluster

sudo pcs cluster standby pgdb1.example.com
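
Once pgdb1 enters standby you should see the mspostgresql Master role, along with both VIPs, move to the synchronous replica (pgdb2 in the earlier output). When you are done observing the failover, check the status and bring pgdb1 back online; with the default resource stickiness set above, the resources will remain on the newly promoted Master:

sudo pcs status
sudo pcs cluster unstandby pgdb1.example.com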

Potential Failover Scenarios

  1. Master node loses network connectivity
  2. Master nodes' postgresql service fails
  3. Master node loses system power
  4. Master nodes' pgdbrepvip pacemaker resource fails
  5. Master nodes' pgdbclivip pacemaker resource fails
  6. Sync Replica node loses network connectivity
  7. Sync Replica nodes' postgresql service fails
  8. Sync Replica node loses system power
  9. Async Replica node loses network connectivity
  10. Async Replica nodes' postgresql service fails
  11. Async Replica node loses system power
  12. Sync and Async Replica nodes lose network connectivity
  13. Sync and Async Replica nodes lose system power
  14. Sync and Async Replica nodes postgresql service fails
  15. Master and Sync Replica nodes lose network connectivity
  16. Master and Sync Replica nodes lose system power
  17. Master and Sync Replica nodes postgresql service fails
  18. Master and Async Replica nodes lose network connectivity
  19. Master and Async Replica nodes lose system power
  20. Master and Async Replica nodes postgresql service fails
  21. Master,Sync Replica and Async Replica nodes lose network connectivity
  22. Master,Sync Replica and Async Replica nodes lose system power
  23. Master,Sync Replica and Async Replica nodes postgresql service fails
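
A few of these scenarios are easy to trigger in this disposable Vagrant environment. The commands below are rough sketches only (the interface name and the use of pkill are assumptions about the test VMs) and should never be run against production systems:

# Scenarios 2/7/10: kill the postgresql processes on a node; Pacemaker's monitor
# operation should detect the failure and recover or fail over the resource
sudo pkill -9 -u postgres

# Scenarios 1/6/9: drop network connectivity on a node (assumes cluster traffic
# uses eth1 on the Vagrant VMs); restore it with ifup when finished
sudo ifdown eth1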

Production Considerations

This tutorial details only a basic cluster setup and is not intended to be used in a production environment. Below are some points that you should consider within a production environment.

System Configuration

  • Make sure you have a developed and well-tested procedure for testing updates to the cluster components ( pacemaker pcs corosync fence-agents crmsh cman ccs )
  • Verify that DNS is 100% complete and accurate for all cluster nodes and pacemaker IP resources
  • Packages downloaded from the OpenSUSE repositories should be hosted and maintained in a local repository. The OpenSUSE repository doesn't seem to support rsync or reposync so I've created a script to sync down the needed packages locally.
#!/bin/bash

DATE=`/bin/date +%Y-%m-%d`
OUTDIR='/path/to/logs/dir/'
OUTFILE=$OUTDIR/ha-cluster-mirror-$DATE.txt
[ -d $OUTDIR ] || mkdir -p $OUTDIR

URL="http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/x86_64/"
REPOKEY="http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/repodata/repomd.xml.key"
DESDIR="/repo/dir/path/ha-cluster"
KEYDIR="/repo/dir/path/ha-cluster/repodata"

CRMRPMS=`wget -4qO - $URL | grep -oe "crm.*.rpm" | cut -d'"' -f1`

for i in $CRMRPMS; do
  wget -4 -N -P $DESDIR $URL$i >> $OUTFILE 2>&1
done

PSRPMS=`wget -4qO - $URL | grep -oe "pssh.*.rpm" | cut -d'"' -f1`

for i in $PSRPMS; do
  wget -4 -N -P $DESDIR $URL$i >> $OUTFILE 2>&1
done

wget -4 -N -P $KEYDIR $REPOKEY >> $OUTFILE 2>&1

/usr/bin/createrepo /repo/dir/path/ha-cluster >> $OUTFILE 2>&1

Cluster Configuration

Configure network-based fencing, as it provides the flexibility of being able to access the machine via an alternative connection (LOM, iDRAC, console) and view the state of a fenced machine.

Our current solution is to use the fence_ifmib script located in /usr/sbin that is provided with the fence-agents package to administratively disable ports that fenced nodes are connected to.

PostgreSQL Best Practices

  • A unique and strong password should be set for the replicator PostgreSQL account
  • A unique and strong password should be set for the postgres NIX account
  • The pg_hba.conf file should limit the nodes allowed to replicate from the master with the method trust via specific IPs (/32)
    # pgdb2.example.com
    host    replication     replicator        10.10.10.102/32       trust
    
  • A more robust archive_command should be used. It's recommended to use a script that does error checking. Below is an example script that gzips the xlogs into an archive directory and rsyncs them to the other nodes.

#!/bin/bash

archivedir="/var/lib/pgsql/9.3/archive"
synckey="/var/lib/pgsql/.ssh/pgarchivesync"

# Exit code to Postgres
FAILURE=0

# Copy the file locally to the archive directory
/bin/gzip < $1 > $archivedir/$2.gz
rc=$?
if [ $rc != 0 ]; then
  FAILURE=1
  exit 1
fi

me=`hostname -f`
sitea_nodes="pgdb1.sitea.com pgdb2.sitea.com pgdb3.sitea.com"
siteb_nodes="pgdb1.siteb.com pgdb2.siteb.com pgdb3.siteb.com pgdb4.siteb.com"
all_nodes="${sitea_nodes} ${siteb_nodes}"
# Remove myself from the node list, no need to sync to myself
nodes=(${all_nodes[@]//${me}})
nodes=`echo ${nodes[@]}`

#Get my domain
mydomain=`/bin/dnsdomainname`

# Set a list of nodes to verify the archive logs are synced to
if [ "${mydomain}" == "siteb.com" ];then
  all_verifynodes=${siteb_nodes}
  verifynodes=(${all_verifynodes[@]//${me}})
  verifynodes=`echo ${verifynodes[@]}`
elif [ "${mydomain}" == "sitea.com" ];then
  all_verifynodes=${sitea_nodes}
  verifynodes=(${all_verifynodes[@]//${me}})
  verifynodes=`echo ${verifynodes[@]}`
fi

# Sync the archive dir with the currently correct replicas
for node in ${nodes}; do
  /usr/bin/nc -z -w2 ${node} 22 > /dev/null 2>&1
  rc=$?
  if [ $rc != 0 ]; then
    /usr/bin/logger "PGSQL Archive Sync Failure: ${node} is not accessible for archive syncing, skipping this node"
    if [[ ${verifynodes[*]} =~ ${node} ]]; then
      FAILURE=1
    fi
  else
    /usr/bin/rsync -avzp --delete -e "ssh -i $synckey" $archivedir/ postgres@$node:$archivedir
    rc=$?
    if [ $rc != 0 ]; then
      /usr/bin/logger "PGSQL Archive Sync Failure: ${node} RSYNC failure"
      if [[ ${verifynodes[*]} =~ ${node} ]]; then
        FAILURE=1
      fi
    fi
  fi
done

exit $FAILURE
  • In place of a single command, a script should be used for the restore_command attribute/parameter in both the cluster configuration and postgresql.conf (a sketch is given below).
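
A minimal sketch of such a restore script is shown below. It assumes the same gzip'd archive layout used by the archive sync script above; the script name (pgrestorewal.sh) and logging details are placeholders to adapt to your environment. It would be referenced as restore_command = '/usr/local/sbin/pgrestorewal.sh %f %p'.

#!/bin/bash
# pgrestorewal.sh - restore a single WAL segment from the local archive directory
# Called by PostgreSQL as: pgrestorewal.sh %f %p  (WAL file name, destination path)

archivedir="/var/lib/pgsql/9.3/archive"
walfile="$1"
destination="$2"

# Fail fast if the requested segment has not been archived locally
if [ ! -f "${archivedir}/${walfile}.gz" ]; then
  /usr/bin/logger "PGSQL Restore: ${walfile} not found in ${archivedir}"
  exit 1
fi

# Decompress the archived segment to the path PostgreSQL asked for
/bin/gunzip < "${archivedir}/${walfile}.gz" > "${destination}"
rc=$?
if [ $rc != 0 ]; then
  /usr/bin/logger "PGSQL Restore: failed to restore ${walfile}"
  exit 1
fi

exit 0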

Monitoring

  • Monitor everything! A minimal starting point is sketched below.
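
As a minimal starting point (a sketch, not a complete monitoring setup), both the cluster and the database can be polled with one-line checks. crm_mon -s prints a single-line cluster summary suitable for Nagios-style checks, and pg_isready (shipped with PostgreSQL 9.3) reports whether the database is accepting connections on the client VIP:

sudo crm_mon -s
/usr/pgsql-9.3/bin/pg_isready -h 10.10.10.105 -p 5432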