====== KeepaliveD ======

[[https://docs.linuxfabrik.ch/software/keepalived.html]]

In this example, KeepaliveD monitors a MySQL/MariaDB cluster.
This configuration was set up and tested on ''Ubuntu 16.04.3 LTS (xenial)''.


===== Installing KeepaliveD =====

  > apt install keepalived


==== Configuration ====

Next, enable automatic start at boot.
  > update-rc.d keepalived defaults


==== sysctl.conf ====

Allow the virtual IP

To allow services to bind to IP addresses that are not (yet) assigned to a local interface, an entry in ''/etc/sysctl.conf'' is required.
  > echo "net.ipv4.ip_nonlocal_bind = 1" >> /etc/sysctl.conf
  > sysctl -p
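
To confirm that the setting is really active after ''sysctl -p'', the value can be read back from ''/proc''. The following is a minimal sketch; the function name ''nonlocal_bind_enabled'' and its optional path argument are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch: report whether net.ipv4.ip_nonlocal_bind is enabled.
# nonlocal_bind_enabled is a hypothetical helper; the optional argument
# only exists so that a different file can be checked.
nonlocal_bind_enabled() {
    local proc_file="${1:-/proc/sys/net/ipv4/ip_nonlocal_bind}"
    [ -r "${proc_file}" ] && [ "$(cat "${proc_file}")" = "1" ]
}

if nonlocal_bind_enabled; then
    echo "ip_nonlocal_bind is active"
else
    echo "ip_nonlocal_bind is NOT active"
fi
```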


==== keepalived.conf ====

KeepaliveD on Linux is configured centrally in the file ''/etc/keepalived/keepalived.conf''.


=== Checking reachability / service ===

Before writing the configuration, decide how unreachability is defined.
The simplest method is a ping. \\
However, a server can respond to ping even though the service (MySQL, Apache, Samba, mail, DNS, proxy, etc.) does not respond.

Checking the service itself has therefore become established practice.
Signal number 0 of ''kill'' and ''killall'' has no symbolic name and only tests whether a process is running. \\
The first part of the configuration file therefore consists of the following block.
<code>
vrrp_script chk_dienst {
	#script "killall -0 mysqld"      # simplest way to check a service - does not work reliably together with KeepaliveD
	script "mysqlshow --defaults-file=/root/.my.cnf >/dev/null"
	interval 2                       # check every 2 seconds
	weight 2                         # add 2 points if OK
	fall 2
	rise 2
}
</code>
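
The behaviour of signal 0 can be tried out directly in a shell. This is a minimal sketch; ''is_running'' is a hypothetical helper name. Note that ''kill -0'' also fails when the process exists but belongs to another user and the caller is not allowed to signal it.

```shell
#!/bin/bash
# is_running is a hypothetical helper name.
# Signal 0 delivers nothing; kill only reports whether the PID exists
# and whether the caller is allowed to signal it.
is_running() {
    kill -0 "$1" 2>/dev/null
}

if is_running "$$"; then
    echo "PID $$ is running"
fi
```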

Master and slave roles are assigned through the priority.
The rule is ''higher = more important''. \\
For this reason we set ''101'' on the master and ''100'' on the backup server. \\
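
How the priorities interact with ''weight 2'' can be worked through with a little arithmetic: with a positive weight, keepalived adds the weight to the configured priority while the check succeeds. The sketch below only models this rule; ''effective_priority'' is a hypothetical helper and not part of keepalived.

```shell
#!/bin/bash
# Sketch: effective VRRP priority with a positive track_script weight.
# effective_priority is a hypothetical helper, not part of keepalived.
# $1 = configured priority, $2 = weight, $3 = exit code of the check script
effective_priority() {
    local base="$1" weight="$2" check_rc="$3"
    if [ "${check_rc}" -eq 0 ]; then
        echo $(( base + weight ))
    else
        echo "${base}"
    fi
}

effective_priority 101 2 0   # master, check OK      -> 103
effective_priority 101 2 1   # master, check failed  -> 101
effective_priority 100 2 0   # backup, check OK      -> 102
```

With these numbers a master whose check fails (101) drops below a healthy backup (102), so the virtual IP moves to the backup.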

This check can of course be adapted as needed.
For example, you could write your own Bash script for more complex checks.
Only the return value matters:
  0 = true - reachable
  1 (or any other non-zero value) = false - not reachable
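
A custom check script could, for instance, combine several conditions and report a single exit code. The following is only a sketch of that idea; ''check_all'' and the commands in the usage comment are assumptions, not part of the setup described here.

```shell
#!/bin/bash
# check_all is a hypothetical helper: it succeeds (0) only if every
# command passed as an argument succeeds, otherwise it returns 1.
check_all() {
    local cmd
    for cmd in "$@"; do
        if ! bash -c "${cmd}" >/dev/null 2>&1; then
            return 1
        fi
    done
    return 0
}

# Possible use at the end of a check script (commands are examples only):
#   check_all "pgrep -x mysqld" "mysqlshow --defaults-file=/root/.my.cnf"
#   exit $?
```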


=== Cluster IP + backup IP ===

//Using the centralization cluster as an example.//

<file bash /usr/local/bin/notify_Instance_MAIP_3306.sh>
#!/bin/bash
# Monitoring information for Check_MK
echo $1 $2 is in $3 state > /var/run/keepalive_Instance_MAIP_3306.state
</file>

<file bash /usr/local/bin/notify_Instance_BUIP_3306.sh>
#!/bin/bash
# Monitoring information for Check_MK
echo $1 $2 is in $3 state > /var/run/keepalive_Instance_BUIP_3306.state
</file>

<file bash /etc/keepalived/keepalived.conf 1>
# http://www.keepalived.org/doc/configuration_synopsis.html
global_defs {
no_email_faults
vrrp_no_swap
}

# Master
vrrp_script chk_dienst_ma {
	script "/root/bin/check_db.sh $(hostname -s) 3306 checkuser geheimes-passwort"      # check MySQL DB
        interval 2
        weight 2
	fall 2
	rise 2
}

vrrp_instance MAIP_3306 {
        interface bond0.24
        state BACKUP
        virtual_router_id 101
        priority 98
        advert_int 1

        track_script {
                chk_dienst_ma
        }

        authentication {
                auth_type PASS
                auth_pass geheimes-passwort
        }

        virtual_ipaddress {
                10.10.2.20
        }

	unicast_src_ip 10.10.2.21
	unicast_peer {
		#10.10.2.21
		10.10.2.22
		10.10.2.23
	}
        notify "/usr/local/bin/notify_Instance_MAIP_3306.sh"
}
#
# Backup
vrrp_script chk_dienst_bu {
	script "/root/bin/check_db.sh $(hostname -s) 3306 checkuser geheimes-passwort"      # check MySQL DB
        interval 2
        weight 2
	fall 2
	rise 2
}

vrrp_instance BUIP_3306 {
        interface bond0.24
        state BACKUP
        virtual_router_id 102
        priority 2
        advert_int 2

        track_script {
                chk_dienst_bu
        }

        authentication {
                auth_type PASS
                auth_pass geheimes-passwort
        }

        virtual_ipaddress {
                10.10.2.24
        }

	unicast_src_ip 10.10.2.21
	unicast_peer {
		#10.10.2.21
		10.10.2.22
		10.10.2.23
	}
        notify "/usr/local/bin/notify_Instance_BUIP_3306.sh"
}
</file>

<file bash /etc/keepalived/keepalived.conf 2>
# http://www.keepalived.org/doc/configuration_synopsis.html
global_defs {
no_email_faults
vrrp_no_swap
}

# Master
vrrp_script chk_dienst_ma {
        script "/root/bin/check_db.sh $(hostname -s) 3306 checkuser geheimes-passwort"      # check MySQL DB
        interval 2
        weight 2
	fall 2
	rise 2
}

vrrp_instance MAIP_3306 {
        interface bond0.24
        state BACKUP
        virtual_router_id 101
        priority 97
        advert_int 1

        track_script {
                chk_dienst_ma
        }

        authentication {
                auth_type PASS
                auth_pass geheimes-passwort
        }

        virtual_ipaddress {
                10.10.2.20
        }

        unicast_src_ip 10.10.2.22
        unicast_peer {
                10.10.2.21
                #10.10.2.22
                10.10.2.23
        }
        notify "/usr/local/bin/notify_Instance_MAIP_3306.sh"
}
#
# Backup
vrrp_script chk_dienst_bu {
        script "/root/bin/check_db.sh $(hostname -s) 3306 checkuser geheimes-passwort"      # check MySQL DB
        interval 2
        weight 2
	fall 2
	rise 2
}

vrrp_instance BUIP_3306 {
        interface bond0.24
        state BACKUP
        virtual_router_id 102
        priority 3
        advert_int 2

        track_script {
                chk_dienst_bu
        }

        authentication {
                auth_type PASS
                auth_pass geheimes-passwort
        }

        virtual_ipaddress {
                10.10.2.24
        }

        unicast_src_ip 10.10.2.22
        unicast_peer {
                10.10.2.21
                #10.10.2.22
                10.10.2.23
        }
	notify "/usr/local/bin/notify_Instance_BUIP_3306.sh"
}
</file>

<file bash /etc/keepalived/keepalived.conf 3>
# http://www.keepalived.org/doc/configuration_synopsis.html
global_defs {
no_email_faults
vrrp_no_swap
}

# Master
vrrp_script chk_dienst_ma {
        script "/root/bin/check_db.sh $(hostname -s) 3306 checkuser geheimes-passwort"      # check MySQL DB
        interval 2
        weight 2
	fall 2
	rise 2
}

vrrp_instance MAIP_3306 {
        interface bond0.24
        state BACKUP
        virtual_router_id 101
        priority 96
        advert_int 1

        track_script {
                chk_dienst_ma
        }

        authentication {
                auth_type PASS
                auth_pass geheimes-passwort
        }

        virtual_ipaddress {
                10.10.2.20
        }

        unicast_src_ip 10.10.2.23
        unicast_peer {
                10.10.2.21
                10.10.2.22
                #10.10.2.23
        }
        notify "/usr/local/bin/notify_Instance_MAIP_3306.sh"
}
#
# Backup
vrrp_script chk_dienst_bu {
        script "/root/bin/check_db.sh $(hostname -s) 3306 checkuser geheimes-passwort"      # check MySQL DB
        interval 2
        weight 2
	fall 2
	rise 2
}

vrrp_instance BUIP_3306 {
        interface bond0.24
        state BACKUP
        virtual_router_id 102
        priority 4
        advert_int 2

        track_script {
                chk_dienst_bu
        }

        authentication {
                auth_type PASS
                auth_pass geheimes-passwort
        }

        virtual_ipaddress {
                10.10.2.24
        }

        unicast_src_ip 10.10.2.23
        unicast_peer {
                10.10.2.21
                10.10.2.22
                #10.10.2.23
        }
	notify "/usr/local/bin/notify_Instance_BUIP_3306.sh"
}
</file>
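
The three files above differ only in the priorities, in ''unicast_src_ip'', and in which peer is commented out. The unicast part could therefore be generated from one node list. A minimal sketch, assuming a hypothetical helper ''render_unicast'' that takes the node's own IP followed by all cluster IPs:

```shell
#!/bin/bash
# render_unicast is a hypothetical helper.
# $1 = IP of this node, remaining arguments = IPs of all cluster nodes
render_unicast() {
    local self="$1"
    shift
    local ip
    printf 'unicast_src_ip %s\n' "${self}"
    printf 'unicast_peer {\n'
    for ip in "$@"; do
        # every node except this one is a peer
        if [ "${ip}" != "${self}" ]; then
            printf '    %s\n' "${ip}"
        fi
    done
    printf '}\n'
}

# Example for node 2 of the cluster shown above
render_unicast 10.10.2.22 10.10.2.21 10.10.2.22 10.10.2.23
```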


==== Cluster IP management with KeepaliveD on a MySQL cluster ====

For example (for Galera/WSREP):
<file bash /root/bin/check_db.sh>
#!/bin/bash
 
#
# DB-Check
#
# /root/bin/check_db.sh $(hostname -s) 3306 root geheimespasswort
#
 
unset node_response
mysql_host="${1}";
mysql_port="${2}";
mysql_user="${3}";
mysql_pass="${4}";
 
OPTION="/tmp/check_db.cfg"
touch ${OPTION}
chmod 0600 ${OPTION}
echo "
[client]
host     = localhost
port     = ${mysql_port}
user     = ${mysql_user}
password = "${mysql_pass}"
" > ${OPTION}
 
node_response=$(echo "SHOW GLOBAL VARIABLES LIKE 'hostname';" | mysql --defaults-file=${OPTION} -N | awk '{ print $2 }');
wsrep_state=$(echo "SHOW STATUS LIKE 'wsrep_local_state_comment';" | mysql --defaults-file=${OPTION} -N | awk '{ print $2 }');
rm -f ${OPTION}
 
echo "${mysql_host} ? ${node_response} / ${wsrep_state}" > /tmp/${mysql_host}ma.txt
if [ "${wsrep_state}" = "Synced" ]
then
        if [ "${node_response}" == "${mysql_host}" ]
        then
#               echo "Hostname matched"
                exit 0;
        else
#               echo "Hostname not matched"
                exit 1;
        fi
else
#       echo "Node is not synced"
        exit 1;
fi
</file>
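
The script above writes the credentials to the fixed path ''/tmp/check_db.cfg''. A possible variation is to let ''mktemp'' pick a unique file name, which avoids collisions between parallel runs. This is only a sketch; ''write_client_cnf'' is a hypothetical helper, and the permission check in the comment assumes GNU coreutils.

```shell
#!/bin/bash
# write_client_cnf is a hypothetical helper.
# $1 = port, $2 = user, $3 = password; prints the path of the created file
write_client_cnf() {
    local port="$1" user="$2" pass="$3" cnf
    # GNU mktemp creates the file readable and writable by the owner only
    cnf="$(mktemp)" || return 1
    printf '[client]\nhost = localhost\nport = %s\nuser = %s\npassword = "%s"\n' \
        "${port}" "${user}" "${pass}" > "${cnf}"
    echo "${cnf}"
}

# Possible use inside the check script:
#   OPTION="$(write_client_cnf "${mysql_port}" "${mysql_user}" "${mysql_pass}")"
#   trap 'rm -f "${OPTION}"' EXIT
```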

For example (for GTID with channels):
<file bash /root/bin/check_db.sh>
#!/bin/bash
 
#==============================================================================#
#
# DB-Backup-IP-Check
#
# /root/bin/check_db.sh $(hostname -s) 3306 root geheimespasswort
#
#==============================================================================#
 
VERSION="v2019021900"
 
 
if [ "x${3}" = x ] ; then
        echo "${0} [Hostname] [Port] [User]"
        echo "${0} [Hostname] [Port] [User] [Password]"
        echo "${0} \$(hostname -s) 3306 dbuser geheimespasswort"
        exit 10
else
        mysql_host="${1}"
fi
 
#==============================================================================#
 
#------------------------------------------------------------------------------#
 
ROOT_FILE="/root/.my.cnf"
ROOT_U="$(cat "${ROOT_FILE}" | awk '/^user/{print $3}')"
ROOT_P="$(cat "${ROOT_FILE}" | awk '/^password/{print $3}')"

STECKER="/var/run/mysqld/mysqld.sock"
 
#==============================================================================#
 
#------------------------------------------------------------------------------#
### the master IP must be brought up first
 
### enable only for the backup IP
#sleep 3
 
#------------------------------------------------------------------------------#
### if this DBMS is not running, the IP must not be activated here
 
AUSGABE="$(echo "SHOW SLAVE STATUS \G;" | mysql -S ${STECKER} 2>/dev/null || echo Aus)"
 
### always enabled
if [ "${AUSGABE}" = "Aus" ] ; then
        echo "DB is not Running"
        exit 1
else
        CHANNEL_NAMEN="$(echo "${AUSGABE}" | awk '/Channel_Name:/{print $NF}')"
fi
 
#------------------------------------------------------------------------------#
### checks whether this MySQL DBMS is running on the correct host
 
node_response="$(echo "SHOW GLOBAL VARIABLES LIKE 'hostname';" | mysql -S ${STECKER} -N 2>/dev/null | awk '{ print $2 }')"
 
if [ "${node_response}" != "${mysql_host}" ] ; then
        echo "Hostname not matched: ${node_response}/${mysql_host}"
        exit 1;
fi
 
#------------------------------------------------------------------------------#
### each channel must be checked separately
### errors only matter here if no channel is running properly
 
STATUS_GUT="$(for KANAL in ${CHANNEL_NAMEN}
do
        #----------------------------------------------------------------------#
        ### read the status of this channel
 
        SLAVE_STATUS="$(echo "SHOW SLAVE STATUS FOR CHANNEL '${KANAL}' \G;" | mysql -S ${STECKER} -t 2>/dev/null)"
 
        MASTER_HOST="$(echo "${SLAVE_STATUS}" | fgrep "Master_Host:" | awk '{print $NF}')"
        if [ "${MASTER_HOST}" != "${node_response}" ] ; then
                MASTER_PORT="$(echo "${SLAVE_STATUS}" | fgrep "Master_Port:" | awk '{print $NF}')"
                SLAVE_IO_RUNNING="$(echo "${SLAVE_STATUS}" | fgrep "Slave_IO_Running:" | awk '{print $NF}')"
                SLAVE_SQL_RUNNING="$(echo "${SLAVE_STATUS}" | fgrep "Slave_SQL_Running:" | awk '{print $NF}')"
                SECONDS_BEHIND_MASTER="$(echo "${SLAVE_STATUS}" | fgrep "Seconds_Behind_Master:" | awk '{print $NF}')"
 
                #--------------------------------------------------------------#
                ### check that the connection goes to the correct port
 
                unset MPORT
                if [ "${2}" -eq "${MASTER_PORT}" ] ; then
                        echo "Master_Port: ${MASTER_PORT}" > /tmp/${2}ma.txt
                        MPORT="Port"
                fi
 
                #--------------------------------------------------------------#
                ### checks whether this node is connected to the cluster
 
                unset RUNNING
                IO_RUNNING="$(echo "${SLAVE_IO_RUNNING}" | fgrep Yes)"
                if [ "${IO_RUNNING}" = "Yes" ] ; then
                        SQL_RUNNING="$(echo "${SLAVE_SQL_RUNNING}" | fgrep Yes)"
                        if [ "${SQL_RUNNING}" = "Yes" ] ; then
                                echo "Running: ${IO_RUNNING}/${SQL_RUNNING}" >> /tmp/${2}ma.txt
                                RUNNING="Running"
                        fi
                fi
 
                #--------------------------------------------------------------#
                #-# this condition does not necessarily have to be met on the backup slave
                ### checks whether this node is in sync with the cluster
 
                unset SEKUNDEN
                if [ "${SECONDS_BEHIND_MASTER}" = "NULL" ] ; then
                        echo "Seconds_Behind_Master: ${SECONDS_BEHIND_MASTER}" >> /tmp/${2}ma.txt
                elif [ "${SECONDS_BEHIND_MASTER}" -eq "0" ] ; then
                        echo "Seconds_Behind_Master: ${SECONDS_BEHIND_MASTER}" >> /tmp/${2}ma.txt
                        SEKUNDEN="Seconds"
                fi
 
                echo "${MPORT} ${RUNNING} ${SEKUNDEN}"
                #--------------------------------------------------------------#
        fi
        #----------------------------------------------------------------------#
done | fgrep "Port Running")"
 
 
if [ "x${STATUS_GUT}" = "x" ] ; then
        echo "${2}ma $(date +'%F %T')" >> /tmp/KO.txt
        exit 1
fi
</file>
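
The script above extracts each field from the ''SHOW SLAVE STATUS \G'' output with ''fgrep'' and ''awk''. As a possible simplification, a single ''awk'' call can match the field name exactly. The following sketch assumes a hypothetical helper ''slave_field'' that reads the status output on stdin; fields without a value yield an empty string.

```shell
#!/bin/bash
# slave_field is a hypothetical helper.
# stdin = output of "SHOW SLAVE STATUS \G", $1 = field name without colon
slave_field() {
    awk -v key="$1:" '$1 == key { print (NF > 1 ? $NF : "") }'
}

# Possible use inside the check script:
#   SLAVE_IO_RUNNING="$(echo "${SLAVE_STATUS}" | slave_field Slave_IO_Running)"
```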


=== Each database node needs its own KeepaliveD configuration ===

Node 1:
<file c /etc/keepalived/keepalived.conf>
vrrp_script chk_dienst {
        #script "/usr/bin/pgrep mysqld"                                       # check MySQL DB
        #script "killall -0 /usr/sbin/mysqld"                                 # check MySQL DB
        script "/root/bin/check_db.sh $(hostname -s) 3306 checkuser geheim"   # check MySQL DB
        interval 2                      # check every 2 seconds
        weight 2                        # add 2 points if OK
	fall 2
	rise 2
}

vrrp_instance meine_mysql_db {
        interface eth0                  # interface to monitor
        state BACKUP
        priority 100
        virtual_router_id 123           # virtual router ID
        virtual_ipaddress {
                10.10.10.10             # the virtual IP address
        }
        track_script {
                chk_dienst
        }

        unicast_src_ip 10.10.10.97      # own IP (node 1)
        unicast_peer {
                10.10.10.98             # IP of node 2
                10.10.10.99             # IP of node 3
        }
        authentication {
                auth_type PASS
                auth_pass geheim_1234
        }

        # for ANY state transition.
        # "notify" script is called AFTER the
        # notify_* script(s) and is executed
        # with 3 arguments provided by keepalived
        # (ie dont include parameters in the notify line).
        # arguments
        # $1 = "GROUP"|"INSTANCE"
        # $2 = name of group or instance
        # $3 = target state of transition
        #     ("MASTER"|"BACKUP"|"FAULT")
        notify /root/bin/keepalived_notify.sh
}
</file>

Node 2:
<file c /etc/keepalived/keepalived.conf>
vrrp_script chk_dienst {
        #script "/usr/bin/pgrep mysqld"                                       # check MySQL DB
        #script "killall -0 /usr/sbin/mysqld"                                 # check MySQL DB
        script "/root/bin/check_db.sh $(hostname -s) 3306 checkuser geheim"   # check MySQL DB
        interval 2                      # check every 2 seconds
        weight 2                        # add 2 points if OK
	fall 2
	rise 2
}

vrrp_instance meine_mysql_db {
        interface eth0                  # interface to monitor
        state BACKUP
        priority 100
        virtual_router_id 123           # virtual router ID
        virtual_ipaddress {
                10.10.10.10             # the virtual IP address
        }
        track_script {
                chk_dienst
        }

        unicast_src_ip 10.10.10.98      # own IP (node 2)
        unicast_peer {
                10.10.10.97             # IP of node 1
                10.10.10.99             # IP of node 3
        }
        authentication {
                auth_type PASS
                auth_pass geheim_1234
        }

        # for ANY state transition.
        # "notify" script is called AFTER the
        # notify_* script(s) and is executed
        # with 3 arguments provided by keepalived
        # (ie dont include parameters in the notify line).
        # arguments
        # $1 = "GROUP"|"INSTANCE"
        # $2 = name of group or instance
        # $3 = target state of transition
        #     ("MASTER"|"BACKUP"|"FAULT")
        notify /root/bin/keepalived_notify.sh
}
</file>

Node 3:
<file c /etc/keepalived/keepalived.conf>
vrrp_script chk_dienst {
        #script "/usr/bin/pgrep mysqld"                                       # check MySQL DB
        #script "killall -0 /usr/sbin/mysqld"                                 # check MySQL DB
        script "/root/bin/check_db.sh $(hostname -s) 3306 checkuser geheim"   # check MySQL DB
        interval 2                      # check every 2 seconds
        weight 2                        # add 2 points if OK
	fall 2
	rise 2
}

vrrp_instance meine_mysql_db {
        interface eth0                  # interface to monitor
        state BACKUP
        priority 100
        virtual_router_id 123           # virtual router ID
        virtual_ipaddress {
                10.10.10.10             # the virtual IP address
        }
        track_script {
                chk_dienst
        }

        unicast_src_ip 10.10.10.99      # own IP (node 3)
        unicast_peer {
                10.10.10.97             # IP of node 1
                10.10.10.98             # IP of node 2
        }
        authentication {
                auth_type PASS
                auth_pass geheim_1234
        }

        # for ANY state transition.
        # "notify" script is called AFTER the
        # notify_* script(s) and is executed
        # with 3 arguments provided by keepalived
        # (ie dont include parameters in the notify line).
        # arguments
        # $1 = "GROUP"|"INSTANCE"
        # $2 = name of group or instance
        # $3 = target state of transition
        #     ("MASTER"|"BACKUP"|"FAULT")
        notify /root/bin/keepalived_notify.sh
}
</file>

<file bash /root/bin/keepalived_notify.sh>
#!/bin/bash

# /root/bin/keepalived_notify.sh

echo $1 $2 is in $3 state > /var/run/keepalive.$1.$2.state
</file>

  > cat /var/run/keepalive.INSTANCE.meine_mysql_db.state
  INSTANCE meine_mysql_db is in MASTER state

  > chmod 0640 /etc/keepalived/keepalived.conf

//** Starting Keepalived **//

The configuration is now complete.
To start Keepalived and make the virtual IP available on the network,
it is enough to restart the service.
  > service keepalived restart
  > service keepalived status
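
After the restart it can be useful to check on which node the virtual IP is currently assigned. The sketch below assumes a hypothetical helper ''has_vip'' that reads the output of ''ip -o -4 addr show'' on stdin; the address ''10.10.10.10'' is the virtual IP from the example configuration above.

```shell
#!/bin/bash
# has_vip is a hypothetical helper: it reads "ip -o -4 addr show" output
# on stdin and succeeds if the given address is assigned to an interface.
has_vip() {
    local vip="$1"
    # leading space and trailing slash prevent partial matches
    # (e.g. 10.10.10.1 inside 10.10.10.10)
    grep -qF " ${vip}/"
}

if command -v ip >/dev/null 2>&1; then
    if ip -o -4 addr show | has_vip 10.10.10.10; then
        echo "virtual IP is assigned on this node"
    else
        echo "virtual IP is not assigned on this node"
    fi
fi
```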