datenbank:ha-postgresql-cluster
Unterschiede
Hier werden die Unterschiede zwischen zwei Versionen angezeigt.
| Beide Seiten der vorigen RevisionVorhergehende ÜberarbeitungNächste Überarbeitung | Vorhergehende Überarbeitung | ||
| datenbank:ha-postgresql-cluster [2026-05-12 18:49:02] – manfred | datenbank:ha-postgresql-cluster [2026-05-29 10:09:43] (aktuell) – [Tests] manfred | ||
|---|---|---|---|
| Zeile 1: | Zeile 1: | ||
| ====== HA-PostgreSQL-Cluster ====== | ====== HA-PostgreSQL-Cluster ====== | ||
| + | |||
| + | siehe auch: **[[:: | ||
| Zeile 6: | Zeile 8: | ||
| //Es soll ein hochverfügbarer PostgreSQL‑Cluster mit Patroni, EtcE, HAProxy und KeepaliveD auf drei Systemen installiert werden.// | //Es soll ein hochverfügbarer PostgreSQL‑Cluster mit Patroni, EtcE, HAProxy und KeepaliveD auf drei Systemen installiert werden.// | ||
| - | Wichtige Unterschiede zwischen diesem HA-PostgreSQL-Cluster mit 3 Systemen, gegenüber einem HA-PostgreSQL-Cluster mit 5 Systemen sind, dass hier alle Pakete auf allen 3 Systemen installiert werden müssen. Und weil PostgreSQL und HAProxy auf dem selben Host laufen, muß der Datenbank-Zugriff über den Cluster-Zugang (HAProxy) auf einen anderen Port laufen. In diesem Fall laufen die PostgreSQL-Instanzen | + | Wichtige Unterschiede zwischen diesem HA-PostgreSQL-Cluster mit 3 Systemen, gegenüber einem HA-PostgreSQL-Cluster mit 5 Systemen sind, dass hier alle Pakete auf allen 3 Systemen installiert werden müssen. Und weil PostgreSQL und HAProxy auf dem selben Host laufen, muß der Datenbank-Zugriff über den Cluster-Zugang (HAProxy) auf einen anderen Port laufen. |
| - | Weiterhin gibt es in diesem Aufbau noch einen schreibgeschützten Port ('' | + | In diesem Fall laufen die PostgreSQL-Instanzen |
| In einem HA-PostgreSQL-Cluster mit 5 Systemen laufen EtcE, HAProxy und KeepaliveD auf 2 separate Systeme und dann können PostgreSQL und HAProxy auf dem gleichen Port laufen. | In einem HA-PostgreSQL-Cluster mit 5 Systemen laufen EtcE, HAProxy und KeepaliveD auf 2 separate Systeme und dann können PostgreSQL und HAProxy auf dem gleichen Port laufen. | ||
| Zeile 19: | Zeile 21: | ||
| > apt install vim screen mc csync2 links python3 python3-psycopg2 jq curl wget net-tools etcd-server etcd-client haproxy keepalived postgresql postgresql-contrib patroni | > apt install vim screen mc csync2 links python3 python3-psycopg2 jq curl wget net-tools etcd-server etcd-client haproxy keepalived postgresql postgresql-contrib patroni | ||
| > echo " | > echo " | ||
| + | </ | ||
| + | |||
| + | |||
| + | ==== PG-DataDir mit ZFS ==== | ||
| + | |||
| + | <code bash DataDir im RAID-1> | ||
| + | > apt install zfsutils-linux | ||
| + | |||
| + | > zpool create -m / | ||
| + | > zpool list | ||
| + | NAME | ||
| + | pg_datadir | ||
| + | |||
| + | > zpool status | ||
| + | pool: pg_datadir | ||
| + | | ||
| + | config: | ||
| + | |||
| + | NAME | ||
| + | pg_datadir | ||
| + | mirror-0 | ||
| + | nvme2n1 | ||
| + | nvme3n1 | ||
| + | |||
| + | errors: No known data errors | ||
| + | |||
| + | > zfs set compression=lz4 | ||
| + | > zfs set recordsize=16K | ||
| + | > zfs set logbias=latency | ||
| + | > zfs set atime=off | ||
| + | |||
| + | > mkdir -p / | ||
| + | > chown -R postgres: | ||
| + | > chmod 700 / | ||
| + | |||
| + | > mount | grep postgresql | ||
| + | |||
| + | > zfs get mountpoint, | ||
| + | NAME PROPERTY | ||
| + | pg_datadir | ||
| + | pg_datadir | ||
| + | pg_datadir | ||
| + | pg_datadir | ||
| + | pg_datadir | ||
| + | |||
| + | > df -h | ||
| + | Filesystem | ||
| + | ... | ||
| + | pg_datadir | ||
| </ | </ | ||
| Zeile 29: | Zeile 80: | ||
| group Loadbalancer #group name, we can have multiple groups | group Loadbalancer #group name, we can have multiple groups | ||
| { | { | ||
| - | host pg-knoten-01; #master server | + | host FRA2A-MEDOS04; #master server |
| - | host pg-knoten-02; #slave server | + | host FRA2B-MEDOS04; #slave server |
| - | host pg-knoten-03; #slave server | + | host FRA2C-MEDOS04; #slave server |
| key / | key / | ||
| Zeile 138: | Zeile 189: | ||
| <code text Starten> | <code text Starten> | ||
| - | systemctl restart etcd | + | > systemctl restart etcd |
| - | systemctl enable etcd | + | > systemctl enable etcd |
| </ | </ | ||
| Zeile 152: | Zeile 203: | ||
| === Hinweis === | === Hinweis === | ||
| - | **//Solange der Cluster unhealthy ist, sind Verwaltungsoperationen wie etcdctl member add absichtlich verboten.// | + | **//Solange der Cluster unhealthy ist, sind Verwaltungsoperationen wie '' |
| <code bash> | <code bash> | ||
| Zeile 290: | Zeile 341: | ||
| > service haproxy restart | > service haproxy restart | ||
| ... | ... | ||
| - | May 08 17: | + | May 08 17: |
| - | May 08 17: | + | May 08 17: |
| - | May 08 17: | + | May 08 17: |
| - | May 08 17: | + | May 08 17: |
| - | May 08 17: | + | May 08 17: |
| - | May 08 17: | + | May 08 17: |
| > ss -antp | grep -F LISTEN | grep -F haproxy | > ss -antp | grep -F LISTEN | grep -F haproxy | ||
| Zeile 377: | Zeile 428: | ||
| vrrp_no_swap | vrrp_no_swap | ||
| } | } | ||
| - | + | ||
| vrrp_script chk_dienst_vip { | vrrp_script chk_dienst_vip { | ||
| - | script "/ | + | |
| interval 2 | interval 2 | ||
| - | weight | + | weight |
| - | fall 2 | + | fall 2 |
| } | } | ||
| - | + | ||
| vrrp_instance VIP { | vrrp_instance VIP { | ||
| - | interface bond2 | + | |
| - | | + | |
| virtual_router_id 203 | virtual_router_id 203 | ||
| + | state BACKUP | ||
| priority 100 | priority 100 | ||
| + | #nopreempt | ||
| advert_int 1 | advert_int 1 | ||
| - | + | ||
| track_script { | track_script { | ||
| chk_dienst_vip | chk_dienst_vip | ||
| } | } | ||
| - | + | ||
| authentication { | authentication { | ||
| auth_type PASS | auth_type PASS | ||
| auth_pass RaBoo9as | auth_pass RaBoo9as | ||
| } | } | ||
| - | + | ||
| virtual_ipaddress { | virtual_ipaddress { | ||
| 10.145.43.30/ | 10.145.43.30/ | ||
| } | } | ||
| - | + | ||
| - | unicast_src_ip 10.145.43.31 | + | # |
| - | unicast_peer { | + | |
| - | # | + | |
| - | 10.145.43.32 | + | unicast_peer { |
| - | 10.145.43.33 | + | # |
| - | } | + | 10.145.43.32 |
| + | 10.145.43.33 | ||
| + | } | ||
| } | } | ||
| </ | </ | ||
| Zeile 421: | Zeile 476: | ||
| vrrp_no_swap | vrrp_no_swap | ||
| } | } | ||
| - | + | ||
| vrrp_script chk_dienst_vip { | vrrp_script chk_dienst_vip { | ||
| - | script "/ | + | |
| interval 2 | interval 2 | ||
| - | weight | + | weight |
| - | fall 2 | + | fall 2 |
| } | } | ||
| - | + | ||
| vrrp_instance VIP { | vrrp_instance VIP { | ||
| - | interface bond2 | + | |
| - | | + | |
| virtual_router_id 203 | virtual_router_id 203 | ||
| + | state BACKUP | ||
| priority 100 | priority 100 | ||
| + | #nopreempt | ||
| advert_int 1 | advert_int 1 | ||
| - | + | ||
| track_script { | track_script { | ||
| chk_dienst_vip | chk_dienst_vip | ||
| } | } | ||
| - | + | ||
| authentication { | authentication { | ||
| auth_type PASS | auth_type PASS | ||
| auth_pass RaBoo9as | auth_pass RaBoo9as | ||
| } | } | ||
| - | + | ||
| virtual_ipaddress { | virtual_ipaddress { | ||
| 10.145.43.30/ | 10.145.43.30/ | ||
| } | } | ||
| - | + | ||
| - | unicast_src_ip 10.145.43.32 | + | # |
| - | unicast_peer { | + | |
| - | 10.145.43.31 | + | |
| - | # | + | unicast_peer { |
| - | 10.145.43.33 | + | 10.145.43.31 |
| - | } | + | # |
| + | 10.145.43.33 | ||
| + | } | ||
| } | } | ||
| </ | </ | ||
| Zeile 465: | Zeile 524: | ||
| vrrp_no_swap | vrrp_no_swap | ||
| } | } | ||
| - | + | ||
| vrrp_script chk_dienst_vip { | vrrp_script chk_dienst_vip { | ||
| - | script "/ | + | |
| interval 2 | interval 2 | ||
| - | weight | + | weight |
| - | fall 2 | + | fall 2 |
| } | } | ||
| - | + | ||
| vrrp_instance VIP { | vrrp_instance VIP { | ||
| - | interface bond2 | + | |
| + | interface vlan203 | ||
| + | virtual_router_id 203 | ||
| state BACKUP | state BACKUP | ||
| - | virtual_router_id 203 | ||
| priority 100 | priority 100 | ||
| + | #nopreempt | ||
| advert_int 1 | advert_int 1 | ||
| - | + | ||
| track_script { | track_script { | ||
| chk_dienst_vip | chk_dienst_vip | ||
| } | } | ||
| - | + | ||
| authentication { | authentication { | ||
| auth_type PASS | auth_type PASS | ||
| auth_pass RaBoo9as | auth_pass RaBoo9as | ||
| } | } | ||
| - | + | ||
| virtual_ipaddress { | virtual_ipaddress { | ||
| 10.145.43.30/ | 10.145.43.30/ | ||
| } | } | ||
| - | |||
| - | unicast_src_ip 10.145.43.33 | ||
| - | unicast_peer { | ||
| - | 10.145.43.31 | ||
| - | 10.145.43.32 | ||
| - | # | ||
| - | } | ||
| - | } | ||
| - | </ | ||
| - | <file bash / | + | |
| - | #!/ | + | |
| - | + | ||
| - | # | + | |
| - | # Dieses Skript prüft den Zugriff auf die einzelnen DB-Knoten. | + | |
| - | # Es wird erst ein Fehler ausgegeben, wenn kein DB-Knoten mehr erreichbar ist. | + | |
| - | # | + | |
| - | + | ||
| - | ### Wenn PostgreSQL+Patroni nicht mit ETCD, HAProxy und KeepaliveD auf dem selben Host liegen | + | |
| - | # | + | |
| - | #do | + | |
| - | # echo " | + | |
| - | # | + | |
| - | + | ||
| - | ### Nur wenn PostgreSQL+Patroni, | + | |
| - | AUSGABE=" | + | |
| - | # Test | + | unicast_src_ip 10.145.43.33 |
| - | #echo " | + | |
| - | + | | |
| - | if [ x = " | + | |
| - | echo "CRIT: kein DB-Knoten ist erreichbar" | + | # |
| - | | + | |
| - | else | + | } |
| - | echo "OK: ${AUSGABE}" | tr -s ' | + | |
| - | exit 0 | + | |
| - | fi | + | |
| </ | </ | ||
| Zeile 535: | Zeile 569: | ||
| # | # | ||
| - | # Dieses Skript prüft | + | # Dieses Skript prüft, |
| - | # Es wird erst ein Fehler ausgegeben, wenn kein DB-Knoten mehr erreichbar ist. | + | |
| # | # | ||
| - | HA_PORT=" | + | # Prüfe, ob HAProxy läuft und erreichbar ist |
| - | SS_AUSGABE="$(ss -antp | grep -F LISTEN | grep -F " | + | HOST="127.0.0.1" |
| - | HA_AUSGABE=" | + | PORT=5433 |
| - | if [ x = "x${SS_AUSGABE}" ] ; then | + | # timeout wichtig, damit Keepalived nicht hängt |
| - | echo "CRIT: Port nicht offen" | + | timeout 1 bash -c "</ |
| - | | + | |
| + | if [ $? -eq 0 ]; then | ||
| + | exit 0 # OK → VIP behalten | ||
| else | else | ||
| - | if [ 3 -eq " | + | |
| - | echo " | + | |
| - | exit 0 | + | |
| - | else | + | |
| - | echo "CRIT: haproxy antwortet nicht richtig" | + | |
| - | exit 2 | + | |
| - | fi | + | |
| fi | fi | ||
| </ | </ | ||
| Zeile 564: | Zeile 593: | ||
| <code text VIP testen> | <code text VIP testen> | ||
| > ping 10.145.43.30 | > ping 10.145.43.30 | ||
| + | </ | ||
| + | |||
| + | |||
| + | === VRRP-Kommunikation === | ||
| + | |||
| + | <code bash> | ||
| + | > # tcpdump -i vlan203 proto 112 -n | grep -F 'vrid 203' | ||
| + | tcpdump: verbose output suppressed, use -v[v]... for full protocol decode | ||
| + | listening on vlan203, link-type EN10MB (Ethernet), snapshot length 262144 bytes | ||
| + | 12: | ||
| + | 12: | ||
| + | 12: | ||
| + | 12: | ||
| </ | </ | ||
| Zeile 578: | Zeile 620: | ||
| <code bash Netzwerkzugriffe erlauben> | <code bash Netzwerkzugriffe erlauben> | ||
| - | > echo " | + | > echo " |
| + | > echo " | ||
| + | > echo " | ||
| > echo " | > echo " | ||
| > echo " | > echo " | ||
| Zeile 724: | Zeile 768: | ||
| superuser: | superuser: | ||
| username: postgres | username: postgres | ||
| - | password: | + | password: |
| replication: | replication: | ||
| username: repl | username: repl | ||
| - | password: | + | password: |
| pg_hba: | pg_hba: | ||
| - host replication all 0.0.0.0/0 scram-sha-256 | - host replication all 0.0.0.0/0 scram-sha-256 | ||
| + | - host replication all ::/0 scram-sha-256 | ||
| - host all all 0.0.0.0/0 scram-sha-256 | - host all all 0.0.0.0/0 scram-sha-256 | ||
| - host all all ::/0 scram-sha-256 | - host all all ::/0 scram-sha-256 | ||
| Zeile 781: | Zeile 826: | ||
| superuser: | superuser: | ||
| username: postgres | username: postgres | ||
| - | password: | + | password: |
| replication: | replication: | ||
| username: repl | username: repl | ||
| - | password: | + | password: |
| pg_hba: | pg_hba: | ||
| - host replication all 0.0.0.0/0 scram-sha-256 | - host replication all 0.0.0.0/0 scram-sha-256 | ||
| + | - host replication all ::/0 scram-sha-256 | ||
| - host all all 0.0.0.0/0 scram-sha-256 | - host all all 0.0.0.0/0 scram-sha-256 | ||
| - host all all ::/0 scram-sha-256 | - host all all ::/0 scram-sha-256 | ||
| Zeile 838: | Zeile 884: | ||
| superuser: | superuser: | ||
| username: postgres | username: postgres | ||
| - | password: | + | password: |
| replication: | replication: | ||
| username: repl | username: repl | ||
| - | password: | + | password: |
| pg_hba: | pg_hba: | ||
| - host replication all 0.0.0.0/0 scram-sha-256 | - host replication all 0.0.0.0/0 scram-sha-256 | ||
| + | - host replication all ::/0 scram-sha-256 | ||
| - host all all 0.0.0.0/0 scram-sha-256 | - host all all 0.0.0.0/0 scram-sha-256 | ||
| - host all all ::/0 scram-sha-256 | - host all all ::/0 scram-sha-256 | ||
| Zeile 854: | Zeile 901: | ||
| > systemctl disable postgresql | > systemctl disable postgresql | ||
| > cp / | > cp / | ||
| + | > cp / | ||
| > mkdir / | > mkdir / | ||
| Zeile 971: | Zeile 1019: | ||
| | Member | | Member | ||
| +--------------+--------------+---------+---------+----+-----------+ | +--------------+--------------+---------+---------+----+-----------+ | ||
| - | | pg-knoten-01 | pg-knoten-01 | Replica | running | | | + | | pg-knoten-01 | pg-knoten-03 | Replica | running | | |
| +--------------+--------------+---------+---------+----+-----------+ | +--------------+--------------+---------+---------+----+-----------+ | ||
| </ | </ | ||
| Zeile 978: | Zeile 1026: | ||
| > patronictl -c / | > patronictl -c / | ||
| Current cluster topology | Current cluster topology | ||
| - | + Cluster: pgcluster (7637152973093863500) | + | + Cluster: pgcluster (7637152973093863500) -------+----+-----------+ |
| | Member | | Member | ||
| +--------------+--------------+---------+---------+----+-----------+ | +--------------+--------------+---------+---------+----+-----------+ | ||
| - | | pg-knoten-01 | pg-knoten-01 | Replica | running | | | + | | pg-knoten-01 | pg-knoten-03 | Replica | running | | |
| +--------------+--------------+---------+---------+----+-----------+ | +--------------+--------------+---------+---------+----+-----------+ | ||
| Candidate [' | Candidate [' | ||
| Zeile 1009: | Zeile 1057: | ||
| Wenn das so aussieht, dann kann von den Replicas nicht gelesen werden. | Wenn das so aussieht, dann kann von den Replicas nicht gelesen werden. | ||
| - | Um Problem | + | Um das Problem |
| Zeile 1059: | Zeile 1107: | ||
| pg3 UP -1 | pg3 UP -1 | ||
| BACKEND UP -1 | BACKEND UP -1 | ||
| + | </ | ||
| + | |||
| + | |||
| + | === Probleme nach kompletten Cluster-Neustart === | ||
| + | |||
| + | <code bash IST-Zustand> | ||
| + | > patronictl -c / | ||
| + | + Cluster: pgcluster (7637598528541987072) -------+----+-----------+ | ||
| + | | Member | ||
| + | +--------------+--------------+---------+---------+----+-----------+ | ||
| + | | pg-knoten-01 | pg-knoten-01 | Replica | running | 13 | 217 | | ||
| + | | pg-knoten-02 | pg-knoten-02 | Replica | running | 13 | 217 | | ||
| + | | pg-knoten-03 | pg-knoten-03 | Replica | running | 13 | 217 | | ||
| + | +--------------+--------------+---------+---------+----+-----------+ | ||
| + | </ | ||
| + | |||
| + | <code bash einen zum Leader bestimmen> | ||
| + | > patronictl -c / | ||
| + | + Cluster: pgcluster (7637598528541987072) -------+----+-----------+ | ||
| + | | Member | ||
| + | +--------------+--------------+---------+---------+----+-----------+ | ||
| + | | pg-knoten-01 | pg-knoten-01 | Replica | running | 13 | 217 | | ||
| + | | pg-knoten-02 | pg-knoten-02 | Replica | running | 13 | 217 | | ||
| + | | pg-knoten-03 | pg-knoten-03 | Replica | running | 13 | 217 | | ||
| + | +--------------+--------------+---------+---------+----+-----------+ | ||
| + | Candidate [' | ||
| + | Are you sure you want to failover cluster pgcluster? [y/N]: y | ||
| + | 2026-05-13 13: | ||
| + | + Cluster: pgcluster (7637598528541987072) -------+----+-----------+ | ||
| + | | Member | ||
| + | +--------------+--------------+---------+---------+----+-----------+ | ||
| + | | pg-knoten-01 | pg-knoten-01 | Leader | ||
| + | | pg-knoten-02 | pg-knoten-02 | Replica | running | 13 | 217 | | ||
| + | | pg-knoten-03 | pg-knoten-03 | Replica | running | 13 | 217 | | ||
| + | +--------------+--------------+---------+---------+----+-----------+ | ||
| + | |||
| + | > patronictl -c / | ||
| + | + Cluster: pgcluster (7637598528541987072) ---------+----+-----------+ | ||
| + | | Member | ||
| + | +--------------+--------------+---------+-----------+----+-----------+ | ||
| + | | pg-knoten-01 | pg-knoten-01 | Leader | ||
| + | | pg-knoten-02 | pg-knoten-02 | Replica | streaming | 14 | 0 | | ||
| + | | pg-knoten-03 | pg-knoten-03 | Replica | streaming | 14 | 0 | | ||
| + | +--------------+--------------+---------+-----------+----+-----------+ | ||
| </ | </ | ||
| Zeile 1227: | Zeile 1319: | ||
| //Immer erst mit Dirk Hartmann absprechen!// | //Immer erst mit Dirk Hartmann absprechen!// | ||
| - | <file bash / | + | <code bash Datenbanken und User zeigen> |
| - | # | + | |
| - | + | ||
| - | VERSION=" | + | |
| - | + | ||
| - | # KeepaliveD prüfen | + | |
| - | service keepalived status >/ | + | |
| - | + | ||
| - | # IP + VIP | + | |
| - | ip a | grep -F 'inet 10.145.43.3' | + | |
| - | + | ||
| - | echo "# | + | |
| - | # PostgreSQL Commits | + | |
| - | sudo -u postgres psql -t -c " | + | |
| - | + | ||
| - | # Active Connections | + | |
| - | sudo -u postgres psql -t -c " | + | |
| - | + | ||
| - | # HAProxy abfragen | + | |
| - | echo "# | + | |
| - | #links -ssl.certificates 0 -dump http:// | + | |
| - | curl -s " | + | |
| - | + | ||
| - | # Patroni abfragen | + | |
| - | patronictl -c / | + | |
| - | + | ||
| - | # EtcD abfragen | + | |
| - | etcdctl --endpoints=" | + | |
| - | + | ||
| - | # Patroni abfragen | + | |
| - | #echo "# | + | |
| - | #curl -i http:// | + | |
| - | # | + | |
| - | for HIP in 10.145.43.31 10.145.43.32 10.145.43.33 | + | |
| - | do | + | |
| - | echo "# | + | |
| - | #curl -i http:// | + | |
| - | curl -s http:// | + | |
| - | done | + | |
| - | </ | + | |
| - | + | ||
| - | <code bash Netzwerkzugriff per HAProxy prüfen> | + | |
| > echo " | > echo " | ||
| Password for user postgres: | Password for user postgres: | ||
| Zeile 1290: | Zeile 1341: | ||
| | | ||
| </ | </ | ||
| - | |||
| - | //Wie das beste Vorgehen bei Wartungsarbeiten ist, muß erst noch ermittelt ausgelotet werden.// | ||
| <code bash> | <code bash> | ||
| Zeile 1300: | Zeile 1349: | ||
| > reboot && exit | > reboot && exit | ||
| </ | </ | ||
| + | |||
| ==== Status / Monitoring ==== | ==== Status / Monitoring ==== | ||
| Zeile 1554: | Zeile 1604: | ||
| <code bash> | <code bash> | ||
| - | [root@pg-knoten-03]---------------------------------------------------------------------[0]-[0]-[2026-05-12 14:26:12] | + | [root@FRA2C-MEDOS04]---------------------------------------------------------------------[0]-[0]-[2026-05-12 14:26:12] |
| [~]# patronictl -c / | [~]# patronictl -c / | ||
| Current cluster topology | Current cluster topology | ||
| Zeile 1575: | Zeile 1625: | ||
| +--------------+--------------+---------+---------+----+-----------+ | +--------------+--------------+---------+---------+----+-----------+ | ||
| - | [root@pg-knoten-03]---------------------------------------------------------------------[0]-[0]-[2026-05-12 14:26:55] | + | [root@FRA2C-MEDOS04]---------------------------------------------------------------------[0]-[0]-[2026-05-12 14:26:55] |
| [~]# patronictl -c / | [~]# patronictl -c / | ||
| + Cluster: pgcluster (7637598528541987072) ---------+----+-----------+ | + Cluster: pgcluster (7637598528541987072) ---------+----+-----------+ | ||
| Zeile 1622: | Zeile 1672: | ||
| ===== PostgreSQL-Tuning ===== | ===== PostgreSQL-Tuning ===== | ||
| + | |||
| + | <code bash Daten asynchron einspielen, das geht etwas schneller> | ||
| + | > psql -c "SET synchronous_commit=off;" | ||
| + | </ | ||
| <code text> | <code text> | ||
| Zeile 1641: | Zeile 1695: | ||
| <file bash / | <file bash / | ||
| #!/bin/bash | #!/bin/bash | ||
| + | |||
| + | # | ||
| + | VERSION=" | ||
| + | |||
| + | # | ||
| + | CLUSTER_NAME=" | ||
| ### RAM: 1547643 MB | ### RAM: 1547643 MB | ||
| Zeile 1673: | Zeile 1733: | ||
| " | " | ||
| - | AUTOVACUUM_WORK_MEM=" | + | AUTOVACUUM_WORK_MEM=" |
| - | EFFECTIVE_CACHE_SIZE=" | + | EFFECTIVE_CACHE_SIZE=" |
| - | MAINTENANCE_WORK_MEM=" | + | MAINTENANCE_WORK_MEM=" |
| - | SHARED_BUFFERS=" | + | MAX_CONNECTIONS=" |
| - | WORK_MEM=" | + | SHARED_BUFFERS=" |
| + | WORK_MEM=" | ||
| + | MAX_WAL_SIZE=" | ||
| echo "# empfohlene Werte, die meistens passen: | echo "# empfohlene Werte, die meistens passen: | ||
| - | autovacuum_work_mem: | + | autovacuum_work_mem: |
| effective_cache_size: | effective_cache_size: | ||
| - | maintenance_work_mem: | + | maintenance_work_mem: |
| - | shared_buffers: | + | max_connections: |
| - | work_mem: ${WORK_MEM} (ca. 1% vom RAM; Vorsicht, kann zu OOM-Kill führen!) | + | shared_buffers: |
| + | max_wal_size: | ||
| + | work_mem: | ||
| " | " | ||
| - | </ | ||
| - | <code bash Konfigurationsvorschlag vorher ansehen> | + | ### neu berechnete Werte, direkt setzen |
| - | > /root/bin/PostgreSQL-Tuning.sh | + | ### patronictl -c /etc/patroni/config.yml edit-config pgcluster |
| - | RAM: 1547643 MB | + | echo "# |
| + | echo | ||
| + | jq -n \ | ||
| + | --arg ecs " | ||
| + | --arg mwm " | ||
| + | --arg mc " | ||
| + | --arg sb " | ||
| + | --arg mw " | ||
| + | --arg wm " | ||
| + | '{ | ||
| + | postgresql: { | ||
| + | parameters: { | ||
| + | effective_cache_size: | ||
| + | maintenance_work_mem: | ||
| + | max_connections: | ||
| + | shared_buffers: | ||
| + | max_wal_size: | ||
| + | wal_buffers: | ||
| + | checkpoint_timeout: | ||
| + | work_mem: $wm | ||
| + | } | ||
| + | } | ||
| + | }' | curl -s -X PATCH http:// | ||
| + | -H " | ||
| + | --data-binary @- | ||
| + | echo | ||
| + | echo "# | ||
| - | # IST-Werte (ggf. autovacuum_work_mem = maintenance_work_mem): | + | ### aktuelle Einstellungen anzeigen |
| - | effective_cache_size 4096 MB | + | patronictl |
| - | maintenance_work_mem 64 MB | + | |
| - | max_connections 100 | + | |
| - | shared_buffers 128 MB | + | |
| - | work_mem 4 MB | + | |
| - | # empfohlene Werte, die meistens passen: | + | ### neue Einstellungen aktivieren |
| - | autovacuum_work_mem: | + | ### shared_buffers |
| - | effective_cache_size: | + | service patroni restart |
| - | maintenance_work_mem: | + | </file> |
| - | shared_buffers: 378 GB (ca. 10-25% vom RAM) | + | |
| - | work_mem: 15 GB (ca. 1% vom RAM; Vorsicht, kann zu OOM-Kill führen!) | + | |
| - | </ | + | |
| - | + | ||
| - | <code bash Patroni konfigurieren> | + | |
| - | > patronictl -c / | + | |
| - | loop_wait: 10 | + | |
| - | maximum_lag_on_failover: | + | |
| - | postgresql: | + | |
| - | parameters: | + | |
| - | hot_standby: | + | |
| - | max_replication_slots: | + | |
| - | max_wal_senders: | + | |
| - | wal_level: replica | + | |
| - | effective_cache_size: | + | |
| - | maintenance_work_mem: | + | |
| - | shared_buffers: | + | |
| - | work_mem: 8GB | + | |
| - | use_pg_rewind: | + | |
| - | retry_timeout: | + | |
| - | ttl: 30 | + | |
| - | + | ||
| - | Apply these changes? [y/N]: y | + | |
| - | Configuration changed | + | |
| - | + | ||
| - | + | ||
| - | > service patroni restart | + | |
| - | </code> | + | |
| <code bash RAM: 1547643 MB> | <code bash RAM: 1547643 MB> | ||
| Zeile 1751: | Zeile 1808: | ||
| # IST-Werte (ggf. autovacuum_work_mem = maintenance_work_mem): | # IST-Werte (ggf. autovacuum_work_mem = maintenance_work_mem): | ||
| effective_cache_size 1134 GB | effective_cache_size 1134 GB | ||
| - | maintenance_work_mem 75 GB | + | maintenance_work_mem 151 GB |
| - | max_connections 100 | + | max_connections 150 |
| shared_buffers 378 GB | shared_buffers 378 GB | ||
| - | work_mem 8192 MB | + | work_mem 15 GB |
| # empfohlene Werte, die meistens passen: | # empfohlene Werte, die meistens passen: | ||
| - | autovacuum_work_mem: | + | autovacuum_work_mem: |
| - | effective_cache_size: | + | effective_cache_size: |
| - | maintenance_work_mem: | + | maintenance_work_mem: |
| - | shared_buffers: | + | max_connections: |
| - | work_mem: | + | shared_buffers: |
| + | work_mem: | ||
| + | |||
| + | # | ||
| + | |||
| + | {" | ||
| + | # | ||
| + | loop_wait: 10 | ||
| + | maximum_lag_on_failover: | ||
| + | postgresql: | ||
| + | parameters: | ||
| + | effective_cache_size: | ||
| + | hot_standby: | ||
| + | maintenance_work_mem: | ||
| + | max_connections: | ||
| + | max_replication_slots: | ||
| + | max_wal_senders: | ||
| + | shared_buffers: | ||
| + | wal_level: replica | ||
| + | work_mem: 15GB | ||
| + | use_pg_rewind: | ||
| + | retry_timeout: | ||
| + | ttl: 30 | ||
| </ | </ | ||
| Zeile 1780: | Zeile 1859: | ||
| " | sudo -u postgres psql | " | sudo -u postgres psql | ||
| </ | </ | ||
| + | |||
| + | <code bash> | ||
| + | > / | ||
| + | pid | backend_type | state | ||
| + | -----+--------------+------- | ||
| + | (0 rows) | ||
| + | |||
| + | | ||
| + | -------------------+------- | ||
| + | | ||
| + | | ||
| + | | ||
| + | | ||
| + | | ||
| + | (5 rows) | ||
| + | </ | ||
| + | |||
| + | |||
| + | ===== Tests ===== | ||
| + | |||
| + | <code bash> | ||
| + | > time sudo -u postgres pgbench -i -s 1000 testdb | ||
| + | dropping old tables... | ||
| + | creating tables... | ||
| + | generating data (client-side)... | ||
| + | 100000000 of 100000000 tuples (100%) done (elapsed 85.93 s, remaining 0.00 s) | ||
| + | vacuuming... | ||
| + | creating primary keys... | ||
| + | done in 125.30 s (drop tables 4.04 s, create tables 0.01 s, client-side generate 86.85 s, vacuum 0.84 s, primary keys 33.56 s). | ||
| + | |||
| + | real 2m5.365s | ||
| + | user 0m0.038s | ||
| + | sys 0m0.077s | ||
| + | |||
| + | > sudo -u postgres pgbench -c 128 -j 128 -T 120 -N testdb | ||
| + | pgbench (16.13 (Ubuntu 16.13-0ubuntu0.24.04.1)) | ||
| + | starting vacuum...end. | ||
| + | transaction type: < | ||
| + | scaling factor: 1000 | ||
| + | query mode: simple | ||
| + | number of clients: 128 | ||
| + | number of threads: 128 | ||
| + | maximum number of tries: 1 | ||
| + | duration: 120 s | ||
| + | number of transactions actually processed: 5956017 | ||
| + | number of failed transactions: | ||
| + | latency average = 2.578 ms | ||
| + | initial connection time = 66.949 ms | ||
| + | tps = 49656.884083 (without initial connection time) | ||
| + | |||
| + | > sudo -u postgres pgbench -c 250 -j 250 -T 120 -N testdb | ||
| + | pgbench (16.13 (Ubuntu 16.13-0ubuntu0.24.04.1)) | ||
| + | starting vacuum...end. | ||
| + | transaction type: < | ||
| + | scaling factor: 1000 | ||
| + | query mode: simple | ||
| + | number of clients: 250 | ||
| + | number of threads: 250 | ||
| + | maximum number of tries: 1 | ||
| + | duration: 120 s | ||
| + | number of transactions actually processed: 6463558 | ||
| + | number of failed transactions: | ||
| + | latency average = 4.637 ms | ||
| + | initial connection time = 154.766 ms | ||
| + | tps = 53915.406631 (without initial connection time) | ||
| + | </ | ||
| + | |||
| + | <code bash WAL> | ||
| + | > echo " | ||
| + | | ||
| + | ----------------+--------------- | ||
| + | 0 | 0 | ||
| + | (1 row) | ||
| + | </ | ||
| + | |||
| + | <code bash ZFS I/O> | ||
| + | > zpool iostat -v | ||
| + | | ||
| + | pool | ||
| + | ----------- | ||
| + | pg_datadir | ||
| + | mirror-0 | ||
| + | nvme2n1 | ||
| + | nvme3n1 | ||
| + | ----------- | ||
| + | </ | ||
| + | |||
| + | |||
| + | ==== Testdaten einspielen ==== | ||
| + | |||
| + | <code bash> | ||
| + | > echo "COPY tabelle FROM '/ | ||
| + | |||
| + | oder | ||
| + | |||
| + | > cat / | ||
| + | </ | ||
| + | |||
| + | <code bash Datensätze zählen> | ||
| + | > echo " | ||
| + | </ | ||
| + | |||
| + | |||
| + | ===== Verbindungen SSL/ | ||
| + | |||
| + | |||
| + | ==== Verschlüsselung konfigurieren ==== | ||
| + | |||
| + | <code bash postgresql.conf> | ||
| + | > patronictl -c / | ||
| + | loop_wait: 10 | ||
| + | maximum_lag_on_failover: | ||
| + | postgresql: | ||
| + | parameters: | ||
| + | hot_standby: | ||
| + | max_replication_slots: | ||
| + | max_wal_senders: | ||
| + | wal_level: replica | ||
| + | ssl: on | ||
| + | ssl_ca_file: | ||
| + | ssl_cert_file: | ||
| + | ssl_key_file: | ||
| + | ssl_min_protocol_version: | ||
| + | ssl_ciphers: | ||
| + | ssl_prefer_server_ciphers: | ||
| + | use_pg_rewind: | ||
| + | retry_timeout: | ||
| + | ttl: 30 | ||
| + | </ | ||
| + | |||
| + | |||
| + | ==== Verschlüsselung erzwingen ==== | ||
| + | |||
| + | <code bash pg_hba.conf> | ||
| + | > vim / | ||
| + | ... | ||
| + | pg_hba: | ||
| + | - hostnossl all all 0.0.0.0/0 reject | ||
| + | - hostnossl all all ::/0 reject | ||
| + | - hostssl replication all 0.0.0.0/0 scram-sha-256 | ||
| + | - hostssl replication all ::/0 scram-sha-256 | ||
| + | - hostssl all all 0.0.0.0/0 scram-sha-256 | ||
| + | - hostssl all all ::/0 scram-sha-256 | ||
| + | </ | ||
| + | |||
| + | |||
| + | ==== Verschlüsselung aktivieren ==== | ||
| + | |||
| + | <code bash kompletten Cluster restarten> | ||
| + | > patronictl -c / | ||
| + | </ | ||
| + | |||
| + | |||
| + | ===== temp_tablespaces ===== | ||
| + | |||
| + | <code bash temp_tablespaces-Verzeichnis anlegen> | ||
| + | > mkdir -p / | ||
| + | > chown postgres: | ||
| + | > chmod 0700 / | ||
| + | </ | ||
| + | |||
| + | <code bash temp_tablespaces konfigurieren> | ||
| + | > echo " | ||
| + | > echo " | ||
| + | > echo " | ||
| + | |||
| + | > patronictl -c / | ||
| + | ... | ||
| + | postgresql: | ||
| + | parameters: | ||
| + | temp_file_limit: | ||
| + | temp_tablespaces: | ||
| + | ... | ||
| + | </ | ||
/home/http/wiki/data/attic/datenbank/ha-postgresql-cluster.1778611742.txt · Zuletzt geändert: von manfred
