Contents

GlusterFS (developed by Gluster Inc.)
- Info
- Tests from 2009-11-18
- Tests from 2012
  - Server
  - Client
    - Mounting
  - Tests
    - Server
    - Client
    - Server
  - Peculiarity

GlusterFS (developed by Gluster Inc.)

http://de.wikipedia.org/wiki/GlusterFS

GlusterFS is a distributed file system that presents storage from several servers as a single file system. The servers, also called cluster nodes, form a client-server architecture over TCP/IP. Data on all cluster nodes can be read and written concurrently, and every change to a file is applied on all servers immediately. The file system is mounted through a FUSE kernel module and is supported by POSIX-capable operating systems such as Linux, FreeBSD, OpenSolaris and Mac OS X. Starting a GlusterFS server, however, does not require a kernel module. A machine can be client and server at the same time.

Development of GlusterFS began in mid-2005 by the GlusterOS developer team at Z Research Inc.; a first release of the file system was published in July 2006. GlusterFS is licensed under version 3 of the GPL. The developers offer paid support.

GlusterFS can be used to build a kind of network RAID that gives several machines simultaneous access to a shared file system. It is not subject to limitations such as the two-server maximum of a comparable HA solution like DRBD. GlusterFS is fault tolerant because user data, metadata and namespace can be stored in a distributed fashion. Each additional GlusterFS server raises the maximum data throughput of the file system, so that I/O bandwidths of several GiB per second can be reached.

Moore's law applies to processors, but not to storage media and storage solutions, even though there is just as much demand for larger and faster storage. Often the bottleneck of a server is not its CPU performance but, increasingly, its slow data storage. GlusterFS addresses this by allowing storage to scale out arbitrarily.

Info

http://www.golem.de/0905/67174.html

As of: 2009-05-18

GlusterFS uses existing file systems such as Ext3 or XFS to store its data. Several servers can then be joined into a cluster with GlusterFS. GlusterFS itself is not a kernel module; it is attached in user space through FUSE. As a result it runs on various Unix systems and, above all, is meant to leave the kernel unharmed if it crashes.

The system is modular, so required functions can easily be added. GlusterFS does without the classic fsck and checks the file system in the background instead, is said to cope with several PByte of data, and supports automatic replication and further functions such as access to the data via SCP.

Version 2.0 of the software offers a high-availability option, supports non-blocking socket connections between client and server, and uses a new protocol that is intended to improve performance.

The GlusterFS FUSE module is licensed under the GPLv3.

Tests from 2009-11-18

# http://www.howtoforge.com/high-availability-storage-cluster-with-glusterfs-on-ubuntu
#aptitude install glusterfs-server glusterfs-client sshfs build-essential flex bison byacc wget

### Server
aptitude install glusterfs-server glusterfs-client

vi /etc/glusterfs/glusterfsd.vol
      option directory /home/export

mkdir /home/export
/etc/init.d/glusterfs-server start

tail -1 /var/log/glusterfs/*glusterfsd*.log
[2009-11-18 18:02:13] N [glusterfsd.c:1198:main] glusterfs: Successfully started

netstat -ant
Aktive Internetverbindungen (Server und stehende Verbindungen)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 0.0.0.0:6996            0.0.0.0:*               LISTEN

### Client
#
# The client needs at least one file that is part of the glusterfs-server package.
#
aptitude install glusterfs-server glusterfs-client

vi /etc/glusterfs/glusterfs.vol
      option remote-host 192.168.0.70

mkdir /home/glusterfs
mount -t glusterfs 192.168.0.70 /home/glusterfs
#mount -o direct-io-mode=DISABLE -t glusterfs 192.168.0.70 /home/glusterfs

mount
....
192.168.0.70 on /home/glusterfs type fuse.glusterfs (rw,max_read=131072,allow_other,default_permissions)

netstat -ant | fgrep 6996
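
Instead of reading the `mount` output by eye, the check can be scripted. The following is only a sketch: the function name `is_glusterfs_mount` is made up for this example, and it assumes the Linux `mount` output format shown above (SOURCE on MOUNTPOINT type FSTYPE (OPTIONS)).

```shell
# is_glusterfs_mount MOUNTPOINT   (hypothetical helper, not part of GlusterFS)
# Reads lines in the format printed by `mount` on stdin and succeeds
# if MOUNTPOINT is listed with file system type fuse.glusterfs.
is_glusterfs_mount() {
    awk -v mp="$1" '
        # Expected layout: SOURCE on MOUNTPOINT type FSTYPE (OPTIONS)
        $2 == "on" && $3 == mp && $4 == "type" && $5 == "fuse.glusterfs" { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

Usage: `mount | is_glusterfs_mount /home/glusterfs && echo "GlusterFS is mounted"`.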

http://www.gluster.com/community/documentation/index.php/Setting_up_AFR_on_two_servers_with_client_side_replication

1. Remove extended attributes
aptitude install attr
setfattr -x trusted.glusterfs.version /home/export
setfattr -x trusted.glusterfs.createtime /home/export

http://www.gluster.com/community/documentation/index.php/Translators/cluster/replicate
http://www.gluster.com/community/documentation/index.php/Understanding_AFR_Translator
http://www.mail-archive.com/gluster-users@gluster.org/msg00489.html
      Q: I am wondering what the difference between cluster/afr and cluster/replicate?
      A: AFR was the earlier name. We changed it to a more user-friendly "replicate".
      Q: Which backend filesystem should I use for AFR?
      A: You can use any backend filesystem that supports extended attributes.
#------------------------------------------------------------------------------#
useradd glusterfs
#------------------------------------------------------------------------------#
Very simple, without AFR (replication has to be triggered manually)
===================================================================
http://gluster.org/pipermail/gluster-users/20090114/001402.html

         storage/posix (volume brick-ds)
               |
        features/locks (volume brick)
               |
        protocol/server (subvolumes brick)
              | |
              | |  network (tcp, infiniband)
              | |
        protocol/client (brick)
               |
       cluster/replicate
               |
     performance/read-ahead
               |
      performance/io-cache
               |
    performance/write-behind
               |
              /mnt

#------------------------------------------------------------------------------#
With AFR (Automatic File Replication)
=====================================

     storage/posix (volume brick-ds)             storage/posix (volume ns)
           |                                           |
    features/locks (volume brick)                      |
           |                                           |
    protocol/server (subvolumes brick)                 |
           |                                           |
           +--------------------+-+--------------------+
                                | |
                                | |
                                | |  network (tcp, infiniband)
                                | |
                                | |
               +----------------+-+-------+---------------------------------+
               |                          |                                 |
        protocol/client (brick)    protocol/client (brick)           protocol/client (remote-subvolume ns)
               |                          |                                 |
       cluster/replicate             cluster/afr (subvolumes rm1 rm2)    cluster/afr (subvolumes ns1 ns2)
               |                          |                                 |
     performance/read-ahead               +----------------+----------------+
               |                                           |
      performance/io-cache                           cluster/unify
               |
    performance/write-behind
               |
              /mnt

#------------------------------------------------------------------------------#

http://www.gluster.com/community/documentation/index.php/GlusterFS_User_Guide

http://www.gluster.com/community/documentation/index.php/Storage_Server_Installation_and_Configuration#Creating_Replicated_Volumes
http://www.gluster.com/community/documentation/index.php/Client_Installation_and_Configuration


http://www.gluster.com/community/documentation/index.php/GlusterFS_2.0.6
http://www.gluster.com/community/documentation/index.php/GlusterFS_2.0.6#Configuration
http://www.gluster.com/community/documentation/index.php/Mixing_DHT_and_AFR
http://www.gluster.com/community/documentation/index.php/Mounting_a_GlusterFS_Volume


ssh root@10.10.10.1 'mkdir -p /home/export /home/GlusterFS-NS'
ssh root@10.10.10.2 'mkdir -p /home/export /home/GlusterFS-NS'



echo "
#####################################
###  GlusterFS Server Volume File  ##
#####################################

#### CONFIG FILE RULES:
### \"#\" is comment character.
### - Config file is case sensitive
### - Options within a volume block can be in any order.
### - Spaces or tabs are used as delimitter within a line. 
### - Multiple values to options will be : delimitted.
### - Each option should end within a line.
### - Missing or commented fields will assume default values.
### - Blank/commented lines are allowed.
### - Sub-volumes should already be defined above before referring.

### Export volume \"brick\" with the contents of \"/home/export\" directory.
volume brick-ds
  type storage/posix                                    # POSIX FS translator
  option directory /home/export                         # Export this directory
#  option o-direct enable       # (default: disable) boolean type only
#  option export-statfs-size no # (default: yes)     boolean type only
#  option mandate-attribute off # (default: on)      boolean type only
#  option span-devices 8        # (default: 0)       integer value
#  option background-unlink yes # (default: no)      boolean type
end-volume

### Add network serving capability to above brick.
##volume server
##  type protocol/server
##  option transport-type tcp
# option transport-type unix
# option transport-type ib-sdp
# option transport.socket.bind-address 192.168.1.10     # Default is to listen on all interfaces
# option transport.socket.listen-port 6996              # Default is 6996

# option transport-type ib-verbs
# option transport.ib-verbs.bind-address 192.168.1.10   # Default is to listen on all interfaces
# option transport.ib-verbs.listen-port 6996            # Default is 6996
# option transport.ib-verbs.work-request-send-size  131072
# option transport.ib-verbs.work-request-send-count 64
# option transport.ib-verbs.work-request-recv-size  131072
# option transport.ib-verbs.work-request-recv-count 64

# option client-volume-filename /etc/glusterfs/glusterfs-client.vol
##  subvolumes brick
# NOTE: Access to any volume through protocol/server is denied by
# default. You need to explicitly grant access through the \"auth\"
# option.
##  option auth.addr.brick.allow * # Allow access to \"brick\" volume
##end-volume


# POSIX-locks
volume brick
  type features/locks                   # features/posix-locks gives same error
  # The write-behind xlator does not cache anything for files which have
  # mandatory locking enabled, to avoid incoherence.
#  option mandatory on                  # on or off, same problem
  subvolumes brick-ds
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  subvolumes brick
  option auth.ip.brick.allow 127.0.0.1,10.10.10.*
end-volume

" | ssh root@10.10.10.1 'cat > /etc/glusterfs/glusterfsd.vol;/etc/init.d/glusterfs-server restart'
... ssh root@10.10.10.2 'cat > /etc/glusterfs/glusterfsd.vol;/etc/init.d/glusterfs-server restart'
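
The `echo "..."` pipeline above has to cope with double quotes inside the volume file text. A here-document avoids that. The following is a minimal sketch, not the method used in these tests: the function name `render_server_volfile` and its two parameters are invented for this example, and the volume layout is simply the one from the file above.

```shell
# render_server_volfile EXPORT_DIR ALLOW_PATTERN   (hypothetical helper)
# Prints a minimal server volume file with the same three volumes as above.
render_server_volfile() {
    export_dir=$1
    allow=$2
    # Unquoted delimiter: $export_dir and $allow are expanded, while
    # double quotes and "*" inside the text stay literal (no globbing).
    cat <<EOF
### Export volume "brick" with the contents of "$export_dir" directory.
volume brick-ds
  type storage/posix
  option directory $export_dir
end-volume

volume brick
  type features/locks
  subvolumes brick-ds
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  subvolumes brick
  option auth.ip.brick.allow $allow
end-volume
EOF
}
```

Usage, with the addresses from this test: `render_server_volfile /home/export '127.0.0.1,10.10.10.*' | ssh root@10.10.10.1 'cat > /etc/glusterfs/glusterfsd.vol'`.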


# An area exported like this can be mounted as follows (like NFS):
# mount -t glusterfs 10.10.10.1 /mnt/glusterfs
# netstat -ant | fgrep 6996
#
# vi /etc/fstab
# 10.10.10.1  /home/glusterfs glusterfs       defaults        0 0
#




echo "
#####################################
###  GlusterFS Client Volume File  ##
#####################################

#### CONFIG FILE RULES:
### \"#\" is comment character.
### - Config file is case sensitive
### - Options within a volume block can be in any order.
### - Spaces or tabs are used as delimitter within a line. 
### - Each option should end within a line.
### - Missing or commented fields will assume default values.
### - Blank/commented lines are allowed.
### - Sub-volumes should already be defined above before referring.

### Add client feature and attach to remote subvolume
volume client1
  type protocol/client
  option transport-type tcp/client
# option transport-type unix
# option transport-type ib-sdp
#  option remote-host 127.0.0.1         # IP address of the remote brick
  option remote-host 10.10.10.1
# option transport.socket.remote-port 6996              # default server port is 6996

# option transport-type ib-verbs
# option transport.ib-verbs.remote-port 6996              # default server port is 6996
# option transport.ib-verbs.work-request-send-size  1048576
# option transport.ib-verbs.work-request-send-count 16
# option transport.ib-verbs.work-request-recv-size  1048576
# option transport.ib-verbs.work-request-recv-count 16

# option transport-timeout 30          # seconds to wait for a reply
                                       # from server for each request
  option remote-subvolume brick        # name of the remote volume
end-volume

#
# Part 2 of the mirror
#
volume client2
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.10.10.2
  option remote-subvolume brick
end-volume

#
# AFR (Automatic File Replication) made no difference in my test.
# Even without AFR, the files were replicated by a 'find' or 'ls -lR'.
# With AFR, however, the data was not replicated without 'find' either.
# Only from version 2.1 on can AFR replicate deleted directories
# recursively, i.e. remove the directory that still exists on a server
# that missed the deletion of a directory tree.
# Without AFR that does not work anyway, so what is AFR good for until then?
# It is therefore only useful from version 2.1 onwards.
#
volume brick-replicate
  type cluster/replicate
  subvolumes client1 client2
  option data-self-heal on
  option metadata-self-heal on
  option entry-self-heal on
end-volume

#
# Add readahead feature
#
volume readahead
  type performance/read-ahead
  option page-size 1MB          # unit in bytes
  option page-count 2           # cache per file  = (page-count x page-size)
  subvolumes brick-replicate
end-volume

#
# Cache
#
volume iocache
  type performance/io-cache
  option cache-size 512MB
  option page-size 256KB
  option page-count 2
  subvolumes readahead
end-volume

#
# Write-behind (writes happen in the background)
#
volume writebehind
  type performance/write-behind
  option aggregate-size 1MB
  option window-size 2MB
  option flush-behind off
  subvolumes iocache
end-volume
" >> /etc/glusterfs/glusterfs.vol


# An area exported like this can be mounted as follows (like NFS):
# mount -t glusterfs /etc/glusterfs/glusterfs.vol /mnt/glusterfs
# netstat -ant | fgrep 6996
#
# vi /etc/fstab
# /etc/glusterfs/glusterfs.vol        /home/glusterfs glusterfs       defaults        0 0
#
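
The rule "Sub-volumes should already be defined above before referring" from the file header can be checked mechanically before mounting. This is a rough sketch under simple assumptions: one statement per line, `#` starts a comment, and the helper name `check_subvolumes` is invented here (GlusterFS ships no such tool as far as these notes are concerned).

```shell
# check_subvolumes < VOLFILE   (hypothetical helper)
# Succeeds if every name after "subvolumes" was introduced by an earlier
# "volume NAME" line; otherwise prints each undefined reference and fails.
check_subvolumes() {
    awk '
        { sub(/#.*/, "") }                   # strip comments, fields are re-split
        $1 == "volume" { defined[$2] = 1 }   # remember every defined volume
        $1 == "subvolumes" {
            for (i = 2; i <= NF; i++)
                if (!($i in defined)) {
                    printf "line %d: subvolume %s is not defined above\n", NR, $i
                    bad = 1
                }
        }
        END { exit (bad ? 1 : 0) }
    '
}
```

Usage: `check_subvolumes < /etc/glusterfs/glusterfs.vol && mount -t glusterfs /etc/glusterfs/glusterfs.vol /mnt/glusterfs`.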

#------------------------------------------------------------------------------#

# GlusterFS server log file
less /var/log/glusterfs/-etc-glusterfs-glusterfsd.vol.log

# GlusterFS client log file
less /var/log/glusterfs/home-glusterfs.log

#------------------------------------------------------------------------------#

aptitude install samba smbfs
smbpasswd -a glusterfs


vi /etc/samba/smb.conf

[global]
   workgroup = GLUSTER
   server string = %h server (Windoofs)
   wins support = no
   dns proxy = no
   hosts allow = 127. 128.32. 192.168.
   load printers = no
   socket options = TCP_NODELAY
   disable netbios = yes
   invalid users = root
   strict sync = yes
   sync always = yes
   unix extensions = yes
   case sensitive = yes
   log file = /var/log/samba/log.%m
   max log size = 1000
   syslog = 0
   panic action = /usr/share/samba/panic-action %d
   security = user
   encrypt passwords = true
   passdb backend = tdbsam
   obey pam restrictions = yes
   unix password sync = no
   passwd program = /usr/bin/passwd %u
   passwd chat = *Enter\snew\s*\spassword:* %n\n *Retype\snew\s*\spassword:* %n\n *password\supdated\ssuccessfully* .
   pam password change = yes
   map to guest = bad user
   usershare allow guests = no

[homes]
   comment = Home Directories
   browseable = no
   create mask = 0700
   directory mask = 0700
   valid users = %S
   writable = yes

#------------------------------------------------------------------------------#

id
uid=1001 gid=1000


vi /etc/fstab
//192.168.0.70/glusterfs        /glusterfs      cifs    noauto,rw,sfu,noperm,iocharset=utf8,uid=1001,gid=1000,credentials=/root/.mount.glusterfs        0 0

#------------------------------------------------------------------------------#
eth0: crossover cable to the second Gluster server in the RAID-1 setup
eth1: Samba share of the GlusterFS (connection to the outside)

 IPTraf
┌ Iface ─────── Total ───────── IP ──────────── NonIP ───────── BadIP ───────── Activity ───────────────┐
│ lo             2353            2353         0               0               3018,80 kbits/sec       │
│ eth0                   1083            1083         0               0               1555,40 kbits/sec       │
│ eth1                   2550            2550         0               0               2778,80 kbits/sec       │
 lo  : 100%
 eth0:  51%
 eth1:  92%

 IPTraf
┌ Iface ─────── Total ───────── IP ──────────── NonIP ───────── BadIP ───────── Activity ───────────────┐
│ lo            40212           40212         0               0               3320,00 kbits/sec       │
│ eth0                  18458           18458         0               0               1710,80 kbits/sec       │
│ eth1                  47601           47601         0               0               3043,40 kbits/sec       │
│                                                                                                     │
 lo  : 100%
 eth0:  52%
 eth1:  92%

 IPTraf
┌ Iface ─────── Total ───────── IP ──────────── NonIP ───────── BadIP ───────── Activity ───────────────┐
│ lo            44178           44178         0               0               3503,00 kbits/sec       │
│ eth0                  20276           20276         0               0               1804,00 kbits/sec       │
│ eth1                  51906           51906         0               0               3155,00 kbits/sec       │
│                                                                                                     │
 lo  : 100%
 eth0:  51%
 eth1:  90%

 IPTraf
┌ Iface ─────── Total ───────── IP ──────────── NonIP ───────── BadIP ───────── Activity ───────────────┐
│ lo          1430020         1430020         0               0               2197,80 kbits/sec       │
│ eth0                 619967          619967         0               0               1103,40 kbits/sec       │
│ eth1                1863495         1863495         0               0               2315,40 kbits/sec       │
│                                                                                                     │
A Squid proxy is also set up on this machine.
When data passes through the proxy, you can clearly see here how the overall traffic drops.
 lo  : 100%
 eth0:  50%
 eth1: 105%



Automatic File Replication (AFR)
================================

 IPTraf
┌ Iface ─────── Total ───────── IP ──────────── NonIP ───────── BadIP ───────── Activity ───────────────┐
│ lo          1028871         1028871         0               0               8332,40 kbits/sec       │
│ eth0                 517718          517718         0               0               5702,40 kbits/sec       │
│ eth1                  27046           27046         0               0                 32,40 kbits/sec       │
│                                                                                                     │
 lo  : 100%
 eth0:  68%
 eth1: 0.4%

 IPTraf
┌ Iface ─────── Total ───────── IP ──────────── NonIP ───────── BadIP ───────── Activity ───────────────┐
│ lo          14598637        14598637        0               0               3222,40 kbits/sec       │
│ eth0                 5860088         5860088        0               0               1858,00 kbits/sec       │
│ eth1                 5151836         5151836        0               0                 30,20 kbits/sec       │
│                                                                                                     │
 lo  : 100%
 eth0:  58%
 eth1:   1%

 IPTraf
┌ Iface ─────── Total ───────── IP ──────────── NonIP ───────── BadIP ───────── Activity ───────────────┐
│ lo          15455154        15455154        0               0               2382,00 kbits/sec       │
│ eth0                 6364436         6364436        0               0               1344,20 kbits/sec       │
│ eth1                 5179883         5179883        0               0                 37,00 kbits/sec       │
│                                                                                                     │
 lo  : 100%
 eth0:  56%
 eth1: 1.6%

 IPTraf
┌ Iface ─────── Total ───────── IP ──────────── NonIP ───────── BadIP ───────── Activity ───────────────┐
│ lo          576846          576846          0               0               3115,40 kbits/sec       │
│ eth0                264486          264486          0               0               1225,40 kbits/sec       │
│ eth1                 14809           14809          0               0                 33,80 kbits/sec       │
│                                                                                                     │
 lo  : 100%
 eth0:  39%
 eth1:   1%
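
The percentage lines under each IPTraf snapshot appear to be each interface's Activity value relative to that of lo (for the first snapshot: 1555,40 / 3018,80 is roughly 51%). A small sketch of that calculation follows. The helper name `rel_activity` and its input format (one "IFACE KBITS" pair per line, copied by hand from the table) are assumptions made for this example.

```shell
# rel_activity   (hypothetical helper)
# Reads "IFACE KBITS" pairs on stdin, decimal comma allowed as in the
# IPTraf output above, and prints each interface's activity as a
# percentage of the value on the first line (here: lo).
rel_activity() {
    LC_ALL=C awk '
        { gsub(/,/, ".", $2) }             # 3018,80 -> 3018.80
        NR == 1 { base = $2 + 0 }          # first line is the reference
        { printf "%-5s %5.1f%%\n", $1 ":", ($2 + 0) / base * 100 }
    '
}
```

Usage: `printf 'lo 3018,80\neth0 1555,40\neth1 2778,80\n' | rel_activity`.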

Tests from 2012

Guides used:

Server

Installation:

# aptitude install glusterfs-server
Paketlisten werden gelesen... Fertig
Abhängigkeitsbaum wird aufgebaut       
Status-Informationen einlesen... Fertig
Lese erweiterte Statusinformationen      
Initialisiere Paketstatus... Fertig
Die folgenden NEUEN Pakete werden zusätzlich installiert:
  glusterfs-client{a} glusterfs-server libglusterfs0{a} libibverbs1{a} 
0 Pakete aktualisiert, 4 zusätzlich installiert, 0 werden entfernt und 0 nicht aktualisiert.
Muss 1.458kB an Archiven herunterladen. Nach dem Entpacken werden 4.477kB zusätzlich belegt sein.
Wollen Sie fortsetzen? [Y/n/?]

# glusterfs --version
glusterfs 3.0.2 built on Mar 23 2010 00:24:16
Repository revision: v3.0.2
Copyright (c) 2006-2009 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.

Partition:

# lvcreate -L 100G -n glusterfs data
# mkfs -L glusterfs -t ext4 /dev/data/glusterfs
# mkdir /glusterfs
# echo 'LABEL=glusterfs    /glusterfs    ext4    defaults    0 2' >> /etc/fstab
# mount /glusterfs

Configuration file:

# cp /etc/glusterfs/glusterfsd.vol /etc/glusterfs/glusterfsd.vol_orig
# vi /etc/glusterfs/glusterfsd.vol

### file: server-volume.vol.sample

#####################################
###  GlusterFS Server Volume File  ##
#####################################

#### CONFIG FILE RULES:
### "#" is comment character.
### - Config file is case sensitive
### - Options within a volume block can be in any order.
### - Spaces or tabs are used as delimitter within a line.
### - Multiple values to options will be : delimitted.
### - Each option should end within a line.
### - Missing or commented fields will assume default values.
### - Blank/commented lines are allowed.
### - Sub-volumes should already be defined above before referring.

volume posix
  type storage/posix
  option directory /glusterfs
end-volume

volume locks
  type features/locks
  subvolumes posix
end-volume

volume brick
  type performance/io-threads
  option thread-count 8
  subvolumes locks
end-volume

volume server
  type protocol/server
  option transport-type tcp
  
### Permission for the client (specify client IPs or the client network)
  #option auth.addr.brick.allow * # Allow access to "brick" volume
  #option auth.addr.brick.allow 192.168.0.4,192.168.0.5
  option auth.addr.brick.allow 192.168.3.*
  
  subvolumes brick
end-volume

Starting the server:

# /etc/init.d/glusterfs-server start
# netstat -an | fgrep 6996
tcp        0      0 0.0.0.0:6996            0.0.0.0:*               LISTEN

Client

Installation:

# aptitude install glusterfs-client
Paketlisten werden gelesen... Fertig
Abhängigkeitsbaum wird aufgebaut       
Status-Informationen einlesen... Fertig
Reading extended state information      
Initializing package states... Fertig
The following NEW packages will be installed:
  glusterfs-client libglusterfs0{a} libibverbs1{a} 
0 packages upgraded, 3 newly installed, 0 to remove and 14 not upgraded.
Need to get 1.300kB of archives. After unpacking 4.215kB will be used.
Do you want to continue? [Y/n/?]

Configuration:

# mkdir /glusterfs
# echo '/etc/glusterfs/glusterfs.vol  /glusterfs  glusterfs  defaults  0  0' >> /etc/fstab
# cp /etc/glusterfs/glusterfs.vol /etc/glusterfs/glusterfs.vol_orig
# vi /etc/glusterfs/glusterfs.vol

### file: client-volume.vol.sample

#####################################
###  GlusterFS Client Volume File  ##
#####################################

#### CONFIG FILE RULES:
### "#" is comment character.
### - Config file is case sensitive
### - Options within a volume block can be in any order.
### - Spaces or tabs are used as delimitter within a line. 
### - Each option should end within a line.
### - Missing or commented fields will assume default values.
### - Blank/commented lines are allowed.
### - Sub-volumes should already be defined above before referring.

volume remote1
  type protocol/client
  option transport-type tcp
  option remote-host server01.idstein.victorvox.net
  option remote-subvolume brick
end-volume

volume remote2
  type protocol/client
  option transport-type tcp
  option remote-host server02.idstein.victorvox.net
  option remote-subvolume brick
end-volume

volume replicate
  type cluster/replicate
  subvolumes remote1 remote2
end-volume

volume writebehind
  type performance/write-behind
  option window-size 1MB
  subvolumes replicate
end-volume

volume cache
  type performance/io-cache
  option cache-size 512MB
  subvolumes writebehind
end-volume

Mounting

Mount variant 1

Mount:

# glusterfs -f /etc/glusterfs/glusterfs.vol /glusterfs

Mount variant 2

Mount:

# mount -t glusterfs /etc/glusterfs/glusterfs.vol /glusterfs

Mount variant 3

One-time preparation:

# mkdir /glusterfs
# echo '/etc/glusterfs/glusterfs.vol  /glusterfs  glusterfs  defaults  0  0' >> /etc/fstab

Mount:

# mount /glusterfs
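
The same `echo ... >> /etc/fstab` line appears both in the client configuration and in variant 3, so following both steps appends the entry twice. A sketch of an idempotent alternative follows. The helper name `ensure_line` is made up for this example.

```shell
# ensure_line FILE LINE   (hypothetical helper)
# Appends LINE to FILE unless an identical line is already present.
ensure_line() {
    file=$1
    line=$2
    # -q: quiet, -x: match whole lines only, -F: fixed string, no regex
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

Usage: `ensure_line /etc/fstab '/etc/glusterfs/glusterfs.vol  /glusterfs  glusterfs  defaults  0  0'`.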

Tests

Server

No data on Server01 yet:

benutzer@server01:~# ls -lha /glusterfs
total 120K
drwxr-xr-x 3 root root 4,0K 2012-04-23 16:27 .
drwx------ 2 root root  16K 2012-04-19 15:10 lost+found

No data on Server02 yet:

benutzer@server02:~# ls -lha /glusterfs
total 120K
drwxr-xr-x 3 root root 4,0K 2012-04-23 16:27 .
drwx------ 2 root root  16K 2012-04-19 15:10 lost+found

Client

No data in the cluster:

benutzer@client01:~# ls -lha /glusterfs
total 120K
drwxr-xr-x 3 root root 4,0K 2012-04-23 16:27 .
drwx------ 2 root root  16K 2012-04-19 15:10 lost+found

Write a test file into the cluster:

benutzer@client01:~# echo "Test OK" > /glusterfs/test.txt

The test file is now in the cluster:

benutzer@client01:~# ls -lha /glusterfs
total 120K
drwxr-xr-x 3 root root 4,0K 2012-04-23 16:27 .
drwx------ 2 root root  16K 2012-04-19 15:10 lost+found
-rw-r--r-- 1 root root    8 2012-04-23 16:25 test.txt

Server

The test file is now on Server01:

benutzer@Server01:~# ls -lha /glusterfs
total 120K
drwxr-xr-x 3 root root 4,0K 2012-04-23 16:27 .
drwx------ 2 root root  16K 2012-04-19 15:10 lost+found
-rw-r--r-- 1 root root    8 2012-04-23 16:25 test.txt

The test file is now on Server02:

benutzer@Server02:~# ls -lha /glusterfs
total 120K
drwxr-xr-x 3 root root 4,0K 2012-04-23 16:27 .
drwx------ 2 root root  16K 2012-04-19 15:10 lost+found
-rw-r--r-- 1 root root    8 2012-04-23 16:25 test.txt

Peculiarity

The ease of configuration comes at the price of an unusual peculiarity of GlusterFS.

If a server (cluster node) cannot be reached from the client, the data is written only to the server (cluster node) that is reachable.

Once the second server (cluster node) is reachable again, the data is synchronised only after the next client access to the cluster!
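
Since synchronisation only happens on client access, one option is to walk the whole tree through the client mount once a node is back. In the 2009 test, a "find" or "ls -lR" had that effect. The sketch below wraps such a walk. The function name `walk_mount` is invented for this example, and whether a plain stat of every entry is sufficient on a given GlusterFS version is an assumption.

```shell
# walk_mount MOUNTPOINT   (hypothetical helper)
# Looks at every entry below MOUNTPOINT once through the client mount
# and prints how many entries were visited.
walk_mount() {
    # ls -ld stats each entry; only the number of output lines is kept.
    find "$1" -exec ls -ld {} + | wc -l
}
```

Usage: `walk_mount /glusterfs` on a client after the failed node has come back.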

This can, however, lead to the following effect. In this example there is a cluster with two nodes and two client machines. Client01 stores data in the cluster; then "cluster node 2" dies and Client01 deletes some data from the cluster. Now "cluster node 2" comes back into service and Client02 accesses the data that Client01 had been working on. It notices that not all data is present on both cluster nodes and tries to synchronise it again, and in doing so it restores at least some of the previously deleted data. I do not know why not all of the data is always restored. Possibly this is because the clocks of the two cluster nodes are not in sync…

The client decides which data it trusts and which data it uses. This spares the administrator the configuration of Heartbeat and the manual intervention that DRBD requires when both nodes have been down at the same time. A cluster failover, with the downtime that results from it, is not needed in this setup either.