Here is the story of how I managed to install a two-node GlusterFS setup on CentOS, plus one client for test purposes.
In my case the hostnames and the IPs were:
192.168.183.235 s1
192.168.183.236 s2
192.168.183.237 c1
Append these lines to the end of /etc/hosts on all three machines to make sure that simple name resolution works.
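If you prefer doing it from the shell, the same entries can be appended in one go:
cat >> /etc/hosts <<'EOF'
192.168.183.235 s1
192.168.183.236 s2
192.168.183.237 c1
EOF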
Execute the following on both servers.
rpm -ivh http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-5.noarch.rpm
wget -P /etc/yum.repos.d http://download.gluster.org/pub/gluster/glusterfs/3.7/3.7.5/CentOS/glusterfs-epel.repo
yum -y install glusterfs glusterfs-fuse glusterfs-server
There is no need to install any of the Samba packages if you don't intend to use SMB.
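Just to be sure that the 3.7.5 packages from the Gluster repo got installed (and not something older), a quick check doesn't hurt:
glusterfs --version
rpm -qa | grep glusterfs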
systemctl enable glusterd.service
Created symlink from /etc/systemd/system/multi-user.target.wants/glusterd.service to /usr/lib/systemd/system/glusterd.service.
Both servers had a second 20 GB disk named sdb. On each of them I created two LVs, one per brick.
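(Only the vgcreate on s1 made it into my scrollback below; the same volume group obviously had to be created on s2 as well before its lvcreate commands could succeed, i.e. vgcreate glustervg /dev/sdb there too.)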
[root@s2 ~]# lvcreate -L 9G -n brick2 glustervg
Logical volume "brick2" created.
[root@s2 ~]# lvcreate -L 9G -n brick1 glustervg
Logical volume "brick1" created.
[root@s1 ~]# vgcreate glustervg /dev/sdb
Volume group "glustervg" successfully created
[root@s1 ~]# lvcreate -L 9G -n brick2 glustervg
Logical volume "brick2" created.
[root@s1 ~]# lvcreate -L 9G -n brick1 glustervg
Logical volume "brick1" created.
[root@s2 ~]# pvdisplay
--- Physical volume ---
PV Name /dev/sdb
VG Name glustervg
PV Size 20.00 GiB / not usable 4.00 MiB
Allocatable yes
PE Size 4.00 MiB
Total PE 5119
Free PE 511
Allocated PE 4608
PV UUID filZyX-wR7W-luFX-Asyn-fYA3-f7tf-q4xGyU
[...]
[root@s2 ~]# lvdisplay
--- Logical volume ---
LV Path /dev/glustervg/brick2
LV Name brick2
VG Name glustervg
LV UUID Rx3FPi-S3ps-x3Z0-FZrU-a2tq-IxS0-4gD2YQ
LV Write Access read/write
LV Creation host, time s2, 2016-05-18 16:02:41 +0200
LV Status available
# open 0
LV Size 9.00 GiB
Current LE 2304
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:3
--- Logical volume ---
LV Path /dev/glustervg/brick1
LV Name brick1
VG Name glustervg
LV UUID P5slcZ-dC7R-iFWv-e0pY-rvyb-YrPm-FM7YuP
LV Write Access read/write
LV Creation host, time s2, 2016-05-18 16:02:43 +0200
LV Status available
# open 0
LV Size 9.00 GiB
Current LE 2304
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:4
[...]
[root@s1 ~]# lvdisplay
--- Logical volume ---
LV Path /dev/glustervg/brick2
LV Name brick2
VG Name glustervg
LV UUID 7yC2Wl-0lCJ-b7WZ-rgy4-4BMl-mT0I-CUtiM2
LV Write Access read/write
LV Creation host, time s1, 2016-05-18 16:01:56 +0200
LV Status available
# open 0
LV Size 9.00 GiB
Current LE 2304
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:2
--- Logical volume ---
LV Path /dev/glustervg/brick1
LV Name brick1
VG Name glustervg
LV UUID X6fzwM-qdRi-BNKH-63fa-q2O9-jvNw-u2geA2
LV Write Access read/write
LV Creation host, time s1, 2016-05-18 16:02:05 +0200
LV Status available
# open 0
LV Size 9.00 GiB
Current LE 2304
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:3
[...]
[root@s1 ~]# mkfs.xfs /dev/glustervg/brick1
meta-data=/dev/glustervg/brick1 isize=256 agcount=4, agsize=589824 blks
= sectsz=4096 attr=2, projid32bit=1
= crc=0 finobt=0
data = bsize=4096 blocks=2359296, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=0
log =internal log bsize=4096 blocks=2560, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
[root@s1 ~]# mkfs.xfs /dev/glustervg/brick2
meta-data=/dev/glustervg/brick2 isize=256 agcount=4, agsize=589824 blks
= sectsz=4096 attr=2, projid32bit=1
= crc=0 finobt=0
data = bsize=4096 blocks=2359296, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=0
log =internal log bsize=4096 blocks=2560, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
[root@s1 ~]# mkdir -p /gluster/brick{1,2}
[root@s2 ~]# mkdir -p /gluster/brick{1,2}
[root@s1 ~]# mount /dev/glustervg/brick1 /gluster/brick1 && mount /dev/glustervg/brick2 /gluster/brick2
[root@s2 ~]# mount /dev/glustervg/brick1 /gluster/brick1 && mount /dev/glustervg/brick2 /gluster/brick2
Add the following lines to /etc/fstab on both servers:
/dev/mapper/glustervg-brick1 /gluster/brick1 xfs rw,relatime,seclabel,attr2,inode64,noquota 0 0
/dev/mapper/glustervg-brick2 /gluster/brick2 xfs rw,relatime,seclabel,attr2,inode64,noquota 0 0
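To verify the fstab entries without a reboot (this is just how I'd sanity-check it), unmount the bricks and let mount -a pick them up again:
umount /gluster/brick1 /gluster/brick2
mount -a
df -h | grep brick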
[root@s1 etc]# systemctl start glusterd.service
Making sure it is running:
[root@s1 etc]# ps ax|grep gluster
1010 ? Ssl 0:00 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
[root@s1 etc]# gluster peer probe s2
peer probe: success.
[root@s2 etc]# gluster peer status
Number of Peers: 1
Hostname: 192.168.183.235
Uuid: f5bdc3f3-0b43-4a83-86c1-c174594566b9
State: Peer in Cluster (Connected)
[root@s1 etc]# gluster pool list
UUID Hostname State
01cf8a70-d00f-487f-875e-9e38d4529b57 s2 Connected
f5bdc3f3-0b43-4a83-86c1-c174594566b9 localhost Connected
[root@s1 etc]# gluster volume status
No volumes present
[root@s2 etc]# gluster volume info
No volumes present
[root@s1 etc]# mkdir /gluster/brick1/mpoint1
[root@s2 etc]# mkdir /gluster/brick1/mpoint1
[root@s1 gluster]# gluster volume create myvol1 replica 2 transport tcp s1:/gluster/brick1/mpoint1 s2:/gluster/brick1/mpoint1
volume create: myvol1: failed: Staging failed on s2. Error: Host s1 is not in 'Peer in Cluster' state
Ooooops....
[root@s2 glusterfs]# ping s1
ping: unknown host s1
I forgot to check name resolution. When I fixed this and tried to create the volume again, I got:
[root@s1 glusterfs]# gluster volume create myvol1 replica 2 transport tcp s1:/gluster/brick1/mpoint1 s2:/gluster/brick1/mpoint1
volume create: myvol1: failed: /gluster/brick1/mpoint1 is already part of a volume
WTF ??
[root@s1 glusterfs]# gluster volume get myvol1 all
volume get option: failed: Volume myvol1 does not exist
[root@s1 glusterfs]# gluster
gluster>
exit global help nfs-ganesha peer pool quit snapshot system:: volume
gluster> volume
add-brick bitrot delete heal inode-quota profile remove-brick set status tier
attach-tier clear-locks detach-tier help list quota replace-brick start stop top
barrier create get info log rebalance reset statedump sync
gluster> volume l
list log
gluster> volume list
No volumes present in cluster
That's odd! Hmm. I thought it'd work:
[root@s1 /]# rm /gluster/brick1/mpoint1
[root@s1 /]# gluster volume create myvol1 replica 2 transport tcp s1:/gluster/brick1/mpoint1 s2:/gluster/brick1/mpoint1
volume create: myvol1: success: please start the volume to access data
[root@s1 /]# gluster volume list
myvol1
Yep. Success. Phuhh.
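For the record: deleting the brick directory does the job, but the cleaner and more commonly recommended way to get rid of the "already part of a volume" state is to strip the GlusterFS extended attributes from the brick path instead of removing it (from memory, so double-check before relying on it):
setfattr -x trusted.glusterfs.volume-id /gluster/brick1/mpoint1
setfattr -x trusted.gfid /gluster/brick1/mpoint1
rm -rf /gluster/brick1/mpoint1/.glusterfs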
[root@s1 /]# gluster volume start myvol1
volume start: myvol1: success
[root@s2 etc]# gluster volume list
myvol1
[root@s2 etc]# gluster volume status
Status of volume: myvol1
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick s1:/gluster/brick1/mpoint1 49152 0 Y 2528
Brick s2:/gluster/brick1/mpoint1 49152 0 Y 10033
NFS Server on localhost 2049 0 Y 10054
Self-heal Daemon on localhost N/A N/A Y 10061
NFS Server on 192.168.183.235 2049 0 Y 2550
Self-heal Daemon on 192.168.183.235 N/A N/A Y 2555
Task Status of Volume myvol1
------------------------------------------------------------------------------
There are no active volume tasks
[root@s1 ~]# gluster volume create myvol2 s1:/gluster/brick2/mpoint2 s2:/gluster/brick2/mpoint2 force
volume create: myvol2: success: please start the volume to access data
[root@s1 ~]# gluster volume start myvol2
volume start: myvol2: success
[root@s1 ~]# gluster volume info
Volume Name: myvol1
Type: Replicate
Volume ID: 633b765b-c630-4007-91ca-dc42714bead4
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: s1:/gluster/brick1/mpoint1
Brick2: s2:/gluster/brick1/mpoint1
Options Reconfigured:
performance.readdir-ahead: on
Volume Name: myvol2
Type: Distribute
Volume ID: ebfa9134-0e6a-40be-8045-5b16436b88ed
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: s1:/gluster/brick2/mpoint2
Brick2: s2:/gluster/brick2/mpoint2
Options Reconfigured:
performance.readdir-ahead: on
On the client:
[root@c1 ~]# wget -P /etc/yum.repos.d http://download.gluster.org/pub/gluster/glusterfs/LATEST/CentOS/glusterfs-epel.repo
[...]
[root@c1 ~]# yum -y install glusterfs glusterfs-fuse
[....]
[root@c1 ~]# mkdir /g{1,2}
[root@c1 ~]# mount.glusterfs s1:/myvol1 /g1
[root@c1 ~]# mount.glusterfs s1:/myvol2 /g2
[root@c1 ~]# mount
[...]
s1:/myvol1 on /g1 type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
s2:/myvol2 on /g2 type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
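To make these mounts survive a reboot, something along these lines could go into the client's /etc/fstab (a sketch, untested here; _netdev makes sure the network is up before mounting):
s1:/myvol1 /g1 glusterfs defaults,_netdev 0 0
s1:/myvol2 /g2 glusterfs defaults,_netdev 0 0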
[root@c1 ]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root 28G 1.1G 27G 4% /
devtmpfs 422M 0 422M 0% /dev
tmpfs 431M 0 431M 0% /dev/shm
tmpfs 431M 5.7M 426M 2% /run
tmpfs 431M 0 431M 0% /sys/fs/cgroup
/dev/sda1 494M 164M 331M 34% /boot
tmpfs 87M 0 87M 0% /run/user/0
s1:/myvol1 9.0G 34M 9.0G 1% /g1 (9G, because the two 9G bricks are replicated, i.e. RAID 1 over the network)
s2:/myvol2 18G 66M 18G 1% /g2 (18G, because the two 9G bricks are distributed, i.e. JBOD over the network)
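A quick way to see replicate vs. distribute in action (hypothetical file names, but this is how I'd check it) is to drop a few files into each mount on c1 and then look at the bricks on the servers:
[root@c1 ~]# touch /g1/replicated-test
[root@c1 ~]# for i in $(seq 1 10); do echo hello > /g2/file$i; done
The replicated-test file should show up in /gluster/brick1/mpoint1 on both s1 and s2, while the ten files written to /g2 should end up spread between the two /gluster/brick2/mpoint2 directories.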
What is the difference between distributing and striping? Here are two short snippets from the glusterhacker blog:
Distribute: A distribute volume is one in which all the data of the volume is distributed throughout the bricks. Based on an algorithm that takes into account the size available in each brick, the data will be stored in any one of the available bricks.
[...] The default volume type is distribute, hence my myvol2 got distributed.
Stripe: A stripe volume is one in which the data being stored in the backend is striped into units of a particular size among the bricks. The default unit size is 128KB, but it's configurable. If we create a striped volume with a stripe count of 3 and then create a 300KB file at the mount point, the first 128KB will be stored in the first sub-volume (a brick in our case), the next 128KB in the second, and the remaining 44KB in the third. The number of bricks should be a multiple of the stripe count.
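So, had I created a striped volume instead (purely hypothetical here, with mpoint3 as a not-yet-used brick directory, and remembering that the brick count has to be a multiple of the stripe count), the command with my two bricks would look roughly like this:
gluster volume create mystripevol stripe 2 transport tcp s1:/gluster/brick2/mpoint3 s2:/gluster/brick2/mpoint3
gluster volume start mystripevol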
The very usable official howto is here.
Performance testing and split-brain handling: to be continued...