Gluster: What happens if …?

This article lists some situations you might find yourself in with Gluster and what you can expect. Basically I’ve tried to break, push and put Gluster into impossible situations to see what happens so you don’t have to. Just read this article and you’ll save yourself the pain.

I will add more scenarios as I think of them and have time to test them.

What happens if I mess it all up and want to start again?

Summary: You can clear out your gluster environment and start again easily. Consider backing up your data before following through with these steps because we’re about to blow it all away.

To start again and remove all existing Gluster information and settings, I did the following.

I logged into one of the Gluster servers (server1) and ran the commands below, where “agixvol1” is my volume name and the pool has 3 servers with 1 brick each:

gluster volume stop agixvol1
gluster volume delete agixvol1
gluster peer detach server2
gluster peer detach server3

On each server, issue these commands (glusterd is stopped first so that it is not running while its state directory is removed):

systemctl stop glusterd
rm -rf /mnt/brick1/agixvol1/*
rm -rf /mnt/brick1/agixvol1/.glusterfs
rm -rf /var/lib/glusterd
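
The reset steps can be collected into one small script. This is a minimal sketch rather than a tested tool: `gluster_reset_plan` is a hypothetical helper name, and it only prints the commands (using the volume name, peers and brick path from this article) so they can be reviewed before anything destructive is run. It lists `systemctl stop glusterd` ahead of the `rm` commands, on the assumption that the daemon should not be running while its state directory is deleted.

```shell
# Print (not run) the commands for wiping a Gluster setup.
# gluster_reset_plan is a hypothetical helper name, not part of Gluster.
# Usage: gluster_reset_plan VOLUME BRICK_DIR PEER...
gluster_reset_plan() {
    vol="$1"; brick="$2"
    shift 2

    echo "# on one server:"
    echo "gluster volume stop ${vol}"
    echo "gluster volume delete ${vol}"
    for peer in "$@"; do
        echo "gluster peer detach ${peer}"
    done

    echo "# on every server:"
    echo "systemctl stop glusterd"
    echo "rm -rf ${brick}/*"
    echo "rm -rf ${brick}/.glusterfs"
    echo "rm -rf /var/lib/glusterd"
}

gluster_reset_plan agixvol1 /mnt/brick1/agixvol1 server2 server3
```

Printing the plan instead of executing it keeps the sketch safe to try on any machine; the lines can be pasted into a shell once they look right.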

What happens if I take one of the three Gluster servers (in replicate mode) off the network, make changes to files on that server while it’s off the network, and then re-join that server to the network?

Summary: The most recently changed file overwrites the other copies regardless of where it was modified. (Note that I suspect this isn’t always the case.) In my test below, I modified a file on an off-line Gluster server (off the network) and then re-joined it to the network, and found that the modified file from the off-line server overwrote the same file on the other Gluster servers.

To test this I did the following:

Logged into a client, created a file called “testfile” on the Gluster storage mount and put the string “this is the original file” inside it.
Disconnected one of the servers from the network and, on that off-line server, changed the contents of “testfile” to “this is a modified file”.
Reconnected the off-line server to the network.
The modified file from the off-line server overwrote the files on the on-line servers.
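
One way to confirm which version survived is to compare checksums of the copies. The sketch below assumes the copies are reachable as local paths (for example, after copying “testfile” from each server's brick to one machine); `copies_match` is a hypothetical helper name, not a Gluster command.

```shell
# Report whether all given copies of a file have identical content.
# copies_match is a hypothetical helper, not part of Gluster.
# Returns 0 if every copy matches the first, 1 if any differs,
# 2 if the first file cannot be read.
copies_match() {
    first="$(cksum < "$1")" || return 2
    shift
    for f in "$@"; do
        [ "$(cksum < "$f")" = "$first" ] || return 1
    done
    return 0
}

# Example (paths are assumptions):
#   copies_match server1-testfile server2-testfile server3-testfile \
#       && echo "all copies identical"
```

On a live pool the same check can be done by hand by running `cksum /mnt/brick1/agixvol1/testfile` on each server and comparing the output.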

What happens to files that existed on a brick (in a directory that will be used as a brick) before the brick becomes part of a volume?

Summary: Odd behavior. The files that existed on the brick prior to that brick joining a volume (the pre-existing files) are included in replication but only as meta-data and not the real file. Files created post-volume exist and replicate as normal.

To test this I did the following:

Before the Gluster pool was created, I created a set of files on one of the bricks that would become part of the pool.
I created a pool of three servers (2 replicas and an arbiter) and added those servers to a newly created volume.
I checked and the files had not copied over from server1 (where the files existed).
However, I mounted the volume via server2 from a client and found the files were accessible from there.
I logged into server2 and checked the brick directly but didn’t see the files that I expected to be there.
I created a new file on server1 directly on the brick and it too was accessible only via the mount (from the client to server2) but not directly on server2’s brick.
I removed server1 by stopping glusterd.
All files became accessible via the mount (from the client to server2). But directly on the brick, the old (pre-existing) files were not listed while the new file (post-volume) did exist.
I can’t modify or even open the pre-existing files but I can with post-volume files. The pre-existing files all appear to be there, but it’s likely that only their metadata is there and not the actual files.

The bottom line: do it right and create files only after the volume is created and working.
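
A simple guard against this situation is to check that a directory is completely empty before using it as a brick. This is a minimal sketch; `brick_is_empty` is a hypothetical helper name, and the brick path in the example is the one used elsewhere in this article.

```shell
# Succeed only if DIR exists and contains nothing at all (hidden files included).
# brick_is_empty is a hypothetical helper name, not a Gluster command.
# Returns 0 if empty, 1 if it has content, 2 if it is not a directory.
brick_is_empty() {
    [ -d "$1" ] || return 2
    [ -z "$(ls -A "$1")" ]
}

if brick_is_empty /mnt/brick1/agixvol1; then
    echo "directory is empty, safe to use as a new brick"
else
    echo "directory is missing or already has files in it"
fi
```

`ls -A` is used so that hidden entries such as a leftover `.glusterfs` directory also count as content.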

What happens if I have a distributed replicated pool of Gluster servers and take one of the servers off-line (off the network), leaving the remaining three, and then take the third server (server3) off-line as well?

Summary: This works fine as you’d expect. The interesting thing is how the files are placed during their creation and where they end up during the partial outage. It was also interesting to see a pause (a hang of about 20 seconds) when I took the servers off-line, even though my client was connected to a different server (one that remained on).

Note: The Gluster pool was created using the following command. The bricks are placed on servers in the order that they appear in the command.

gluster volume create agixvol1 replica 2 transport tcp server1:/mnt/brick1/agixvol1 server2:/mnt/brick1/agixvol1 server3:/mnt/brick1/agixvol1 server4:/mnt/brick1/agixvol1 force

In the above example, server1 and server2 replicate (are mirrors of each other) while server3 and server4 replicate with each other.
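
The pairing rule described above (with “replica 2”, each consecutive pair of bricks on the command line forms one mirror) can be sketched as a small function. `replica_sets` is a hypothetical helper for illustration only; it takes server names rather than full brick paths to keep the output short.

```shell
# Print the replica sets implied by brick order: with "replica N", each
# consecutive group of N bricks on the command line mirrors each other.
# replica_sets is a hypothetical helper for illustration only.
# Usage: replica_sets N BRICK...
replica_sets() {
    n="$1"; shift
    i=0; set_no=1; line=""
    for brick in "$@"; do
        line="${line:+$line }$brick"
        i=$((i + 1))
        if [ "$i" -eq "$n" ]; then
            echo "set ${set_no}: ${line}"
            set_no=$((set_no + 1)); i=0; line=""
        fi
    done
}

replica_sets 2 server1 server2 server3 server4
# prints:
# set 1: server1 server2
# set 2: server3 server4
```

Running it with a different brick order shows why the order in the create command matters: swapping server2 and server3 would pair server1 with server3 instead.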

I logged into server1 and created a mount point to the Gluster volume.
I created 6 files called “myfileX” where X increments from 1 to 6.
Files 3 and 5 ended up on server1 and server2.
Files 1, 2, 4 and 6 ended up on server3 and server4.

They aren’t evenly spread. I deleted the files and recreated them to see how consistent their placement is. Here’s the outcome:

Files 3 and 4 ended up on server1 and server2.
Files 1, 2, 5 and 6 ended up on server3 and server4.
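
The file-creation step used in both runs can be sketched as a loop. `make_test_files` is a hypothetical helper name, and the mount path in the example comment is an assumption; the directory passed in should be the client mount of the volume, never a brick directory.

```shell
# Create COUNT empty files named myfile1..myfileCOUNT in DIR.
# make_test_files is a hypothetical helper; DIR is expected to be the
# client mount of the volume, not a brick directory.
make_test_files() {
    dir="$1"; count="$2"; i=1
    while [ "$i" -le "$count" ]; do
        : > "${dir}/myfile${i}"
        i=$((i + 1))
    done
}

# Example (mount path is an assumption):
#   make_test_files /mnt/glusterfs 6
# Then, on each server, list the brick to see which files landed there:
#   ls /mnt/brick1/agixvol1
```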

Knowing how the files are laid out, we can now start testing the removal (take off-line) of a server or two.

I took server4 off-line (off the network) and interestingly server1’s mount point hung. I did not expect the pause of 20 seconds (give or take) because a) the mount was “from” and “to” the local system (server1), and b) surely Gluster would ensure a nice user experience and not hang when it doesn’t need to. After the hang ended, I saw all the same files, exactly as they should be. Obviously there was no longer any redundancy for the files stored on server3 (and previously server4).
I took server3 off-line (off the network). From the mount on server1 I was no longer able to see the files that were stored on server3 and server4.
I re-joined both server3 and server4 to the network and the files returned as you’d expect.
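
To put a number on the pause rather than estimating it, a command against the mount can be timed. This is a minimal sketch; `elapsed_seconds` is a hypothetical helper name and the mount path in the example comment is an assumption.

```shell
# Run a command and print how many whole seconds it took.
# elapsed_seconds is a hypothetical helper name. The command's own output
# and exit status are discarded; only the duration is reported.
elapsed_seconds() {
    start="$(date +%s)"
    "$@" > /dev/null 2>&1 || true
    end="$(date +%s)"
    echo $((end - start))
}

# Example (mount path is an assumption): run this right after unplugging a server.
#   elapsed_seconds ls /mnt/glusterfs
```

Gluster also has a `network.ping-timeout` volume option (readable with `gluster volume get agixvol1 network.ping-timeout`) that controls how long a client waits on an unresponsive server. Whether it accounts for the pause seen here is untested.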

What happens if I take two of the three servers (in replicate mode) off-line (off the network), create a large file on each of those off-line servers taking up more than 50% of the available disk space, and then re-join those two servers to the network? It’s not possible to replicate both files to the other servers as there isn’t enough room to do it.

Summary: This was unrecoverable for me. Read on to see my troubleshooting steps.

Note: All Gluster servers have a single brick of 2GB.

To test this I did the following:

Disconnected Gluster servers server2 and server3 from the network leaving only server1 online.
Created a 1.5GB file called “server2.large” on server2 in the Gluster brick, via the brick directly and not via a mount point. This file takes up the majority of the 2GB filesystem.
Created a 1.5GB file called “server3.large” on server3 in the Gluster brick, via the brick directly and not via a mount point. This file takes up the majority of the 2GB filesystem.
Re-joined both server2 and server3 to the network.
Neither of the two large “serverX.large” files was replicated to any other servers.
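
The space problem in this test is simple arithmetic: two 1.5GB files need 3GB, and each brick holds 2GB. A minimal sketch of that check, with sizes in MB and `files_fit` as a hypothetical helper name:

```shell
# Succeed if all files (sizes in MB) fit together on a brick of CAPACITY MB.
# files_fit is a hypothetical helper; the sizes below are this article's figures.
# Usage: files_fit CAPACITY_MB SIZE_MB...
files_fit() {
    capacity="$1"; shift
    total=0
    for size in "$@"; do
        total=$((total + size))
    done
    [ "$total" -le "$capacity" ]
}

# Two 1.5GB files (1536MB each) against a 2GB (2048MB) brick.
if files_fit 2048 1536 1536; then
    echo "both files fit"
else
    echo "both files cannot fit: 3072MB needed, 2048MB available"
fi
```

Either file fits on its own, which is why a full replica of both can never exist on any one brick in this setup.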

At this point I did some further tests to see how broken this is:

Created a small file on both server2 and server3.
Those files were not replicated to the other servers.
At this stage it looks pretty broken. However, when I created a file on server1, that file was replicated to the other two servers (server2 and server3).

At this point files can replicate from server1 to server2 and server3 but not the other way around. I would conclude that this is due to the large file blocking the replication queue (if there is one).

I removed the large file from server2 while it was still on the network.
The large file on server3 did NOT replicate to any other servers. This is unexpected. I would have expected the replication of a single file to go through fine but this didn’t happen.
Again I tested creating a small file on server1 and it was immediately replicated to the other two servers.
I rebooted server2 but that did not resolve the one-way replication issue.
I rebooted server3 but that did not resolve the one-way replication issue.
I rebooted server1 but that did not resolve the one-way replication issue.
The command “gluster volume heal VOLUMENAME info” was unsuccessful in resolving this issue.
The command “gluster volume heal VOLUMENAME info split-brain” was unsuccessful in resolving this issue.
The command “gluster volume sync serverX” was unsuccessful in resolving this issue.
Removing and re-adding server2 and server3 to the Gluster pool was unsuccessful in resolving this issue.
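
Note that the two `heal ... info` commands only report entries that need healing; they do not start a heal. The commands that ask Gluster to heal are `gluster volume heal VOLUMENAME` and `gluster volume heal VOLUMENAME full`. Whether they would have helped in this test is untested, and since the files were written straight into the bricks rather than through a mount, Gluster may not be tracking them at all. The sketch below only prints the commands for review; `heal_plan` is a hypothetical helper name.

```shell
# Print the heal-related commands for a volume.
# heal_plan is a hypothetical helper; it prints commands rather than running them.
heal_plan() {
    vol="$1"
    echo "gluster volume heal ${vol} info"              # report entries pending heal
    echo "gluster volume heal ${vol}"                   # heal entries already flagged
    echo "gluster volume heal ${vol} full"              # crawl the bricks and heal
    echo "gluster volume heal ${vol} info split-brain"  # report split-brain entries
}

heal_plan agixvol1
```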

At this point I gave up. I may come back to this at a later time.
