CentOS 5 HA Cluster Speeds and Feeds
Last week I posted about our move to replace our trusty Tru64 TruCluster NAS server with a new CentOS 5 based NAS solution. I said then that I would peel back the covers a bit and show the test results from our qualification runs. In fact, a few posts back I said we were going to be open about this whole project, and here is that promised openness, in all its geeky glory.
This post is largely not my work, but that of Dan Goetzman, the man with the NAS plan who did all this work. This one post covers literally months of planning, testing, and gathering results. The only changes I have made to Dan's post on our internal Wiki are that I deleted two graphs (because I don't know how to post graphics here) and replaced system names with system type information: anyone reading this is not going to care if we named a Solaris system “Yoda” or “Shuttlebay”; it is still a Solaris system no matter what geek-space name we picked for it. Hey, we're geeks. We admit it.
My deep thanks to Dan for letting me use his work like this. Truth be told, this whole blog would be a much shallower, less technical thing if it was not for his work over the years. He keeps me honest. He gives me ideas, data, time, and outstanding work. I could not ask for more from anyone.
Server - Sun X2200
HW = (3) Sun X2200m2
Data Disk = Apple Xserve RAID, (28) 750GB SATA disks, (4) 3.5 TB RAID LUNs (RAID 5, 6+1)
OS = CentOS 5 with Cluster Suite using GFS filesystems for user data.
Connectathon Version = cthon04
NFS test results
Test 1 - Basic function
iozone test - Pass
locktest - Pass
Test 2 - Full function
| OS | Basic | Basic | Basic | Basic | Lock | Lock | Lock | Lock | CPT | Client | Notes |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Solaris 9 | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Superman | |
| Solaris 8 | Fail(1) | Fail(1) | Pass | Pass | Fail(1) | Fail(1) | Pass | Pass | Pass | Gas | |
| Solaris 7 | | | | | | | | | | | |
| HP-UX 11.00 | Pass(2) | Pass(2) | Pass(2) | Pass(2) | Pass(2) | Pass(2) | Pass(2) | Pass(2) | Pass | Hercules | |
| AIX 5.1.0 | Pass(2) | Pass(2) | Pass(2) | Pass | Pass(2) | Pass(2) | Pass(2) | Pass(2) | Pass | Perfaix02 | |
| Tru64 5.1B | Pass(2) | Pass(2) | Pass(2) | Pass(2) | Fail(3) | Pass(2) | Pass(2) | Pass(2) | Pass | Thing | |
| Linux 2.6.8-1.521 | Pass(2) | Pass(2) | Pass(2) | Pass(2) | Pass(2) | Pass(2) | Pass(2) | Pass(2) | Pass | Smore | |
| Linux 2.4.21-4.EL | Pass(2) | Pass(2) | Pass(2) | Pass(2) | Pass(2) | Pass(2) | Pass(2) | Pass(2) | Pass | Putter | |
Notes:
(1) Permission denied on mounting due to known Solaris NFSV2 with acl problem.
(2) Passed with warnings. Typically client side implementation issues with locking.
(3) Lock test failed to complete due to a coredump.
(4) Lock test failed in "non native 64 bit mode" only
Test 3 - Client platform
Same as "basic function test", run on each client/platform.
Results recorded in CPT column in table above.
Test 4 - Throughput
About 70 MB/s peak sustained write rate measured at the server.
Clients used;
SunFire V440, Solaris 8, 1000 BT, NFSV3_TCP, iozone -i 0 x 3 streams
SunFire v880, Solaris 9, 1000 BT, NFSV3_TCP, iozone -i 0 x 3 streams
HP rx2600, HP-UX 11.23, 1000 BT, NFSV3_TCP, iozone -i 0 x 3 streams
Note: Each client ran 3 iozone streams for a total of 9 streams.
Note: I/O was directed to a single XSR RAID controller.
Test 5 - Burn test
Pass - Multiple clients running a complete "iozone -a" pass concurrently.
Clients used;
Solaris system 1
Solaris system 2
Solaris system 3
Tru64
HP-UX
Note: The NFS UDP clients, Tru64 and HP-UX, were very slow. This is due to the very fast gigabit NFS server and slow 100BT clients.
Test 6 - Basic Tier 2 client test
As it is difficult to find a working compiler for some of the tier 2 clients, cthon04 was not used to test the NFS protocol. Instead a basic confidence NFS test was used.
Verify my $HOME automounts and is accessible
Copy contents of my $HOME to a test area on the NAS
Clients tested;
Dynix/PTX (Sequent)
OpenVMS using VMS/TCPIP
SCO (1)
SINIX/Reliant (2)
OSX
FreeBSD
Notes:
(1) NFSV2 UDP dropped packet retry/timeout problems. Set r/wsize=1K to run tests.
(2) cpio ran to completion, but with errors trying to reset the modification time (-m option to cpio)
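The two-step confidence check above (confirm $HOME is reachable, then copy it to the NAS) can be sketched in a few lines. This is only an illustration in Python; the tier 2 clients were presumably tested with native tools such as cpio, and the function names and the `home_copy` directory name are made up for the example.

```python
import filecmp
import os
import shutil


def confidence_test(home, test_area):
    """Check that `home` is reachable, copy it to `test_area`, then compare."""
    # Step 1: touching the directory is what triggers the automounter.
    if not os.path.isdir(home):
        return False
    os.listdir(home)

    # Step 2: copy the whole tree to a fresh directory on the NAS.
    # "home_copy" is a hypothetical name; it must not already exist.
    dest = os.path.join(test_area, "home_copy")
    shutil.copytree(home, dest, symlinks=True)

    # Walk both trees and report any missing or differing files.
    return _same_tree(filecmp.dircmp(home, dest))


def _same_tree(cmp):
    """Return True when a dircmp result shows no differences at any depth."""
    if cmp.left_only or cmp.right_only or cmp.diff_files or cmp.funny_files:
        return False
    return all(_same_tree(sub) for sub in cmp.subdirs.values())
```

A call might look like `confidence_test(os.path.expanduser("~"), "/nas/testarea")`, where the test area path is a placeholder. The comparison step is an addition to the original procedure, which only describes the copy.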
CIFS test results
Test 1 - Basic function
iozone test - Pass
Test 2 - Client platform
Same as "basic function test", run on each client/platform.
NT 4.0 SP6a - Pass
Windows XP SP2 - Pass
Windows 2000 SP4 - Pass
Windows 2003 Server SP1 - Pass
OSX 10.4.10 - Pass
Test 3 - Throughput
About 90 MB/s peak sustained write rate measured at the server switch port.
Clients used;
Windows Server 2003 SP1, 1000 BT, iozone -t 4 -s 300m -r 32k -i0
Windows Server 2003 SP1, 1000 BT, iozone -t 4 -s 300m -r 32k -i0
Windows Server 2003 SP1, 1000 BT, iozone -t 4 -s 300m -r 32k -i0
Windows Server 2003 SP1, 1000 BT, iozone -t 4 -s 300m -r 32k -i0
Note: Each client ran 4 iozone streams for a total of 16 streams.
Note: I/O was spread across all 4 XSR RAID controllers.
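To put the two peak write rates in perspective, here is a quick back-of-the-envelope calculation. It assumes that "MB" means 10**6 bytes and that all traffic reached the server over a single gigabit link; neither is stated in the results, so treat the percentages as rough.

```python
# Rough per-stream and link-utilization figures for the NFS and CIFS
# throughput tests. Assumptions: 1 MB = 10**6 bytes, one gigabit link.

GIGABIT_LINK_MB_S = 1000 / 8  # 1000 Mbit/s raw line rate = 125 MB/s


def summarize(name, peak_mb_s, clients, streams_per_client):
    """Return stream count, average MB/s per stream, and % of raw gigabit."""
    streams = clients * streams_per_client
    return {
        "test": name,
        "streams": streams,
        "mb_s_per_stream": round(peak_mb_s / streams, 1),
        "pct_of_gigabit": round(100 * peak_mb_s / GIGABIT_LINK_MB_S),
    }


nfs = summarize("NFS", peak_mb_s=70, clients=3, streams_per_client=3)
cifs = summarize("CIFS", peak_mb_s=90, clients=4, streams_per_client=4)

for row in (nfs, cifs):
    print(row)
```

Under those assumptions the NFS run averages a bit under 8 MB/s per stream at about 56% of raw line rate, and the CIFS run averages under 6 MB/s per stream at about 72%. Keep in mind that the NFS test drove a single RAID controller while the CIFS test spread I/O over all four, so the two numbers are not a straight protocol comparison.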
Test 4 - Burn test
Use a select set of clients to run a full iozone -a pass.
iozone -a - Pass
Note: Used the same set of clients used for test #3 above.
High Availability Tests
Test 1 - "Graceful" Shutdown
Node#1 was taken down using a graceful shutdown.
Pass - No file service outage detected
Start "iozone -a" (both UDP and TCP) on NFS clients
shutdown -h now - On node#1
clustat - On a surviving node shows NFS service was relocated to another node OK.
NFSV3-UDP "iozone -a" continues to run OK from an NFS client
NFSV3-TCP "iozone -a" continues to run OK from an NFS client
Power up and boot node#1
clustat - Shows NFS service recovered back to node#1
NFSV3-UDP "iozone -a" continues to run OK from an NFS client
NFSV3-TCP "iozone -a" continues to run OK from an NFS client
Test 2 - Power Cord "yank"
Power was interrupted to node#1 by "yanking" the power cords.
Fail - NFS service fails to recover.
Problem: fence_ipmilan fails because the X2200 LOM card is not available once the power cords are pulled. A second fence method (fence_brocade) is defined, but the fence daemon is not able to query ccsd via ccs_get to obtain the next fence method. This is a bug!
Syslog Messages:
fenced[2881]: agent "fence_ipmilan" reports: Rebooting machine @ IPMI:172.19.176.17...ipmilan: Failed to connect after 30 seconds Failed
ccsd[2844]: process_get: Invalid connection descriptor received.
ccsd[2844]: Error while processing get: Invalid request descriptor
fenced[2881]: fence "rnd-fs01" failed
fenced[2881]: fencing node "rnd-fs01"
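For readers who have not dealt with fencing, the behavior this test expected is simple: try each configured fence method in order and stop at the first one that succeeds. The sketch below illustrates that fallback idea only. It is not how fenced is implemented, and the two agent functions are stand-ins that merely borrow the names from this test.

```python
def fence_node(node, methods):
    """Try each fence method in order; return the name of the first that works.

    `methods` is an ordered list of (name, agent) pairs, where `agent` is a
    callable returning True on success. Raises RuntimeError if all fail.
    """
    for name, agent in methods:
        try:
            if agent(node):
                return name
        except Exception:
            # An agent that errors out counts as a failed method.
            pass
    raise RuntimeError(f"all fence methods failed for {node}")


# Stand-ins for the two agents named in the test; NOT the real agents.
def ipmilan_unreachable(node):
    return False  # LOM card unreachable, as in the power cord test


def brocade_ok(node):
    return True   # pretend the second method succeeded


used = fence_node("rnd-fs01", [("fence_ipmilan", ipmilan_unreachable),
                               ("fence_brocade", brocade_ok)])
print(used)  # prints "fence_brocade"
```

In the failed test the cluster never got as far as the second entry in that list, which is why the NFS service did not recover.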
Test 3 - Network Cable "yank"
Public network cable was "yanked" from node#1 while testing.
Pass - No file service outage detected
Start "iozone -a" on NFS test clients
Yank network cable from node#1
Cluster detects node failure
Cluster fences failed node successfully, using fence_ipmilan
Cluster relocates NFS service to a surviving node
"iozone -a" on the test client continues after a short delay as expected
Reconnect network cable
Node#1 joins the cluster
NFS service remains on node#2
"iozone -a" continues to run OK
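The "short delay" a client sees during failover can be measured rather than eyeballed. The sketch below is one way to do it, not the method used in these tests: it times small synchronous writes and reports the longest one. It assumes a hard NFS mount, where a write blocks until the service is back instead of returning an error. All names and the example path are hypothetical.

```python
import os
import time


def longest_stall(write_once, attempts, interval=0.0, clock=time.monotonic):
    """Call `write_once` repeatedly; return the longest single call in seconds."""
    worst = 0.0
    for _ in range(attempts):
        start = clock()
        write_once()  # on a hard mount this blocks while the service relocates
        worst = max(worst, clock() - start)
        time.sleep(interval)
    return worst


def make_probe(path):
    """Build a probe that appends a timestamp to `path` and forces it to disk."""
    def probe():
        with open(path, "a") as f:
            f.write(f"{time.time()}\n")
            f.flush()
            os.fsync(f.fileno())
    return probe
```

Running something like `longest_stall(make_probe("/mnt/nas/probe.log"), attempts=600, interval=1.0)` on a client while the cable is pulled would give an approximate upper bound on the pause that client experienced.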
Additional Tests
Test - Filesystem "expand on the fly"
Increase the size of a filesystem while running an "iozone -a" test from an NFS client.
Pass - No file service outage detected
Start an "iozone -a" on an NFS client
Create a 100 GB "segment" using the admin GUI
Attach the new segment to the test filesystem using the admin GUI
"iozone -a" on the NFS client was not interrupted
Legato Backup and Restore Testing
Using Legato as the backup server.
Restore Tests
Single file restore - To an alternate path.
Sub tree restore - To an alternate path.
Entire volume restore - To an alternate path.
Single file restore - With an NT ACL defined.
There it is then. A pretty nifty box so far. We have migrated more data to it over the last week, and so far, so good.
I am on vacation in West Texas next week. We do not allow EMI out there, so I doubt I'll have anything new to post here. How much can one say about Linux or Open Source when one is surrounded by high desert and has no computer in reach?
I'm going to defer the posts about storage virtualization and the new mirror process until I have more time to do them justice. I also have a ton of new desktop stuff under way: I have been working with Mint 4.0 Beta, Fedora 8, OpenSUSE 10.3, and Ubuntu 7.10 for a few weeks now, so when I finish up the NAS series of posts I'll jump back in with more about the Enterprise Linux desktop. Hint: It works better all the time.
