
Xenserver Enterprise + Fibrechannel storage = live migration

I finally got round to booking time at the IBM demo center to have a fiddle with Xen Enterprise on shared storage. They’ve got a good range of entry-level kit at the demo center, but the important bits (for me anyway) were a Bladecenter H chassis with some HS21 blades fitted with Fibrechannel HBAs, and a DS3400 Fibrechannel storage array.

The IBM DS3000 series of arrays looks really promising. There are three current variants – SAS, iSCSI and Fibrechannel attached – along with an EXP3000 expansion shelf. All three controller variants will scale to 48 SAS drives (14.4 TB) across 4 shelves, and an IBM firmware announcement I read online the other day strongly suggests they will support SATA drives in the very near future (36 TB using 750 GB disks). And the DS3000 series is cheap. Performance is pretty good, but the controller maxes out at 1 GB of cache – which really highlights the “entry level” bit of these SANs. The SAS attached option is compelling as well – you get SAN functionality at very close to FC levels of performance (3 Gbps peak HBA throughput for SAS compared to 4 Gbps for FC), at a fraction of the cost. The DS3200 allows 6 single-connect or 3 dual-connect hosts, and as soon as IBM gets round to releasing a SAS switch, that limit will disappear.

I’d used the DS3400 management software in the past, so setting up a LUN within the existing array took about 5 minutes, including creating the host-to-WWN mappings; and I had two Xenserver Enterprise installs already set up on a matched pair of blades from a previous attempt at this with the DS3300.

Xenserver Enterprise supports shared storage, but as of version 4.0.1 only NFS and iSCSI shared storage are officially supported. I’d had problems getting the open-iscsi software iSCSI stack that Xenserver ships with to communicate successfully with the DS3300, however, and ran out of time on that front. FC shared storage, meanwhile, isn’t officially supported at all yet. There’s a forum article explaining Xensource’s position on support, which also links to a knowledge base article describing how shared storage works in Xenserver. The article was pulled with the release of 4.0.1:

“We took it down because, with the level of testing and integration we could do by initial release of 4.0, we couldn’t be any further along than a partial beta. There are business reasons we couldn’t ship a product as released while describing some of the features as “beta”, and it is hypocritical for us to officially describe a way of using the product yet describe that as “unsupported.” For that reason, until we are ready to release supported shared Fibre Channel SRs, we’re not going to put the KB article back up.”

The article describes the overall setup you need to have in place for FC shared storage to work – namely array, LUN and zoning management on the SAN and HBA, and locating the device node that maps to the corresponding LUN. After that, there are just two commands to run on one node of your Xenserver pool, and your shared storage is up and running.
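
The pulled KB article has the exact commands, so take this purely as an illustration of the general shape rather than the official procedure – creating a shared SR on a block device with the xe CLI looks something like this (the device node, labels and UUIDs below are placeholders, and the SR type the article specifies may differ):

[code]
# Placeholders throughout: substitute your own host UUID, device node and labels.
# Create a shared LVM SR on the FC LUN from one member of the pool:
xe sr-create host-uuid=<host-uuid> name-label="FC shared storage" \
    shared=true content-type=user type=lvm device-config:device=/dev/sdb

# Optionally make it the pool-wide default SR, using the UUID returned above:
xe pool-param-set uuid=<pool-uuid> default-SR=<sr-uuid>
[/code]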

At this point it took about another 5 minutes to create a Debian Etch domU, and verify that live migration between physical hosts was indeed working.
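
For reference, the domU install and the migration itself are both plain xe CLI operations – roughly along these lines, with the template, VM and host names below being placeholders:

[code]
# Install a guest from the built-in Debian Etch template and start it
xe vm-install template="Debian Etch 4.0" new-name-label=etch-test
xe vm-start vm=etch-test

# Live-migrate it to the other blade in the pool
xe vm-migrate vm=etch-test host=<other-host> live=true
[/code]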

I set up a slightly more complicated scenario, which I’m planning to use with some other software to demonstrate an HA / DR environment, but which also let me do one of those classic cheesy live migration “but it still works” tests – I connected a Wyse thin client to a terminal server domU, migrated the TS between physical hosts, and demonstrated that the terminal session stayed up. A ping left running during this time saw a brief bump of about 100 ms during the final cutover, but that’s possibly attributable to forwarding-path updates in the bridge devices on the Xenserver hosts. Either way, it works well. Moving a win2k3 terminal server took about 15 seconds on the hardware I was working with.

Overall, I’m quietly encouraged by this functionality and its ease of setup. I’m perhaps a bit underwhelmed by it too – it ended up being such a non-event to get working that I’m annoyed I didn’t get round to setting up the demonstration months ago.


Benchmarks of an Intel SRSC16 RAID controller

One of our clients gave us an Intel server with an Intel SRSC16 SATA RAID controller and a 500 GB RAID 1 (plus hot spare) set up on it, to have XenServer Express 4.0.1 installed. While building the system up for him, I noticed abysmal write performance: it was taking around 29 minutes to install a guest from a template, a process which basically involves creating and formatting a filesystem and unpacking a .tar.bz2 file into it. Inspection of the system revealed that the controller lacked a battery backup unit (BBU), and thus the write-back cache was disabled. The firmware on the controller had disabled the on-disk cache as well, and the controller listed disk access speed at 1.5 Gbps, which I’m presuming means it was operating in SATA-1 mode, so no NCQ either. The controller has 64 MB of cache.

I persuaded the customer to buy the BBU for the system, and then ran some quick bonnie++ benchmarks – not the best benchmark in the world, I know, but a good indication of relative performance gains. Results are as follows:

Note: I didn’t get the tests quite right either – without specifying a size for the files in the create tests, those tests complete too quickly for bonnie++ to report a figure. So the output below only really shows the throughput tests, as the sequential create / random create tests all finished too soon. Changing the disk cache settings requires a reboot into the BIOS configuration tool, so I avoided doing that more often than necessary. Changing the controller cache settings can be done on the fly.
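
For what it’s worth, an invocation that matches the output below (512 MB of I/O, 16×1024 files for the create tests) looks something like the first command; the second shows the -n size arguments that would have let the create tests run long enough to report figures. The mount point is just a placeholder:

[code]
# Roughly what was run: 512MB throughput tests, 16*1024 zero-byte files
# for the create tests (which finish too quickly to be timed)
bonnie++ -d /mnt/test -s 512 -n 16 -u root

# Giving the create-test files a size range (64KB max, 1KB min) makes
# those tests run long enough for bonnie++ to report numbers
bonnie++ -d /mnt/test -s 512 -n 16:65536:1024 -u root
[/code]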

[code]

RAID controller writeback cache disabled, disk cache disabled:

Version  1.03      ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
localhost.loca 512M  1947   4  2242   0  1113   0 10952  18 36654   0 169.7   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
localhost.locald 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++

RAID controller writeback cache enabled, disk cache disabled:

Version  1.03      ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
localhost.locald 512M  7938  19  9195   1  4401   0 28823  50 41961   0 227.0   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
localhost.locald 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++

RAID controller writeback cache disabled, disk cache enabled:

Version  1.03      ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
localhost.loca 512M 19861  47 17094   1  9870   0 28484  47 41167   0 243.8   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
localhost.loca  16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++

RAID controller writeback cache enabled, disk cache enabled:

Version  1.03      ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
localhost.locald 512M 38633  95 40436   4 15547   0 32045  54 42946   0 261.4   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
localhost.locald  16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
[/code]

Enabling only the controller write-back cache (64 MB in this case) roughly quadrupled write throughput in all cases. Enabling only the disk cache provided nearly 8 times the performance on its own. And enabling both together increased block write throughput by roughly a factor of 18. I suspect the tests weren’t large enough to really tax the caches on the disk or controller, however, as I was running them in a Xen domU with only 256 MB of RAM, and I really just wanted some quick results.

I know these aren’t really representative of anything, but here’s a test that is semi-representative: installing another copy of the Xen domU from a template took 2 minutes 55 seconds with the disk cache enabled, and 2 minutes 30 seconds with both the disk cache and controller cache enabled (I didn’t test with just the controller cache enabled, as that would have required a reboot and manual intervention, and I wasn’t onsite at that point). Prior to enabling the disk and controller caches, this was taking nearly 30 minutes.

While the above shows that the combination of the controller write-back cache and the disk cache gives the best improvement, merely enabling the disk cache on its own had the biggest single effect. Of course, the disk cache isn’t backed by a battery, so there’s a risk of losing whatever data is in the disk cache at the time of a failure. The Intel documentation for the controller implied that this is limited to the sector being written at the point of power failure.

When I get some free time and a SCSI or SAS server, I’ll do some similar benchmarks for that.