Categories
General linux Tool of the Week

Changing Putty’s right-click behaviour

Anyone who uses Putty a lot will inevitably accidentally paste something into their putty window due to Putty’s right-click to paste default behaviour, often resulting in pasting relatively embarassing content like SQL that should never be seen in daylight, or at least a customer’s root password.
However, I found this putty enhancement request today. Note at the bottom:
[code]
Update: as of 2003-11-21, there is a new mouse-handling option whereby the right button brings up a context menu containing a Paste option, rather than pasting directly.
[/code]

And sure enough, putty supports this. It also supports “xterm mode”, which is a more standard middle-click to paste mode. Solved!

Categories
General NSP

Looking up .local DNS names under OSX

My workplace uses a .local DNS suffix for all internal DNS, which of course causes problems when you’re running a system which uses any form of mdns – such as OSX or Ubuntu (or probably any modern Linux distro, I know SuSE had this problem about 6 years ago). The .local lookups fail, because mdns takes over. (Thanks John and Phil for reminding me of this). This shows up as resolution via host or dig working fine, as they make calls direct to your nameservers, but commands like ping failing, as it uses the NSS to do the lookup.

A quick bit of googling, and I found this gem on Apple’s website, and also this one on www.multicastdns.org. Apple’s suggested fix didn’t seem to work, but I suspect a reboot is required. I’ve applied the second one, and rebooted, and one of them is definitely working.

As an aside, this started with me wishing that it was possible to do per-domain resolver configuration. I initially gave up and set up dnsmasq which forward on requests to specific domains to specific servers, but then hit the mdns issue. This method looks very much like a per-domain resolver configuration however – it’s saying to use my local DNS server for .local lookups. I haven’t tested it, but it looks like it should support setting an arbitrary resolver for an arbitrary domain.

Categories
General linux NSP Tool of the Week

OSS Network Imaging / Install services

I’m very interested in the topic of network deployments of operating systems, specifically the various Microsoft OSs, as I can already install linux via PXEboot. There’s two main groups of software in this field – unattended or scripted installs, and imaged installs.

A while ago I found a tool called Unattended, which is a network based unattended installation tool for Windows. If it works, it looks very promising. It’s basically a DOS boot disk which mounts a network share and executes the windows installer. Simplicity. The basic install seems to require you to enter a number of responses to questions (such as administrator password, timezone and Microsoft product key), but the documentation explains how to customise the script to meet your business needs, including examples. Once the OS install is done, Unattended can be configured to install third party packages, as long as the packages (eg, MSI bundles) also support some level of unattended installation procedure.

Today I discovered Free Online Ghost, or FOG. FOG is network based computer imaging tool, designed to both read images from, and write images to hosts on your network. I’ve used tools like partimage in the past for exactly this purpose – creating a golden image of a lab machine and then reimaging the entire lab every couple of months to keep everything clean. FOG seems to be more polished than partimage does, as it claims to support things like creating AD accounts for the machine and so on.

The Unattended documentation includes a concise explanation of why the approach adopted by FOG, partimage, and commercial tools like Acronis and Ghost is bad, however I think this is really a case of using the right tool for the job. I can see a system like FOG being used with great success in a lab environment, or for periodic backup of individual host OSes to near-line storage, providing bare-metal restore functionality without requiring major investment in tape backup expansion. And Unattended makes a lot more sense for initial deployments, especially for my workplace, as we use such a wide range of hardware that an imaged install would be fairly problematic.

There are other commercial systems for doing these deployments of course – IBM Director, HP ICE, Citrix Provisioning Server are just a few of them, but these systems invariably make more sense for in-house deployment control.

Categories
General NSP Xen

Xenserver Enterprise + Fibrechannel storage = live migration

I finally got round to booking time at the IBM demo center to have a fiddle with Xen Enterprise on shared storage. They’ve got a good range of entry-level kit at the demo center, but the important bits (for me anyway) were a Bladecenter H chassis with some HS21 blades fitted with Fibrechannel HBAs, and a DS3400 Fibrechannel storage array.

The IBM DS3000 series of arrays looks really promising. There’s three current variants – SAS, iSCSI and Fibrechannel attached, along with an EXP3000 expansion shelf. All four systems will scale to 48 SAS (14.4 TB) drives over 4 shelves, and an IBM firmware announcement I read online the other day strongly suggests they will support SATA drives in the very near future (36 TB using 750 GB disks). And the DS3000 series are cheap. Performance is pretty good, but you can only stack the controller with 1GB of cache – which really highlights the “entry level” bit of these SANs. The SAS attached option is compelling as well – you get SAN functionality, very close to FC levels of performance (3 Gbps peak HBA throughput for SAS compared to 4Gbps for FC), at a fraction of the cost. The DS3200 allows 6 single-connect or 3 dual-connect hosts, and as soon as IBM get round to releasing a SAS switch, that limit will disappear.

I’d used the DS3400 management software in the past, so setting up a LUN within the existing array took about 5 minutes, including creating the host to WWN mappings; and I had two Xenserver Enterprise installs already set up on a matched pair of blades from a previous attempt at this with the DS3300.

Xenserver Enterprise supports shared storage, but as of version 4.0.1, it only supports NFS or iSCSI shared storage officially. I’d had problems getting the openiscsi software iSCSI stack that Xenserver ships with to communicate successfully with the DS3300 however, and ran out of time. On the other hand, FC shared storage is just not supported at all yet. There’s a forum article explaining Xensource’s position on the support, which also links to a knowledge base article describing how shared storage works in Xenserver. The article was pulled with the release of 4.0.1:

“We took it down because, with the level of testing and integration we could do by initial release of 4.0, we couldn’t be any further along than a partial beta. There are business reasons we couldn’t ship a product as released while describing some of the features as “beta”, and it is hypocritical for us to officially describe a way of using the product yet describe that as “unsupported.” For that reason, until we are ready to release supported shared Fibre Channel SRs, we’re not going to put the KB article back up.”

The article describes the overall setup you have to have in place to have FC shared storage working – namely array, LUN and zoning management on the SAN and HBA, and locating the device node that maps to the corresponding LUN. And then there are two commands to run, on one node of your Xenserver pool, and your shared storage system is up and running.

At this point it took about another 5 minutes to create a Debian Etch domU, and verify that live migration between physical hosts was indeed working.

I set up a slightly more complicated scenario, which I’m planning on using with some other software to demonstrate a HA / DR environment, but which also enabled me to do one of those classic cheesy live migration “but it still works” tests – I connected a Wyse thinclient through to a terminal server domU, migrated the TS between physical hosts, and demonstrated that the terminal session stayed up. A ping left running during this time experienced a brief bump of about 100 ms during the final cutover, but that’s possible attributable forwarding-path updates in the bridge devices on the Xenserver hosts. Either way it works well. Moving a win2k3 terminal server took about 15 seconds on the hardware I was working with.
Overall, I’m quietly encouraged by this functionality and it’s ease of set up. I’m perhaps a bit underwhelmed by it too – it ended up being such a nonevent to get it working, that I’m annoyed I didn’t get round to setting up the demonstration months ago.

Categories
General NSP WLUG

Benchmarks of an Intel SRSC16 RAID controller

One of our clients gave us an Intel server with an Intel SRSC16 SATA RAID controller and a 500 GB +hotspare RAID1 set up on it, to install XenServer Express 4.0.1 system. While building the system up for him, I noticed abysmal write perfomance. It was taking around 29 minutes to install a guest from a template, a process which basically involves creating and formatting a filesystem and unpacking a .tar.bz2 file into it. Inspection of the system revealed that the controller lacked a battery backup unit (BBU), and thus the writeback cache was disabled. Also, the firmware on the controller disabled the on-disk cache as well, and the controller listed disk access speed at 1.5Gbps, which I’m presuming means it was operating in SATA-1 mode, so no NCQ either. The controller has 64MB of cache.

I persuaded the customer to buy the BBU for the system, and then ran some quick bonnie++ benchmarks, which I know aren’t the best benchmark in the world, but show a good indication of relative performance gains. Results are as follows:

Note: I didn’t do the tests right either – not specifying a number of blocks of files to stat results in those tests completing too soon for bonnie to come up with an answer. So, the output below only really shows throughput tests, as the sequential create/random create tests all completed too soon. Changing the disk cache settings requires a reboot into the BIOS mode configuration tool, so I’ve avoided doing this too many times. Changing the controller cache settings can be done on the fly.

[code]

RAID controller writeback cache disabled, disk cache disabled:

Version  1.03      ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
localhost.loca 512M  1947   4  2242   0  1113   0 10952  18 36654   0 169.7   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
localhost.locald 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ ++

RAID controller writeback cache enabled, disk cache disabled:

Version  1.03      ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
localhost.locald 512M  7938  19  9195   1  4401   0 28823  50 41961   0 227.0   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
localhost.locald 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++

RAID controller writeback cache disabled, disk cache enabled:

Version  1.03      ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
localhost.loca 512M 19861  47 17094   1  9870   0 28484  47 41167   0 243.8   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
localhost.loca  16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++

RAID controller writeback cache enabled, disk cache enabled:

Version  1.03      ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
localhost.locald 512M 38633  95 40436   4 15547   0 32045  54 42946   0 261.4   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
localhost.locald  16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
[/code]

Enabling the only the controller write-back cache (64MB in this case) roughly quadrupled the write throughput in all cases. Enabling only the disk cache provided nearly 8 times the performance on it’s own. And enabling both together increased write throughput by about a factor of 20.   I suspect the tests weren’t large enough to actually tax the cache systems on the disk or controller however, as I was running them in a Xen domU with only 256 MB of ram, and actually just wanted some quick results.

I know they aren’t really representative of anything, but here’s a test that is semi-representative: Installing another copy of the Xen domU via a template took 2minutes 55 seconds with disk cache enabled, and 2 minutes 30 seconds with disk cache and controller cache enabled (I didn’t test this with just controller cache enabled as that would have required a reboot and manual intervention, and I wasn’t onsite at that point).  Prior to enabling the disk cache and controller cache, this was taking nearly 30 minutes.

While the above shows that a combination of the controller write-back cache and the disk cache shows the best improvement, merely enabling the disk cache on it’s own had the biggest single effect. Of course, the disk cache isn’t backed up by a battery, so there’s the risk of losing the data that is in the disk cache at the time. The Intel documentation for the controller implied that this is limited to the sector that is being written at the point of powerfailure.

When I get some free time and a SCSI or SAS server, I’ll do some similar benchmarks up for that.

Categories
General

Touchgraph Google Browser

Saw the TouchGraph Google Browser mentioned on an irc channel today. Touchgraph make graph based visualisation, showing spatial relationships and correlations between data. The google browser uses results from a google search as the input; Touchgraph have Facebook and Amazon visualisations as well, but I guess their main business is visualising data warehouses and the like.

Categories
General NSP WLUG

Debian Etch and apt-proxy issues

Debian Etch (4.0) was released on Monday, and I have to say I wasn’t at all prepared. I’ve got about 70 machines that will probably need to be upgraded to Etch at some point in the near future. I could leave some of them running sarge, but I’ll definitely have to upgrade most of these servers.

We use an apt-proxy internally, to improve apt performance. It works well, aside from a couple of bugs that cause it to lock up every now and then. While running some upgrades on out of the way servers today, I discovered that the version of apt in sarge really doesn’t play very nicely with an etch repository served by apt-proxy running on an etch server. It seems that Ubuntu is fine, and trying to update a sarge server via an apt-proxy running on an etch server is ok too.

Once the etch client has been upgraded, the etch apt-proxy works fine. So, looks like a key issue. The version of apt in sarge doesn’t have the archive security stuff in it, and has no way of checking whether the keys are intact – BUT, it still seems to care, and will timeout and eventually fail.

It turns out that installing a copy of apt from the sarge backports solves this. You’ll also need the gnupg package, but the one from sarge is OK
[code]
wget http://backports.org/debian/pool/main/a/apt/apt_0.6.46.4-0.1~bpo.1_i386.deb
wget http://backports.org/debian/pool/main/d/debian-archive-keyring/debian-archive-keyring_2006.11.22~bpo.2_all.deb
dpkg -i *.deb
apt-get update
[/code]

Categories
advocacy General NSP WLUG

Puppet – a system configuration tool

I saw a couple of blog posts about puppet recently. I’ve been meaning to investigate cfengine for a while now, and puppet was a new angle on the same problem. From the intro:
Puppet is a system configuration tool. It has a library for managing the system, a language for specifying the configuration you want, and a set of clients and servers for communicating the configuration and other information.

The library is entirely responsible for all action, and the language is entirely responsible for expressing configuration choices. Everything is developed so that the language operations can take place centrally on a single server (or bank of servers), and all library operations will take place on each individual client. Thus, there is a clear demarcation between language operations and library operations, as this document will mention.

It’s very new still, and is under active development. It seems to have been designed with fixing some of the hassles of cfengine in mind. It is written in ruby and has a reasonably powerful config language, and you can use embedded ruby templates for dynamically building up content to deploy. I have no particular preference for ruby – in fact, this is the first time I’ve used the language. Configuration is stored in a manifest on the puppetmaster server, and is based on the notions of classes and nodes. A node can inherit from multiple classes, or can merely include a specific class if certain criteria are met. Subclasses can override specific details of a parent class.
It makes use of a library called facter (also written by reductive labs), to pull information ‘facts’ from the client hosts, and these can be used in the manifests to control configuration. For example, it will work out the linux distribution you are running and store this in a variable, and you can use this to determine which classes to run.  It is fairly easy to extend facter to support additional facts – so I added support for working out the Debian and Ubuntu release number and codename – eg, 3.1 and sarge, or 6.10 and edgy.
There is a dependancy system in place, so that you can specify a rule to ensure that a service is running, which depends on the package being installed. If you use puppet to manage the config file for the service, you can set a subscription on the file for the service, so that if a change to that file is pushed out via puppet, it will restart the server for you as well.

Installing packages is handled well, with the option for seeding debconf if appropriate. Puppet understands several package management formats, including apt, rpm and yum.
I’m by no means an expert with cfengine, but this feels a lot nicer to use. After my initial testing, I see no reason so far to not deploy this at work. I’ll test try a test deployment on some systems, and if that works out I’ll push it the whole way out.

Categories
General WLUG

Backspace in Firefox 2

As part of my upgrade to Edgy the other day, Firefox was upgraded to 2.0. It’s been upgraded every day since then, and is I think finally running a real 2.0 build

[code]

$ apt-cache show firefox | grep Version
Version: 2.0+0dfsg-0ubuntu3
[/code]

The biggest interface changes I’ve noticed to Firefox 2 so far include some cosmetic changes to the tab panel layout, which I’m mostly used to, and the ‘backspace’ button now no longer steps backwards in your history.

This behaviour is controllable via about:config however. Setting the following will revert to the old behaviour.
[code]

browser.backspace_action = 0
[/code]

Categories
General WLUG

Edgy Eft RC1 announced

After seeing this announcement for the Edgy Eft RC1 release, I decided to upgrade my Dapper laptop to Edgy. Thanks to the NZ mirror already being up to date, it didn’t take long to download the 700MB of packages that I needed.

I’d like to say the upgrade went smoothly, but it didn’t. Part of that is my own fault – I accidentally used apt-get instead of aptitude to handle the upgrade, and so a lot of packages were missed, and some dependency resolution was fumbled which meant the upgrade process broke hard along the way.

After manually removing a bunch of packages then getting the upgrade to restart, then repeating “aptitude dist-upgrade” about 6 times after it thought it’d finished, each time installing a couple of new packages, and then finally rebooting one more time because I couldn’t get X to start again, it all looked good.

Except that when I logged in, GNOME didn’t appear to start. I killed X and added a new user, then logged in as them – worked fine. Tried my user – no go. I spent a long time trying to move various GNOME configs out of the way, and eventually resorted to creating a new blank homedir for myself – still wouldn’t work. So I rebooted one more time and it started working after that. Very strange.

I’d suggest waiting for the final release to upgrade, but if you do go ahead, make absolutely sure you use aptitude and not apt-get. It may also work better if you use the CD and boot into an upgrade mode, I can’t comment.

I would file a bug, but I’m not sure it’ll help. I can’t pin down what was wrong because I used the wrong tool to upgrade. I have a Dapper install on my desktop at home, and I’ll try upgrading that next week when I get some free time, however it’ll probably “just work” by then anyway.

New things noticed in Edgy Eft so far:

  • Firefox 2
  • Network-manager-applet has a dialup account plugin.

Yeah. It looks the same. Edgy does have new features under the hood, but I haven’t looked into those yet.

Update: Yeah, it’s called Edgy Eft, not Efty Edge.