Slow SSH connections with GSSAPIAuthentication enabled

I hit an issue at work where SSH connections to some servers would stall for around 80 seconds. This seems like it only started happening recently, and it wasn’t happening for any of the systems I use on a day-to-day basis, but some of our developers were seeing the problem and determined that disabling GSSAPIAuthentication “fixed” the problem.

We predominantly use Ubuntu on the desktop, and CentOS derivatives on the servers. However, this stalled connection didn’t occur for all servers. It also didn’t happen when we made an ssh connection between CentOS servers, just when going from Ubuntu to CentOS. Doing some googling showed a bunch of blog posts and bug reports about this very issue, and the first ones I looked through all said the same thing – the fix was to disable GSSAPIAuthentication.

I wasn’t happy with that as a fix, simply because it didn’t sound relevant. We don’t use GSSAPIAuthentication at all; it’s on by default in the CentOS sshd configuration files, and it’s on by default in Ubuntu and Debian ssh configuration files, but it had never been a noticeable problem until recently. We hadn’t changed anything in our server setups. It was possible we’d changed something in our desktops (we upgrade Ubuntu every now and then, for example), but some fairly old installs were exhibiting the same problems to the same servers.

It felt like a DNS issue, but all the servers had a DNS record. Then I found a bug report that suggested it was related to Avahi, the mDNS daemon. Sure enough, disabling avahi on my client meant all ssh connections were fast. Then, after far too much incidental messing round, the penny dropped. The GSSAPIAuthentication was triggering a reverse DNS lookup on the host being connected to. If this existed in DNS, the connection was fine. If it didn’t, it was doing a reverse DNS lookup via mDNS, which ended up stalling for a long time.

Of course, now that I know this is the case, I can find any number of blog posts and bug reports that spell it out. I’m writing this one just to add some signal to the noise:

If your SSH connection is slow and looks like it’s stalling inside GSSAPI, check that you have valid forward (A) and reverse (PTR) DNS records for the host you are connecting to!. Disabling GSSAPIAuthentication will work, but it hides the real problem, which (in this case), is that for whatever reason your system is falling back to an mDNS lookup and is failing. Disabling GSSAPIAuthentication is simply hiding the symptom – you either need to fix your DNS setup, or tell your host to stop using mDNS.

And now I’m reminded of one of the best-titled blogs I read: Everything is a Freaking DNS Problem.

(This is an old post that I wrote in 2011, but never posted for some reason…)

Entropy and Monitoring Systems

I use munin for monitoring various aspects of my servers, and one of the things munin will monitor for me the amount of entropy available. On both my current server and my previous one I’ve noticed something unusual here:

According to munin, I’m almost perpetually running out of entropy. Munin monitors the available entropy by chekcing the value of /proc/sys/kernel/random/entropy_avail, which is the standard way you’d check it. My machine has several VMs running, and hosts a few services that use entropy at various times (imaps, ssmtp or smtp+tls, ssh, https), so it’s not unreasonable that I may have been entropy starved. If my entropy levels are always around the 160 mark, it’s likely that at any given time I’m totally starved of entropy, so anything using encryption will stall a bit.

I had a brief look into various entropy sources, such as timer_entropyd or haveged, but none of them seemed to help. I’d seen several references to Simtec’s entropykey, which looked very promising, so I ordered one from the UK, which arrived a week or so ago.

I’ve yet to arrange a trip to the datacentre to install it however, and after a bit of poking round today I’m not so sure it’s as desperately needed as I thought

I randomly checked on the contents of /proc/sys/kernel/random/entropy_avail, just to see what it was like. There were over 3000 bits of entropy present. Very odd. I repeated this several times, and watched the available entropy decrease from over 3000 down to around 150 or so, the same as in my munin graph above. I repeated this about a quarter of an hour later, with the same results – over 3000 entropy, rapidly decreasing to very little.

After a bit of further digging, I found this blog post, which mentioned that creating a process uses a small amount of entropy. The author of that post was seeing problems with his entropy pool not staying full, which sounds like what I was seeing. I’m still not clear on what requires entropy though, as some of my systems at work clearly don’t deplete the entropy pool during process creation.

So, I did some different monitoring: Check the value of entropy_avail every minute, through a different script. The graph below shows the results:

Clearly, entropy is normally very good, but is dropping down to very low levels every 5 minutes. It replenishes just fine in the intervening 5 minutes however, which suggests that I don’t really have a problem with entropy creation, just with using it too quickly.

As for the question, “why is my entropy running out so fast?”, the answer is quite simple: Munin. On my host machine, munin runs around 50 plugins, each of which generally calls other processes such as grep, awk, sed, tr, etc. I don’t have exact figures on how many processes were being kicked off every 5 minutes, but I wouldn’t be surprised to find it was hundreds, all of which used a little bit of entropy

I’ll still install the EntropyKey, and maybe it’ll help my pool recover quicker.

Creating a DOS USB bootdisk under linux

Every now and then I need a DOS bootdisk to flash a BIOS or similar, and I only have linux with which to create it. I can never remember the quickest way to do this, so I’m documenting it here:

Lifted entirely from this webpage. I’m only archiving it here because content disappears over time.

I needed to upgrade the bios of my Computer (Intel).

But how to do it without windows?

In my case, Intel has many options for bios upgrading and one is the plain old DOS method. This is the best and fastest way to upgrade your bios with linux.
Create a FreeDOS based bootable usb-stick

* Download a FreeDOS image, i’ll use Balder for now.
* Prepare the usb-stick
o check partition (e.g cfdisk /dev/sda)
o mkfs.msdos /dev/sda1

Commands

qemu -boot a -fda balder10.img -hda /dev/sda
A:\> sys c:
A:\> xcopy /E /N a: c:

Check with

qemu -hda /dev/sda

There are, of course, many ways to do this. With recent VirtualBox versions supporting USB passthrough, I could do it entirely from a windows VM. Several other websites suggest installing grub onto the USB disk and having it boot a floppy disk image directly, which also seems like it would work. Your FAT-formatted USB drive would appear as C:, and you can just copy whatever content you like straight onto that.

Changing Putty’s right-click behaviour

Anyone who uses Putty a lot will inevitably accidentally paste something into their putty window due to Putty’s right-click to paste default behaviour, often resulting in pasting relatively embarassing content like SQL that should never be seen in daylight, or at least a customer’s root password.
However, I found this putty enhancement request today. Note at the bottom:

Update: as of 2003-11-21, there is a new mouse-handling option whereby the right button brings up a context menu containing a Paste option, rather than pasting directly.

And sure enough, putty supports this. It also supports “xterm mode”, which is a more standard middle-click to paste mode. Solved!

QOS and IP Accounting with BGP under linux

At NSP we’ve go a fibre connection into the building, and a 10MBit feed from our ISP, and over that we’re allowed 10MBit of national and 3 Mbit PIR of international traffic. Note that this adds up to more than 10Mbit in total! This can cause annoying problems, like someone doing a lot of national or APE traffic at 10MBit, and closing out real international traffic. For a long time I’ve wanted to separate this out, but have not had the time to look into it

This week I finally organised a BGP from my ISP, and had a look at what my options were. I’d seen the Route-based QOS mini-HOWTO a while back, and it looked like it would work ok, but had a few problems. There’s no current way it to apply tc or iptables rules selectively based on a routing decision, or even on a route table. You can match on a route realm, however. The mini-HOWTO suggests copying your BGP routes into a separate table and into a realm at the same time, and then using tc and iptable’s realm matching code.

A quick aside: route realms are best described as a collection of routes. The decision as to which realm a route is placed is made by the local administrator, and each realm can contain routes from a mix of origins. Realms are used to allow administrators to perform bulk operations on large groups of routes in an easy manner. From the iproute command reference:

The main application of realms is the TC route classifier [7], where they are used to help assign packets to traffic classes, to account, police and schedule them according to this classification.

After a bit of digging, I found a link to a patch for quagga to provide route realms support. It’s even still maintained! After a bit of battling with autotools[1], and a bit of battling with linux capabilities[2], I had it up and running.

The route realms patch page covered off the BGP configuration I needed, and now I have a set of iptables counters for national, international and total traffic (for completeness). The only bit it doesn’t cover off is graphing, but we already have a set of perl scripts which pull information from interface totals or iptables FWMARK counters, so I modified that to pull from these counters as well, and set up RRD graphs. I was previously graphing interface totals out the external nic, and it’s interesting to note that the iptables “total” traffic, while adding up to the sum of national and international, does not correspond to the interface totals.

It’s worth pointing out that, as seen in iproute command reference, the rtacct tool will grab realm counts for you without needing iptables, so if you just want to something to graph things quickly, rtacct might do the job:

#kernel
Realm BytesTo PktsTo BytesFrom PktsFrom
BPSTo PPSTo BPSFrom PPSFrom
unknown 5949K 57188 15839K 61776
0 0 0 0
national 15839K 61776 5949K 57188
0 0 0 0

rtacct has a naive limit of 256 realms however, where as the actual implementation supports a 16 bit number, so if you have a large number of realms, or you autoclassify your inbound BGP into realms based on the AS number, you will have to use iptables only

I’m currently only accounting for traffic using this mechanism, but I can also do QOS on it – tc will match directly on realm tags, and any iptables based match systems you may have can be adapted to match on a realm as well.

[1] The realms patch touched configure.ac, which then required the autotools chain to rebuild everything, but it needed a very particular combination of autoconf and automake. Because it took me an hour or so to get this right, I’ll record it here:

patch -p1 < ../quagga-0.99.5-realms.diff
aclocal-1.7
autoheader
autoconf
autoconf2.50
libtoolize -c
automake-1.7 –gnu –add-missing –copy
./configure –enable-realms –enable-user=quagga –enable-group=quagga –enable-vty-group=quaggavty –enable-vtysh –localstatedir=/var/run/quagga –enable-configfile-mask=0640 –enable-logfile-mask=0640

autoheader and autoconf above are version 2.13. I have no idea why I had to run autoconf2.13 then autoconf2.50, but it seems that this actually worked.

[2] I initially tried building against quagga-0.98.6, because the quaggarealms patch site implied this was the “stable” verson, but it seems that quagga drops priviledges too soon. This works out fine if you have “capabilities” support in your kernel, which mine didn’t. They’ve changed this behaviour in 0.99.5, and incidentally this is the version in debian etch.

Exporting Tape Autoloaders via iSCSI

A while ago I posted about {{post id=”iscsi-for-scsi-device-passthrough-under-xen-enterprise” text=”exporting a tape drive via iSCSI”}} to enable windows VMs to backup to a SCSI tape drive under Citrix Xenserver. I spent a couple of hours googling for whether or not you could do the same thing with a tape autoloader, and didn’t find a lot of useful information.

So, I just dived in and tried it, and it turns out exactly the same process works fine for exporting a tape autoloader via iSCSI as well, as long as you are slightly careful about your configuration file.

First of all, find your HCIL numbers with lsscsi:

[4:0:0:0] tape HP Ultrium 4-SCSI U24W /dev/st0
[4:0:0:1] mediumx HP 1×8 G2 AUTOLDR 1.70 –

So, we’ve got an HP Ultrium 4 tape drive on 4:0:0:0, and a 1×8 G2 Autoloader on 4:0:0:1. Let’s configure IETd:

Target iqn.2007-04.com.example:changer0
Lun 0 H=4,C=0,I=0,L=0,Type=rawio
Type 1
InitialR2T No
ImmediateData Yes
xMaxRecvDataSegmentLength 262144

Lun 1 H=4,C=0,I=0,L=1,Type=rawio
Type 1

A couple of points to note:

  • I’ve named it changer0, you don’t have to
  • You do have to make sure both the tape drive device(s) (in this case, 4:0:0:0) and the changer device (4:0:0:1) are exported as different LUNs under the same target
  • The other options (InitialR2T, ImmediateData etc) may or may not work for you, consult the IETd documentation for what you actually need and want.

Once you’ve restarted the iscsi target, you can load up an initiator and connect to it, and you should see both devices being exported under the one target. If you accidentally use a different target for the changer and the tape drive, you’ll find that your backup software probably can see the changer device, but will tell you there are not available drives.

Looking up .local DNS names under OSX

My workplace uses a .local DNS suffix for all internal DNS, which of course causes problems when you’re running a system which uses any form of mdns – such as OSX or Ubuntu (or probably any modern Linux distro, I know SuSE had this problem about 6 years ago). The .local lookups fail, because mdns takes over. (Thanks John and Phil for reminding me of this). This shows up as resolution via host or dig working fine, as they make calls direct to your nameservers, but commands like ping failing, as it uses the NSS to do the lookup.

A quick bit of googling, and I found this gem on Apple’s website, and also this one on www.multicastdns.org. Apple’s suggested fix didn’t seem to work, but I suspect a reboot is required. I’ve applied the second one, and rebooted, and one of them is definitely working.

As an aside, this started with me wishing that it was possible to do per-domain resolver configuration. I initially gave up and set up dnsmasq which forward on requests to specific domains to specific servers, but then hit the mdns issue. This method looks very much like a per-domain resolver configuration however – it’s saying to use my local DNS server for .local lookups. I haven’t tested it, but it looks like it should support setting an arbitrary resolver for an arbitrary domain.

Why I love dmidecode

I was asked to provide more ram for a server today, specified only by name. I have login details, but it’s in a datacentre in Auckland, and I’m in Hamilton, so I can’t wander over to check details.

Enter dmidecode:


System Information
Manufacturer: Dell Computer Corporation
Product Name: PowerEdge 860

That’s basically all I need right there. Having a namebrand machine helps, of course – getting the same sort of information from a generic motherboard isn’t as easy or useful. However, while checking which ram banks are populated I can also (typically) get the type of ram as well:

Handle 0x1100, DMI type 17, 27 bytes
Memory Device
Array Handle: 0x1000
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 1024 MB
Form Factor: DIMM
Set: 1
Locator: DIMM1_A
Bank Locator: Not Specified
Type: DDR2
Type Detail: Synchronous
Speed: 533 MHz (1.9 ns)
Manufacturer: 7F7F7F0B00000000
Serial Number: 7A947291
Asset Tag: 0D0718
Part Number: NT1GT72U8PB0BY-37B

In this case, the other “Memory Device” entries had “No module installed” in the Size: section, so I know that this machine has one (1) 1GB DDR2-533 DIMM installed.

Of course, that output doesn’t seem to tell me that the Dell PowerEdge 860 wants ECC ram (although I know that anyway). And the output from dmidecode on a newer machine:

Handle 0x1100, DMI type 17, 23 bytes.
Memory Device
Array Handle: 0x1000
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 2048 MB
Form Factor:
Set: 1
Locator: DIMM 1A
Bank Locator: Not Specified
Type:

Type Detail: Synchronous
Speed: 667 MHz (1.5 ns)

That’s from a brand new HP DL360 with FB-DIMMs, so I guess my version of dmidecode on this machine isn’t new enough to handle that.

In general though, it’s more than good enough :)

Further bugs in switches

Further to my {{post id=”opennms-and-bugg-switches” text=”previous post”}} on the topic of uncovering bugs in switches with network monitoring systems, I have two more bits of news:

The first is that the vendor has given me a new firmware for one of the switch models which fixes the bug. Apparently it was a known but undocumented bug. Still waiting on a firmware for the other switches.

The second is that I have found another bug in the switches – this time with HTTPS. It’s not triggering during autodiscovery this time, and the bug takes a bit longer to manifest (2-4 weeks, it seems), so it’s slightly harder to track down. I’ve got to set up a test rig to hammer some of the switches with HTTPS connection attempts and see what shakes loose.

OpenNMS and buggy switches

One of my evening projects has been setting up OpenNMS to monitor a network primarily comprised of VENDORNAME switches. OpenNMS is being put in to replace a bundle of Nagios, Cacti, Smokeping, and Groundwork Fruity for Nagios configuration management. The existing system worked well enough, but the lack of autodiscovery of services/nodes along with the poor integration between cacti and nagios was getting a bit annoying.

After setting up and trialling OpenNMS for a bit, we deployed it on this network. And then the switches started failing. They’d still switch packets, and I believe still responded to SNMP, but you couldn’t connect to them via any of the management interfaces.

So, we started looking at the differences between OpenNMS and Nagios/Cacti/Smokeping. Both do SNMP and ICMP queries, and some TCP port availability checks. The combined stack actually does more SNMP traffic because both Cacti and Nagios ended up querying the same OIDs. I’ve often noticed that Cacti sends individual requests for OIDs however, rather than grouping them, whereas OpenNMS defaults to requesting 10 OIDs per PDU. I changed this in the configuration (and later on changed it for real, as it was being set in a different config file as well), and let OpenNMS run against some test switches… and they locked up.

Perry suggested that it could be a memory leak due to the service polling, and set up a test where he polled the SSH server once a minute forever. This test got cancelled after 4 days or so, but the machines hadn’t died at that point, so we decided it wasn’t anything fundamental about the service checks.

I set up a range of services that were being monitored on 10 switches, and let them go for a bit. Due to power outages and equipment moves this step ended up taking longer than it needed to, but the end result was that no matter which services were being monitored, all the switches all locked up at around the same point.

And then I noticed that the switches had a growing number of stale “telnet-d” connections. These switches have capacity for up to 4 concurrent administrative logins – once all 4 slots are full, you can no longer log in. So, the theory is these stale connections were blocking real connections, and never timing out, thus causing the lockout of the management stack. They don’t time out, and you can’t kill them from the switch console short of rebooting the switch. Most of the switches weren’t being actively monitered for telnet, but OpenNMS does do service discovery periodically (I think once a day, and perhaps under other situations too), and this would probe each service. So I firewalled telnet out, and had the switches restarted, thinking this would solve it.

The switches still locked up.

The switches still had stale telnet connections appearing in them.

I turned off the telnet service on each switch, thinking that perhaps there was something else on the network that was talking to them, and restarted them.

Within 5 minutes of rebooting each switch, there was a stale telnet connection listed. Awesome.

So, we’re down to a service that is being misreported as a telnet service. I go through all of them, and discover that none of the other services – FTP, HTTP, HTTPS – even show up as an active session. Which leaves telnet – firewalled out – and SSH.

The OpenNMS plugin which handles discovery of SSH servers is a bit smarter than a basic “is a service listening on port 22” sort of discovery – it waits for the SSH banner from the server, then sends it’s own SSH banner back, and verifies that it gets a response back. This means it’s getting part way through the SSH establishment, and then canning the connection.

As a quick test, I telnetted to port 22 on a switch and checked the login listing. With the banner is being displayed, nothing even shows up. When I pasted a valid looking SSH banner back, I got a bunch of binary data echoed into my telnet session, and ssh session to the switch locks up. On reconnecting and checking the login listing, sure enough – a stale telnet session was there.

Further tests reveal that if you ssh to one of these switches, but don’t type your password in, the session gets reported as a telnet session. Furthermore, if you kill your ssh process or shell window while the ssh session is waiting for your password, the session never disappears.

So, we have a very live DOS exploit against VENDORNAME switches here, assuming anyone is unwise enough to allow SSH access from random networks and VLANs to their switches that is. I have to point out that while it’s a particular “feature” of OpenNMS that triggered this problem for us, this isn’t a bug in OpenNMS at all, given that it’s trivial to trigger the same problems with the switches directly.

In regards to the actual problem at hand, OpenNMS is quite configurable, so at least I can change the way it does SSH service discovery to revert to a simple “is the port up” check. I’ve left this running for nearly two weeks now, and the switches on my test bed are all still behaving properly.

I held back from posting this until I could get a response from the vendor. They’ve acknowledged the bug, and a fix will be out in the next firmware release apparently. I might update once they have released a new firmware; I’ve edited out the vendor name from this post because I don’t believe it’s responsible to publish denial-of-service vulnerabilities without giving the vendor a chance to fix them.

I also noticed this post on the OpenNMS blog. The author there had similar problems with monitoring a firewall device, and while the scenario seems different, VENDORNAME makes firewalls as well as switches; I wonder if it’s the same vendor in his case.