Using monit for system and process monitoring

One of the servers I maintain is the Jabber server at jabber.meta.net.nz. This is a free public service that anyone can use, and it does get quite a wide range of use. For a long time we seemed to be very popular with South American users, possibly because of the web-based clients and the range of transports to other protocols we support. We typically see between 50 and 100 concurrent users, depending on time of day and week, but the active account base is normally in the low thousands.

The transports themselves cause me a lot of problems. In the past they’ve been downright buggy, crashing all the time, but now that the current codebases for all four protocols in use (AIM, ICQ, MSN and Yahoo) are written in Python, we don’t seem to have as many outright crashes. We do have slow memory leaks, however, which prompted me to move the services to a new server a while back. Part of me was hoping that the memory leaks were caused by the Gentoo system I was using initially, but this doesn’t seem to be the case.

So, I needed to either fix these memory leaks or work around them. Enter monit. I’d heard about monit quite a bit, but never really looked into it other than thinking it might be interesting. I really wish I’d looked further ages ago. It’s easy to set up, is designed specifically to monitor and restart services, and it solved my memory leak problem in about 5 minutes.

Here’s a snippet from the config file:
[code]
check process aim-transport with pidfile /var/jabberd/pid/aim-transport.pid
  start program = "/etc/init.d/aim-transport start"
  stop program = "/etc/init.d/aim-transport stop"
  if cpu > 60% for 2 cycles then alert
  if cpu > 80% for 5 cycles then restart
  if totalmem > 300.0 MB for 5 cycles then restart
  group transport
[/code]

This is pretty self-explanatory really. If the CPU usage of this process stays above 60% for 2 cycles, monit sends an alert; if it stays above 80% for 5 cycles, monit restarts the process. And if the RAM usage is over 300 MB for 5 cycles (a cycle is 2 minutes by default), monit restarts the process as well. Problem solved. Or rather, the symptoms are solved, but that’s good enough for me at this stage.
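
For anyone wanting to adapt this, here’s a rough sketch of how the cycle length and the group fit together. It assumes a poll interval of 120 seconds, which matches the 2 minute cycle mentioned above, so “for 5 cycles” works out to roughly 10 minutes of sustained high memory before a restart. The msn-transport entry, including its pidfile and init script paths, is a hypothetical example modelled on the AIM one and is not copied from the real config:
[code]
# Poll interval in seconds; one "cycle" in the rules below is one poll.
# 120 is an assumed value that gives the 2 minute cycle described above.
set daemon 120

# Hypothetical second transport: the paths are assumptions that follow
# the same layout as the aim-transport entry.
check process msn-transport with pidfile /var/jabberd/pid/msn-transport.pid
  start program = "/etc/init.d/msn-transport start"
  stop program = "/etc/init.d/msn-transport stop"
  # 5 cycles x 120 seconds = about 10 minutes over the limit before a restart
  if totalmem > 300.0 MB for 5 cycles then restart
  group transport
[/code]
Putting every transport in the same group should make it possible to manage them together, and requiring several cycles over the limit stops a brief spike from triggering a restart.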

Note: This is old, but somehow didn’t get posted.

3 replies on “Using monit for system and process monitoring”

Hey, can’t work out how to contact you. So this’ll have to do. You run jabber.meta.net.nz right?

I use your MSN transport, on and off, and it’s down at the moment. Or rather, it’s up, but the pymsnt (ejabberd) transport has a bug that has recently shown up with some Microsoft server-side change.

Long story short…

Thread => http://groups.google.com/group/py-transports/browse_thread/thread/8e3583a86a4bb053/41f6578accf1b1be?lnk=gst&q=%22problem+with+CVR0+in+version+string%22#41f6578accf1b1be

Patch => http://py-transports.googlegroups.com/web/pymsnt-version.patch

Hi Glenn,

I’ve noticed that the MSN transport is down, but haven’t had any real time to look into it. I did look into it today and got as far as (re)discovering that the author of pymsn-t stopped developing it at the start of last year… Thanks for pointing me at the fix, I’ve applied it now.

I’ll also think about updating the webpage at http://jabber.meta.net.nz/ to make it more obvious how to contact us. The old page: http://jabber.meta.net.nz/old/ has the admin’s jids on it at least :)

Is the yahoo.jabber.meta.net.nz transport still working? I think I have been having problems integrating with my iChat client over GoogleTalk as the Jabber account… but then I see your Y! Transport is down. I am just looking for a sync point here.
