Keeping Web Services Healthy with Automated Live Testing

Suppose your organization needs to ensure that a collection of web services is up, running, and responding to client requests as expected. You also need to know with minimal delay when something goes wrong, so that your team can act immediately.

These applications could range from simple websites to more sophisticated web services (APIs) that interact with a set of back-end databases.

The combination of Nagios and WebInject (both open source projects) is a good solution for automating live tests of a web service and sending alerts when things start going wrong, so your team can respond before too many customers are affected.

Together, Nagios and WebInject let you write a fairly rich, customized set of live tests for a website or service, including custom triggers that send automated alerts to designated parties when things start going wrong.

A bit more detail about each tool:

Nagios is an open source platform for monitoring the state of hosts and services, including web services. It is highly extensible through its plugin architecture and comes with a web admin front end.

Screenshot of the home page of a fresh Nagios install.

With the Nagios ‘core’ install, you can run simple periodic health checks on a collection of domains using pings, and fire off email alerts when a server or domain is not responding.
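To make the idea of a periodic health check concrete, here is a minimal Python sketch of the kind of check Nagios runs through its plugins. It is not the check that ships with Nagios: it swaps the ping for a plain TCP connection attempt (ICMP pings usually need elevated privileges), and the function name check_tcp, its parameters, and the one-second warning threshold are all assumptions made for illustration. The part that follows Nagios convention is the result: a plugin reports one line of status text and an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN).

```python
import socket
import time

# Exit codes that Nagios plugins use to report a check result.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_tcp(host, port, warn_seconds=1.0, timeout=5.0,
              connect=socket.create_connection):
    """Return (exit_code, status_line) for a TCP reachability check.

    Hypothetical helper: `connect` is injectable so the logic can be
    exercised without touching the network.
    """
    start = time.monotonic()
    try:
        conn = connect((host, port), timeout)
        conn.close()
    except OSError as exc:
        # Refused connections, DNS failures, and timeouts all land here.
        return CRITICAL, f"CRITICAL - {host}:{port} unreachable ({exc})"

    elapsed = time.monotonic() - start
    if elapsed > warn_seconds:
        # Reachable, but slower than the (assumed) warning threshold.
        return WARNING, f"WARNING - {host}:{port} answered in {elapsed:.2f}s"
    return OK, f"OK - {host}:{port} answered in {elapsed:.2f}s"
```

A wrapper script would print the status line and exit with the returned code; Nagios then decides, based on its own configuration, whether and whom to alert.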

WebInject is available both as a standalone tool and as a Nagios plugin. It makes more sophisticated tests possible, such as testing specific endpoints of a given web service or API, not only for uptime but also for the validity of the data an endpoint returns in response to a particular set of requests.

For example, using WebInject with Nagios, you can automate logging in to and out of a website, check that querying a database-backed API endpoint returns the expected result for the given request parameters, and so on, all at customizable time intervals. When something goes wrong, the system can be configured to alert the dev team right away via email.
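The endpoint check described above can be sketched in a few lines of Python. This is not WebInject itself, and it does not use WebInject's test case format; it is only an illustration of the underlying idea of verifying the content of a response rather than just its availability. The names fetch and check_endpoint, the assumption that the endpoint returns a JSON object, and the example URL in the usage note are all hypothetical.

```python
import json
import urllib.error
import urllib.request

# Exit codes that Nagios plugins use to report a check result.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def fetch(url, timeout=10.0):
    """Return (http_status, body_text) for a GET request."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code and a body.
        return err.code, err.read().decode("utf-8", errors="replace")


def check_endpoint(url, expected, fetcher=fetch):
    """Check that a JSON endpoint returns the expected field values.

    `expected` maps field names to required values. `fetcher` is
    injectable so the validation logic can run without a network.
    """
    try:
        status, body = fetcher(url)
    except OSError as exc:  # URLError and socket timeouts are OSErrors
        return CRITICAL, f"CRITICAL - {url} unreachable ({exc})"

    if status != 200:
        return CRITICAL, f"CRITICAL - {url} returned HTTP {status}"

    try:
        data = json.loads(body)
    except ValueError:
        return CRITICAL, f"CRITICAL - {url} did not return valid JSON"
    if not isinstance(data, dict):
        return CRITICAL, f"CRITICAL - {url} did not return a JSON object"

    # Collect every expected field that is missing or has the wrong value.
    bad = [k for k, v in expected.items() if k not in data or data[k] != v]
    if bad:
        return CRITICAL, (
            f"CRITICAL - {url} unexpected values for: {', '.join(sorted(bad))}"
        )
    return OK, f"OK - {url} returned the expected data"
```

For instance, check_endpoint("https://api.example.com/users/42", {"id": 42}) would report OK only if the service is reachable, answers with HTTP 200, and returns a JSON object whose id field equals 42. Scheduling the check and sending the email alert would remain the monitoring system's job.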

Pretty powerful stuff, and essential for ensuring quality when deploying any serious website or service to production.
