Availability through Failover
The availability of many network application servers could easily be improved if a second server automatically took over in case of a failure. This is the case, for example, with web servers, DNS servers, directories, and network information systems such as NIS or LDAP.
Some firewalls include a redirection feature that can redirect requests to different hosts in a group, according to the health of the hosts. Firewall-1 can do this on the basis of regular requests to the host. Unfortunately, the hosts are polled at intervals that increase over time, so in the worst case it can take half an hour or more for Firewall-1 to realize that a host has died. Some operating systems provide high availability features that can be used to implement failover, but in most cases only a commercial add-on can solve the problem.
This package of failover utilities is designed to solve this problem in some cases. It consists of a daemon that collects state information about services, so that other systems can find out about the services on a given host. The daemon is polled at regular intervals by a failover shell, which then decides whether a service needs to be started on the local machine. This shell is programmable in Tcl and can, of course, call external programs.
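The poll-and-decide cycle described above can be illustrated with a small sketch. It is written in Python purely for illustration and is not the package's Tcl interface: the names `should_start_locally` and `run_scheduler`, the fixed host priority order, and the `poll`, `start` and `stop` callbacks are all assumptions made for this example.

```python
import time
from typing import Callable, Dict, List, Optional


def should_start_locally(
    local_host: str,
    priority: List[str],
    reports: Dict[str, Optional[bool]],
) -> bool:
    """Decide whether this host should run the service.

    priority: hosts in order of preference (a hypothetical policy).
    reports:  host -> True/False as reported by its state daemon,
              or None if the daemon could not be reached.
    """
    for host in priority:
        if host == local_host:
            # No more-preferred host reports the service: run it here.
            return True
        if reports.get(host) is True:
            # A more-preferred host already provides the service.
            return False
    return False  # the local host is not part of the configuration


def run_scheduler(
    local_host: str,
    priority: List[str],
    poll: Callable[[str], Optional[bool]],
    start: Callable[[], None],
    stop: Callable[[], None],
    interval: float = 5.0,
    rounds: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll the other hosts at a fixed interval and start or stop the service."""
    running = False
    done = 0
    while rounds is None or done < rounds:
        # Ask every other host's state daemon about the service.
        reports = {h: poll(h) for h in priority if h != local_host}
        wanted = should_start_locally(local_host, priority, reports)
        if wanted and not running:
            start()
        elif not wanted and running:
            stop()
        running = wanted
        done += 1
        sleep(interval)
```

Because the interval is fixed, the time needed to notice a dead host is bounded by roughly one polling interval plus the poll timeout, in contrast to the growing intervals described for Firewall-1.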
Looking at a single machine, however, is very often not good enough. A system like a web proxy performs many functions, such as authentication, DNS resolution, and HTTP and FTP fetches. Even while the system is up, the service as a whole may not work. A mail server may have similar problems: it may be unable to send out its mail, yet a client talking to it cannot tell. The health checker, introduced in tentative form in version 0.4.0, helps with this: services can be checked remotely, and actions or a failover triggered if a service fails. Unfortunately, the health checker was first delayed by a Pth bug, and more recently by the author's laziness and lack of time. Furthermore, alternative packages for service monitoring have appeared and can be used with failover.
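As a rough illustration of checking a service remotely and triggering an action when it fails, here is a minimal Python sketch. It does not reflect how the package's health checker is implemented: `tcp_probe`, `HealthCheck`, and the threshold of three consecutive failures are assumptions chosen for the example. A plain TCP connect only shows that a port accepts connections; a realistic check for a web proxy or a mail server would exercise the service end to end.

```python
import socket
from typing import Callable


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, timed out, or name resolution failed.
        return False


class HealthCheck:
    """Call on_failure once after `threshold` consecutive failed probes."""

    def __init__(
        self,
        probe: Callable[[], bool],
        on_failure: Callable[[], None],
        threshold: int = 3,
    ) -> None:
        self.probe = probe
        self.on_failure = on_failure
        self.threshold = threshold
        self.failures = 0

    def run_once(self) -> bool:
        if self.probe():
            self.failures = 0  # a success resets the failure streak
            return True
        self.failures += 1
        if self.failures == self.threshold:
            # Fire exactly once per streak, when the threshold is reached.
            self.on_failure()
        return False
```

A check for a hypothetical mail host could be built as `HealthCheck(lambda: tcp_probe("mail.example.org", 25), on_failure=trigger_failover)`, where `trigger_failover` stands for whatever action the site wants to take. Requiring several consecutive failures avoids failing over because of a single dropped probe.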
Version 0.5.22 was released on June 5th, 2009.
This release fixes 32/64-bit issues. Since it changes the semantics of the last_change field, it should not be used together with previous releases.
For the changes in previous releases, please check the ChangeLog.
See Getting Started for instructions
on how to get a working failover setup up and running as easily as possible.
Even a more complicated symmetric setup can easily be created using the
scripts in the directory
Failover is award-winning software: it won first prize in the first Swiss Open Source Competition.
|Please read the README and the INSTALL file in the package directory for instructions on configuring and compiling the package. For German readers, papers and slides from some presentations and workshops recently given by the author on HA technology in general and failover in particular have been made available for download (see below).|
|More details about the concepts.|
|A detailed example.
The Solaris example is the basis of what is set up during the Solaris
For a Linux setup, please use the
|Description of the Tcl extension in the failover shell.|
|For those reading German: Linux Clustering, a paper presented at the first Linux Conference in Zurich, and presentation slides.|
|Again for German readers: Open Source High Availability, handouts for a workshop on high availability solutions based on open source software, held at the Fachhochschule Rapperswil under the auspices of CH/Open, the Swiss open systems users group.|
|faild||the state daemon|
|failc||the state modification and inspection|
|failsh||the scheduler which monitors the other hosts in the failover configuration and performs the actual failover|
|failstat||the state dumper|
|failmon||the real-time monitor with curses, HTML and plain text output|
|faildebug||utility to set the debug level|
|hcmon.cgi||health checker CGI interface program|
|Download the source distribution (986 kB).|
|The health checker is not functional.|
|Linux HA||Alan Robertson's Linux HA home page. The Linux HA project is somewhat wider in scope than this simple failover solution. Contains many links to related projects.|