Caucus Master/Slave Servers:
A Concept Paper

Charles Roth, 17 March 1999

This document is a brief concept description of an extension to the Caucus server architecture that could provide a substantial improvement in perceived end-user speed, as well as other capabilities.

I. Introduction -- The Problem

Caucus 4.0 is based on a single, central server architecture.  Both data and interface flow from the same central server, regardless of where the client is.  This is, of course, the standard web model (although static web pages are often served from a localized "proxy server" cache).

This means that for a sufficiently remote (in network terms) client, turn-around can be slow.  This can be particularly annoying for tasks that appear small and "simple" to the end user, but that nonetheless require at least one round trip to the server.

There has been much discussion about other approaches to this problem.  They include:

  1. Make the individual interface pages smarter -- use more JavaScript to automate the "simple" tasks within the current page, and do more of the work in the user's browser, which acts immediately.

  2. Put more of the interface at the client end, by using Java applets to automate part of the interface.

  3. Put all of the interface at the client end, by writing a completely new client package that is installed on the end-user's PC.  Presumably, it would not use a web browser at all, or only as an accessory.  This package would just pull data from the central server, and might even cache some data locally.

So far, the sense of the market seems to be that "thick" clients (as in #3) are just not very acceptable.  The browser is seen as ubiquitous and supportable, and thick clients that must be installed for each user are seen as a big pain.  Java (as a browser applet) is still suspect; "write once, debug everywhere" still seems to be the cry.

If this is true, then we're still stuck with the browser.  Individual interface pages can be made smarter, but there is still a fundamental limit to how much improvement in speed can be made.

II. A Caucus Caching Server?

There's another possibility, however.  The connection between the end-user and the local proxy server is usually pretty fast; it's the turn-around to the far-away remote server that is slow.  Most large organizations will or could easily have a proxy server.

Suppose we had one Caucus "master" server (equivalent to the central, presumed-to-be remote server), and many Caucus "slave" servers (loosely equivalent to local proxy servers).  Each end-user would get the data and interface from a Caucus CGI program that resides on a local server; the local servers in turn would be "slaved" to a master server, and would get and put (just) data to and from it.

At the very least, this could mean that any operation that involves displaying data would be much faster; actually adding data (such as new responses) might be just as slow as it was before.
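
A minimal sketch of this read-local, write-through arrangement, in Python.  The class and method names (MasterServer, SlaveServer, read_item, add_response) are hypothetical and are not part of Caucus; the point is only that displays are served from the slave's local copy, while additions still pay the round trip to the master.

```python
class MasterServer:
    """Stands in for the central (remote) Caucus server; holds the authoritative data."""

    def __init__(self):
        self.items = {}        # item id -> list of response texts
        self.round_trips = 0   # counts slow, long-distance requests

    def fetch(self, item_id):
        self.round_trips += 1
        return list(self.items.get(item_id, []))

    def append(self, item_id, text):
        self.round_trips += 1
        self.items.setdefault(item_id, []).append(text)
        return list(self.items[item_id])


class SlaveServer:
    """Stands in for a local server: serves reads from its own copy."""

    def __init__(self, master):
        self.master = master
        self.cache = {}        # item id -> locally held copy of the responses

    def read_item(self, item_id):
        # Only the first display of an item goes to the master.
        if item_id not in self.cache:
            self.cache[item_id] = self.master.fetch(item_id)
        return self.cache[item_id]

    def add_response(self, item_id, text):
        # Writes always go to the master; the local copy is refreshed from its reply.
        self.cache[item_id] = self.master.append(item_id, text)
        return self.cache[item_id]
```

This sketch deliberately leaves open how a slave learns about responses that were added through a different slave; that is the synchronization question taken up in the following sections.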

III. CaucusLink

Back in the early 90's, Camber-Roth developed a version of Caucus that supported "distributed conferencing".  This version was called CaucusLink, and supported a fully distributed, replicated, peer-to-peer Caucus database. Updates between databases were sent on a regular basis (a few times a day) via e-mail, which was automatically sent, received, and parsed by special utilities written for the purpose.  Multiple Caucus sites could install CaucusLink and "share" individual conferences; the members of a given conference might be spread across multiple hosts, yet each member had instantaneous access to the conference at all times.

CaucusLink worked... but there were synchronization problems that were never fully resolved, and it was only ever used for small groupings of sites (e.g., Camber-Roth, Interjoin, TMN, and Bellevue Schools).  It was primarily meant to solve the problem of the lack of instant access between different computer hosts... a problem whose nature was dramatically changed by the explosive growth of the Internet.

Still, some of the technology built into CaucusLink could be reused to provide a master/slave server relationship as described above; the synchronization problems are much simpler when there is always a single master, rather than a network of complete peers.  Essentially, this would mean solving the problem stated in (I) at or just below the "application" layer.
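
One way to see why a single master simplifies synchronization: the master can stamp every update with a sequence number, so each slave only has to ask for everything after the last number it applied.  The Python sketch below is an assumption made for illustration, not a description of how CaucusLink worked; the names UpdateLog, SlaveCopy, record, since, and catch_up are hypothetical.

```python
class UpdateLog:
    """Master-side log: every change gets the next sequence number."""

    def __init__(self):
        self.entries = []   # list of (seq, key, value), in the order recorded

    def record(self, key, value):
        seq = len(self.entries) + 1
        self.entries.append((seq, key, value))
        return seq

    def since(self, last_seq):
        # Sequence numbers are 1-based and contiguous, so a slice is enough.
        return self.entries[last_seq:]


class SlaveCopy:
    """Slave-side copy that replays the master's updates in the master's order."""

    def __init__(self):
        self.data = {}
        self.last_seq = 0   # highest sequence number applied so far

    def catch_up(self, log):
        for seq, key, value in log.since(self.last_seq):
            self.data[key] = value
            self.last_seq = seq
        return self.last_seq
```

Because only the master assigns sequence numbers, two slaves that have applied the same number of updates hold identical data; with complete peers there is no such single ordering to fall back on.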

IV. Synchronized Databases

A completely different approach to solving the same problem might be to use an existing, standard database that internally supports database copying with regular synchronization to a master.  Data fields would be explicitly marked as to whether (and how long) they could remain "out of synch".  For some fields it might always be necessary to go to the master; others might always be cached copies that derive from the master and are updated at intervals; still others might be synchronizable to the master at intervals.
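
The per-field marking could be sketched as a small policy table.  In the Python below, the three policy names, the example field names, and the max_age values are all hypothetical; a real database's own replication features would take the place of this logic.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    ALWAYS_MASTER = "always go to the master"
    CACHED_FROM_MASTER = "cached copy, refreshed from the master at intervals"
    SYNC_AT_INTERVALS = "synchronized to the master at intervals"


@dataclass
class FieldRule:
    policy: Policy
    max_age: float = 0.0   # seconds the local value may remain out of synch


# Hypothetical field names, chosen only to illustrate the three cases.
RULES = {
    "user_password": FieldRule(Policy.ALWAYS_MASTER),
    "response_text": FieldRule(Policy.CACHED_FROM_MASTER, max_age=300.0),
    "last_seen_item": FieldRule(Policy.SYNC_AT_INTERVALS, max_age=3600.0),
}


def must_ask_master(field, age_seconds):
    """Return True if the local value is too old (or never allowed) to be used."""
    rule = RULES[field]
    if rule.policy is Policy.ALWAYS_MASTER:
        return True
    return age_seconds > rule.max_age
```

For example, must_ask_master("response_text", 10) is False under these assumed rules, while must_ask_master("user_password", 0) is always True.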

This would require a complete rewrite of the database... but that is being contemplated in any case.  It has the advantage of moving the master/slave issues much further down in the layer model, which makes them easier to debug, and more independent of later changes in Caucus.