Since `floodd' does not provide reliable delivery, an additional mechanism is needed to recover from inconsistencies. The only choice in our architecture is to hand the problem to the application. Two sites will have to communicate and agree on a sequence of updates necessary to bring the databases back into sync. `floodd' provides mechanisms to facilitate communication between the two sites, but since it knows nothing about the structure of the data, it cannot provide further assistance.
Communication is conducted via a stream connection to a simple command
line interface. A simple help menu is provided for manual interaction.
A client library is provided to simplify application integration.
Displays a command summary with brief description.
Returns the membership of the given group.
Returns the list of groups.
All data entered will be flooded. The data is terminated with a single `.' on a line (a la SMTP).
The data entered will be flooded. The size gives the number of bytes to read and flood after the newline.
Give some statistics on the flood daemon.
Display the topology for the named group.
Give the known floodd daemon nearest the client (i.e. you). Note: this is not implemented at the moment. The daemon can give hints about the nearest daemon, but the client must make the final determination. One way to do this would be to send out several pings and connect to the daemon that answers first. These answers should probably be cached.
Close connection.
Terminate the flood daemon.
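The two data-entry forms described above (data terminated by a single `.' on a line, and data preceded by a byte count) can be sketched as follows. This is only an illustration in Python: the command names are passed in by the caller because this sketch does not assume what they are, the `<command> <size>' layout of the sized form is an assumption, and since it is not stated whether SMTP-style escaping of a lone `.' is supported, the sketch simply rejects such payloads.

```python
def frame_dot_terminated(command: str, payload: str) -> bytes:
    """Build a request for the dot-terminated form.

    `command` is whatever command name the daemon expects (hypothetical here).
    The result is the command line, the data lines, then a lone '.' line.
    """
    lines = payload.splitlines()
    for line in lines:
        # A lone '.' inside the data would end the transfer early.
        if line == ".":
            raise ValueError("payload contains a line consisting of a single '.'")
    body = "".join(line + "\n" for line in lines)
    return (command + "\n" + body + ".\n").encode("utf-8")


def frame_sized(command: str, payload: bytes) -> bytes:
    """Build a request for the size-prefixed form.

    Assumed layout: '<command> <size>' followed by a newline, then exactly
    `size` bytes of data. Arbitrary bytes are safe because no terminator
    line is needed.
    """
    header = f"{command} {len(payload)}\n".encode("ascii")
    return header + payload
```

The sized form is the safer choice for binary data or for text that may itself contain a line with a single `.'.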
The parameters fall into three categories: site parameters, group parameters, and local parameters.
Site parameters are given in a (:site-define ...) list.
Site-Parameter: client-port integer
The client port on which to communicate with this site. Telnetting to this port gives the user a command line interface for querying the flood daemon.
Site-Parameter: data-port integer
The data port on which to communicate to this site. This is the port daemons use to communicate with other daemons.
The latitude and longitude of this site.
Site-Parameter: inet-address address
The internet address of this site.
Site-Parameter: hostname domainname
The hostname of this site. Note that the hostname might not always be a fully specified domain name, hence the utility of the inet-address field. It might also be the case that the hostname cannot be resolved at all sites.
Site-Parameter: longitude float
The longitude of the site.
Site-Parameter: lattitude float
The latitude of the site.
Site-Parameter: site-name string
The name for the site. Hopefully this is a relatively meaningful name.
Group parameters are given in a (:group-define ...) list.
Group-Parameter: bandwidth-period seconds
How often to perform bandwidth estimation to a randomly selected site.
Group-Parameter: connectivity integer
The connectivity of the topology to be computed. This is only relevant to the master site for the group.
Group-Parameter: estimates-period seconds
How often to distribute our estimates of bandwidth and round trip time.
Group-Parameter: group-name string
The name of the group.
Group-Parameter: join-period seconds
How often we send out a join request. We do this in case of a pathological situation in which the master site goes down and needs to rebuild its group identity.
Group-Parameter: master-site site-name
The name of the master site for this group.
Group-Parameter: ping-period seconds
How often we select a site at random to ping to compute round trip time.
Group-Parameter: site site-name
The name of a site in the group. There can be several such entries.
Group-Parameter: topology-generator shell-script
The command to execute to compute the topology for the group.
Group-Parameter: update-period seconds
How often to update the topology. This is only relevant at the master site.
Local-Parameter: data-storage-max bytes
The maximum memory to allocate for storage of data being transferred between daemons. The more memory allocated, the longer site failures can be tolerated. Once a daemon runs out of storage area, it must start dropping data. The first data to go is data waiting to be sent to sites that have been classified as down. In certain cases, data waiting to be sent or exported will be dropped. This simple strategy is employed to avoid having to deal with deadlocks.
Local-Parameter: export-command shell-script
The local command to invoke to export data. The export command reads from the standard input. For efficiency reasons, several updates may be bundled together in one invocation of the export-command. If no export command is specified, data is simply dropped.
Local-Parameter: log-purge-period seconds
How often to purge logs of received data.
Local-Parameter: site-purge-period seconds
If a site has not been heard from in this time, we delete the site from our list of known sites.
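The dropping policy described under data-storage-max can be sketched roughly as below. This is a hypothetical Python model, not the daemon's code: the class and method names are invented for illustration, dropping the oldest item first is an assumption, and the fallback of dropping from any queue once no data for down sites remains is only one reading of "in certain cases".

```python
from collections import deque


class OutboundStore:
    """Hypothetical bounded store of data waiting to be sent to other sites."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes   # plays the role of data-storage-max
        self.used = 0                # bytes currently held
        self.queues = {}             # site name -> deque of pending payloads
        self.down = set()            # sites currently classified as down

    def mark_down(self, site: str) -> None:
        self.down.add(site)

    def mark_up(self, site: str) -> None:
        self.down.discard(site)

    def pending(self, site: str) -> list:
        return list(self.queues.get(site, ()))

    def enqueue(self, site: str, data: bytes) -> None:
        self.queues.setdefault(site, deque()).append(data)
        self.used += len(data)
        self._evict()

    def _drop_oldest(self, sites) -> bool:
        """Drop the oldest payload of the first non-empty queue among `sites`."""
        for site in sites:
            queue = self.queues.get(site)
            if queue:
                self.used -= len(queue.popleft())
                return True
        return False

    def _evict(self) -> None:
        while self.used > self.max_bytes:
            # First choice: data waiting for sites classified as down.
            if self._drop_oldest(sorted(self.down)):
                continue
            # Assumed fallback: drop from any queue rather than block.
            if not self._drop_oldest(sorted(self.queues)):
                break
```

Because the store always makes room by dropping rather than waiting, a sender can never block on a full buffer, which is the deadlock the simple strategy avoids.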
A configuration file might look like:
;;; This file contains a minimal configuration for a non master site
;;; replica. A non master site must know who it is, and who the master
;;; site is, or at least one other site that it can send its join request to.
;;;
;;; # The master site.
;;; At minimum we need the site-name, the hostname, and the client and
;;; data ports.
(:site-define
  (:site-name master)
  (:hostname valhalla)
  (:client-port 2000)
  (:data-port 2001))

; The second site.
(:site-define
  (:site-name inferno)
  (:hostname inferno)
  (:client-port 2000)
  (:data-port 2001))

;;; Initially the group contains the `master' site and one other site.
;;; At minimum we need a group name and a site.
;;;
(:group-define
  (:group-name mirror-1)
  (:site master)
  (:site inferno)
  (:ping-period 15.0)
  (:bandwidth-period 60.0)
  (:estimates-period 30.0)
  (:update-period 120)
  (:master-site master))

;;; Set some parameters for the local site.
;;; At minimum, we need to export the data.
(:local
  (:export-command /homes/dante/mirrord/inferno/bin/filer)
  (:site-purge-period 600))
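To make the shape of such a file concrete, here is a minimal sketch of a reader for it, written in Python. It is not the parser floodd uses. It assumes that `;' starts a comment running to the end of the line, that values contain no spaces or parentheses, and that all values can be kept as strings; the function names are invented for illustration.

```python
def tokenize(text: str) -> list:
    """Split configuration text into '(' , ')' and atom tokens."""
    tokens = []
    for line in text.splitlines():
        line = line.split(";", 1)[0]  # assumed: ';' comments run to end of line
        tokens.extend(line.replace("(", " ( ").replace(")", " ) ").split())
    return tokens


def parse(tokens: list) -> list:
    """Turn the token stream into nested Python lists."""
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


def load_config(text: str) -> dict:
    """Collect site, group, and local parameters into plain dictionaries."""
    config = {"sites": [], "groups": [], "local": {}}
    for form in parse(tokenize(text)):
        if not isinstance(form, list) or not form:
            continue  # ignore stray atoms at top level
        head, fields = form[0], form[1:]
        if head == ":site-define":
            config["sites"].append({f[0].lstrip(":"): f[1] for f in fields})
        elif head == ":group-define":
            group = {"site": []}
            for f in fields:
                key = f[0].lstrip(":")
                if key == "site":
                    group["site"].append(f[1])  # a group may list several sites
                else:
                    group[key] = f[1]
            config["groups"].append(group)
        elif head == ":local":
            config["local"].update({f[0].lstrip(":"): f[1] for f in fields})
    return config
```

Keeping `site' as a list reflects the fact that a group definition may contain several such entries, while every other parameter appears once.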
A few things to keep in mind about update functions. They should be as fast as possible. The faster data can be exported from the daemon, the more data it can handle. The export command should not generate any output on the standard output or standard error. Any such output will be routed to `/dev/null'.
The command will run in the root directory by default.
#!/bin/sh
# This simple export script simply cats the stdin to a file in order to
# remove the data from the floodd as quickly as possible. It then
# runs mirror on the resulting file.

prefix=${HOME}/replica
HOSTNAME=`hostname`
TMPFILE=./.tmp.$$
client_port=9000

cd ${prefix}/$HOSTNAME/data
cat > ${TMPFILE}
../bin/mirror --commands --execute --distribute \
    --flood `hostname`:${client_port} < ${TMPFILE} >> log.update 2>&1
rm -f ${TMPFILE}
exit 0