Weak Consistency Group Communication



next up previous
Next: Service Replication Up: Autonomous Replication in Wide-Area Previous: Group Communication

Weak Consistency Group Communication

Golding's system provides a mechanism to allow a collection of principals to broadcast messages to each other using a weak consistency model such that divergent state eventually converges. His framework includes message delivery, message ordering, group membership routines, and support for applications that need these services.

The protocol he introduces to accomplish these tasks he calls Timestamped Anti-Entropy (TSAE). TSAE uses delayed communication between principals to efficiently batch messages in a queue and deliver them later. Pairs of principals periodically take part in an anti-entropy session where they contact each other to exchange messages. This system scales well because these anti-entropy sessions can take place in parallel.

A principal must be able to survive temporary failures and host crashes, and must be equipped with stable storage to record information between network failures.

To efficiently exchange information with other members of the group, principals must be able to locate group members near them. This requires some sort of name or location service, as well as a performance prediction service [15] to order principals by locality. Performance prediction uses latency, failure rate, and bandwidth to determine expected performance. It does not use the number of network hops or expected backbone savings, and therefore does not require a prior knowledge of network topology. It can be argued that fast networks make expected backbone savings a more important metric than user latency, but it could also be argued that latency is the only viable metric that can be easily determined.

Performance prediction is built on top of the ping program, using ICMP echo to test the latency between hosts. His results are encouraging: for most hosts previous latency is a good predictor of future performance. Only for hosts very far away or some hosts overseas is this not necessarily the case. For the most part, however, the latency variance is small, and as we show in section gif, latency is proportional to the number of network hops. We focus primarily on geography in our own work because determining latency between two arbitrary hosts requires running ping from one of them, and such a step would prohibit one server from calculating the optimal solution without enormous overhead.

The TSAE protocol guarantees that any knowledge owned by one principal will eventually spread to all other principals, using the anti-entropy session. This session is quite simple: one principal picks another principal and establishes a session between them. The two principals then exchange any messages that the other does not know about, and disconnect. This protocol is guaranteed to converge eventually so that inconsistency is always finitely bounded.

The efficiency of the protocol is determined by the session partner selection process. Golding presents simulation results for a variety of different selection policies, such as random selection or distance- based selection, and considers their impact on network traffic and propagation time. The time to propagate a message using a uniform random partner selection policy scales well with the size of the group. Five sites takes about four mean entropy-time intervals, while 160 sites only takes twice that. The best time is achieved, however, with age-based partner selection. Fixed topologies in general are poor, although adaptive topologies work well.

Network traffic is another important factor. Each message is sent exactly once to each principal; therefore the total amount of network traffic generated is determined by the network topology and the partner selection policy. He introduced new partner selection policies to try and reduce the amount of total network traffic such as cost-biased and cost-squared biased.

Golding's TSAE protocol is already finding use in the Internet; besides the refdbms system [17] that he helped design, Danzig et al have explored using the TSAE to help handle massively replicated services [11]. This research is particularly important because a geographical push-caching Web server will need a system to track available servers, and it may be necessary to distribute this service to insure that it is scalable.



next up previous
Next: Service Replication Up: Autonomous Replication in Wide-Area Previous: Group Communication



James Gwertzman
Wed Apr 12 00:26:11 EDT 1995