Implementing with Bounded Messages on a Network of ADD Channels
Abstract
We present an implementation of the eventually perfect failure detector from the original hierarchy of the Chandra-Toueg [3] oracles on an arbitrary partitionable network composed of unreliable channels that can lose and reorder messages. Prior implementations of have assumed different partially synchronous models ranging from bounded point-to-point message delay and reliable communication to unbounded message size and known network topologies. We implement under very weak assumptions on an arbitrary, partitionable network composed of Average Delayed/Dropped (ADD) channels [11] to model unreliable communication. Unlike older implementations, our failure detection algorithm uses bounded-sized messages to eventually detect all nodes that are unreachable (crashed or disconnected) from it.
References
- 1. , Using the heartbeat failure detector for quiescent reliable communication and consensus in partitionable networks, Theoretical Computer Science 220(1) (1999) 3–30. Crossref, ISI, Google Scholar
- 2. , A performance study of consensus algorithms in omission and crash-recovery scenarios, PDP 2014, 240–243. Google Scholar
- 3. , Unreliable failure detectors for reliable distributed systems, Journal of the ACM 43 (1996) 225–267. Crossref, ISI, Google Scholar
- 4. , On the quality of service of failure detectors, IEEE Trans. Computers 51(5) (2002) 561–580. Crossref, ISI, Google Scholar
- 5. , Impossibility of distributed consensus with one faulty process, J. ACM 32(2) (1985) 374–382. Crossref, ISI, Google Scholar
- 6. , Perfect failure detection with very few bits, SSS 2016, 154–169. Google Scholar
- 7. , The failure detector abstraction, ACM Comput. Surv. 43(2) (2011) 9:1–9:40. Crossref, ISI, Google Scholar
- 8. , An efficient failure detector for sparsely connected networks, Parallel and Distributed Computing and Networks 2004, pp. 369–374. Google Scholar
- 9. , Implementing unreliable failure detectors with unknown membership, Inf. Process. Lett. 100(2) (2006) 60–63. Crossref, ISI, Google Scholar
- 10. , A gossip-style failure detection service, in Middleware, eds. N. DaviesS. JochenK. Raymond (Springer, London, 1998). Crossref, Google Scholar
- 11. , Eventually perfect failure detectors using ADD channels, in IEEE Int. Symp. on Parallel and Distributed Processing with Applications (
2007 ), pp. 483–496. Google Scholar


