Scalable Sender-Based Message Logging Protocol with Little Communication Overhead for Distributed Systems
Abstract
An inherent shortcoming of conventional Sender-Based Message Logging (SBML) protocols is that they require additional control message exchanges for every application message in order to satisfy the always-no-orphans condition under sequential failures. This paper introduces a scalable SBML protocol that lowers this communication overhead by handling, as a batch, the sequence of messages that a process receives consecutively before it next sends. The protocol lets each process delay reporting receive sequence numbers to its senders until the first message it intends to send is ready, and then fill in the deferred values collectively, requiring only one control message exchange per sender. Experimental results show that the protocol outperforms the previous one in terms of the number of control messages generated.
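The batching idea can be illustrated with a toy model. The sketch below is not the paper's implementation: the `Receiver` class, the `count_exchanges` helper, the event-trace format, and the choice to count one "exchange" as a receive-sequence-number report plus its acknowledgement are all assumptions made for illustration. It contrasts reporting each receive sequence number immediately with deferring the reports until the process is about to send.

```python
from collections import defaultdict


class Receiver:
    """Toy model of one receiving process (hypothetical, for illustration only)."""

    def __init__(self, batched):
        self.batched = batched                # True: defer reports until a send
        self.next_rsn = 0                     # next receive sequence number (RSN)
        self.pending = defaultdict(list)      # sender -> [(ssn, rsn)] not yet reported
        self.sender_logs = defaultdict(dict)  # sender -> {ssn: rsn} held in the sender's log
        self.exchanges = 0                    # control exchanges (report + acknowledgement)

    def _flush(self, sender):
        # Report every pending RSN for this sender in one control exchange.
        entries = self.pending.pop(sender, [])
        if entries:
            self.sender_logs[sender].update(entries)
            self.exchanges += 1

    def receive(self, sender, ssn):
        # Assign the next RSN to the message identified by (sender, ssn).
        rsn = self.next_rsn
        self.next_rsn += 1
        self.pending[sender].append((ssn, rsn))
        if not self.batched:
            self._flush(sender)  # assumed conventional behaviour: one exchange per message
        return rsn

    def before_send(self):
        # All deferred RSNs must reach their senders before a new message goes out.
        for sender in list(self.pending):
            self._flush(sender)


def count_exchanges(trace, batched):
    """Replay a trace of ("recv", sender, ssn) and ("send",) events."""
    receiver = Receiver(batched)
    for event in trace:
        if event[0] == "recv":
            receiver.receive(event[1], event[2])
        else:
            receiver.before_send()
    return receiver.exchanges
```

For a trace in which a process receives two messages from `A` and one from `B` before sending, the per-message mode in this model performs three exchanges, while the deferred mode performs one per sender, two in total. The saving grows with the number of messages received from the same sender between consecutive sends.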