World Scientific
  • Search
Skip main navigation

Cookies Notification

We use cookies on this site to enhance your user experience. By continuing to browse the site, you consent to the use of our cookies. Learn More
×
Our website is made possible by displaying certain online content using javascript.
In order to view the full content, please disable your ad blocker or whitelist our website www.worldscientific.com.

System Upgrade on Tue, Oct 25th, 2022 at 2am (EDT)

Existing users will be able to log into the site and access content. However, E-commerce and registration of new users may not be available for up to 12 hours.
For online purchase, please visit us again. Contact us at [email protected] for any enquiries.

A PARALLEL PROBABILISTIC SYSTEM-LEVEL FAULT DIAGNOSIS APPROACH FOR LARGE MULTIPROCESSOR SYSTEMS

    In this paper, we present a system-level fault identification algorithm, using a parallel genetic algorithm, for diagnosing faulty nodes in large heterogeneous systems. The algorithm is based on a probabilistic model where individual node fails with an a priori probability p. The assumptions concerning test outcomes are the same as in the PMC model, that is, fault-free testers always give correct test outcomes and faulty testers are totally unpredictable. The parallel diagnosis algorithm was implemented and simulated on randomly generated large systems. The proposed parallelization is intended to speed up the performance of the evolutionary diagnosis approach, hence reducing the computation time by evolving various sub-populations in parallel. Simulation results are provided showing that the parallel diagnosis did improve the efficiency of the evolutionary diagnosis approach, in that it allowed faster diagnosis of faulty situations, making it a viable alternative to existing techniques of diagnosis. Moreover, the evolutionary approach still provide good results even when extreme non-diagnosable faulty situations are considered.

    References

    • M. Barborak, M. Malek and A. Dahbura, ACM Computing Surveys 25(2), 171 (1993), DOI: 10.1145/152610.152612. Crossref, ISIGoogle Scholar
    • F. Barsi, F. Grandoni and P. Maestrini, IEEE Trans. on Computers C 25, 585 (1976). ISIGoogle Scholar
    • D. M. Blough. Fault Detection and Diagnosis in Multiprocessor Systems. PhD thesis, The Johns Hopkins University, Baltimore, Md., 1988 . Google Scholar
    • D. M. Blough and A. Pelc, IEEE Trans. on Computers 42(2), 205 (1993), DOI: 10.1109/12.204793. Crossref, ISIGoogle Scholar
    • D. M. Blough, G. F. Sullivan and G. M. Masson, IEEE Trans. on Computers 41(9), 1126 (1992), DOI: 10.1109/12.165394. Crossref, ISIGoogle Scholar
    • M. Blount, Probabilistic Treatment of Diagnosis in Digital Systems, Proc. 7th Int. Symp. on Fault-Tolerant Comput. (IEEE Comput. Sot. Publ., 1977) pp. 72–77. Google Scholar
    • D. A.   Coley , An Introduction to Genetic Algorithms for Scientists and Engineers ( World Scientific , Singapore; River Edge, NJ ) . Google Scholar
    • A. T. Dahbura, IEEE Trans. on Computers C 35, (1986). Google Scholar
    • L. N. de Castro and F. J. Von Zuben. Artificial Immune Systems: Part I - Basic Theory and Applications. Technical Report Technical Report RT DCA 01/99, 1999 . Google Scholar
    • M. Elhadef, Fault Identification in Probabilistically Diagnosable Heterogeneous Systems: A Genetic Approach, Proc. of the 5th Int. Conf. on Principles of Distributed Systems (2001) pp. 55–70. Google Scholar
    • M. Elhadef and B. Ayeb, An Evolutionary Algorithm for Identifying Faults in t-Diagnosable Systems, Proc. of the 19th IEEE Symp. on Reliable Distributed Systems (SRDS-2000) (2000) pp. 74–83. Google Scholar
    • M. Elhadef and D. A. Coley, Adaptive Mutation for Semi-Separable Problems, Proc. of the Genetic and Evolutionary Computation Conference (GECCO-2001) (2001) pp. 306–312. Google Scholar
    • D. B.   Fogel , Evolutionary Computation. Towards a New Philosophy of Machine Intelligence , 2nd edn. ( IEEE Press , New York , 2000 ) . Google Scholar
    • A.   Geist et al. , PVM: Parallel Virtual Machine: A Users' Guide and Tutorial for Network Parallel Computing , Scientific and Engineering Computation ( The MIT Press , 1994 ) . CrossrefGoogle Scholar
    • V. S.   Gordon and D.   Whitley , Serial and Parallel Genetic Algorithms as Function Optimizers , Proc. of the 5th Int. Conf. on Genetic Algorithms ( 1993 ) . Google Scholar
    • T. Kameda, S. Toida and F. J. Allan, Inform. Contr. 29, 141 (1975). CrossrefGoogle Scholar
    • Zdenek   Konfrt , Parallel Genetic Algorithms: Advances, Computing Trends, Applications and Perspectives , Proc. of 18th International Parallel and Distributed Processing Symposium (IPDPS'04) - Workshop ( 2004 ) . Google Scholar
    • S. Kreutzer and S. L. Hakimi, Adaptive Fault Identification in Two New Diagnostic Models, Proc. 21st Allerton Conf. on Commun., Cont. and Comput. (Univ. of Illinois, Urbana, Ill, 1983) pp. 353–362. Google Scholar
    • S. Lee. Probabilistic Multiprocessor and Multicomputer Diagnosis. PhD thesis, The Univ. of Michigan, Ann Arbor, Mich., 1990 . Google Scholar
    • S. Lee and K. G. Shin, ACM Computing Surveys 26(1), 121 (1994), DOI: 10.1145/174666.174669. Crossref, ISIGoogle Scholar
    • S. N. Maheshwari and S. L. Hakimi, IEEE Trans. on Computers C 25, 228 (1976). ISIGoogle Scholar
    • P. Moscato, New Ideas in Optimization, eds. D. Corne, M. Dorigo and F. Glover (McGraw-Hill, 1999) pp. 219–234. Google Scholar
    • A. Pelc., IEEE Trans. on Electron. Comput. 40(11), 1271 (1991). Crossref, ISIGoogle Scholar
    • A. Pecl., IEEE Trans. on Computers 47(3), 298 (1998). ISIGoogle Scholar
    • F. P. Preparata, G. Metze and R. T. Chien, IEEE Trans. on Electron. Comput. 16(6), (1967). Google Scholar
    • A. Sengupta and A. T. Dahbura, IEEE Trans. on Computers 41(11), 1386 (1992), DOI: 10.1109/12.177309. Crossref, ISIGoogle Scholar
    • G. Sullivan, A Polynomial Time Algorithm for Fault Diagnosability, Proc. 25th Ann. Symp. on Foundations of Comput. Sci. (IEEE Comput. Soc. Publ., 1984) pp. 148–156. Google Scholar
    • Matthew Wall. A C++ Library of Genetic Algorithm Components. http://lancet.mit.edu/ga/ . Google Scholar