World Scientific
  • Search
  •   
Skip main navigation

Cookies Notification

We use cookies on this site to enhance your user experience. By continuing to browse the site, you consent to the use of our cookies. Learn More
×

System Upgrade on Tue, May 28th, 2024 at 2am (EDT)

Existing users will be able to log into the site and access content. However, E-commerce and registration of new users may not be available for up to 12 hours.
For online purchase, please visit us again. Contact us at [email protected] for any enquiries.

Minimizing I/Os in Out-of-Core Task Tree Scheduling

    https://doi.org/10.1142/S0129054122500186Cited by:0 (Source: Crossref)

    Scientific applications are usually described using directed acyclic graphs, where nodes represent tasks and edges represent dependencies between tasks. For some applications, this graph is a tree: each task produces a single result used solely by its parent. The temporary results of each task have to be stored between their production and their use.

    We focus on the case when the data manipulated are very large. Then, during an execution, all data may not fit together in memory. In such a case, some data have to be temporarily written to disk and evicted from memory. These data are later read from disk when they are needed for computation.

    These Input/Output operations are very expensive; hence, our goal is to minimize their total volume. The order in which the tasks are processed considerably influences the amount of such Input/Output operations. Finding the schedule which minimizes this amount is an open problem that we revisit in this paper.

    We first formalize and generalize known results, and prove that existing solutions can be arbitrarily worse than the optimal. We then present an Integer Linear Program to solve it optimally. Finally, we propose a novel heuristic algorithm. We demonstrate its good performance through simulations on both synthetic and realistic trees built from actual scientific applications.

    Communicated by Yong Zhang

    References

    • 1. E. Agullo, P. R. Amestoy, A. Buttari, A. Guermouche, J.-Y. L’Excellent and F.-H. Rouet, Robust memory-aware mappings for parallel multifrontal factorizations, SIAM J. Scientific Comput. 38(3) (2016) C256–C279. Crossref, Web of ScienceGoogle Scholar
    • 2. E. Agullo, On the out-of-core factorization of large sparse matrices, PhD thesis, École normale supérieure de Lyon, France (2008). Google Scholar
    • 3. P. R. Amestoy, I. S. Duff, J. Koster and J.-Y. L’Excellent, A fully asynchronous multifrontal solver using distributed dynamic scheduling, SIAM J. Matrix Anal. Appl. 23(1) (2001) 15–41. Crossref, Web of ScienceGoogle Scholar
    • 4. P. R. Amestoy, A. Guermouche, J.-Y. L’Excellent and S. Pralet, Hybrid scheduling for the parallel solution of linear systems, Parallel Comput. 32(2) (2006) 136–156. Crossref, Web of ScienceGoogle Scholar
    • 5. W. G. Aulbur, Parallel implementations of quasiparticle calculations of semiconductors and insulators, PhD thesis, The Ohio State University (1996). Google Scholar
    • 6. L. A. Belady, A study of replacement algorithms for a virtual-storage computer, IBM Syst. J. 5(2) (1966) 78–101. CrossrefGoogle Scholar
    • 7. J. R. Correa and A. S. Schulz, Single-machine scheduling with precedence constraints, Math. Operat. Res. 30(4) (2005) 1005–1021. Crossref, Web of ScienceGoogle Scholar
    • 8. T. A. Davis, Direct Methods for Sparse Linear Systems, Fundamentals of Algorithms, Vol. 2 (Society for Industrial and Applied Mathematics, 2006). CrossrefGoogle Scholar
    • 9. D. E. Dolan and J. J. Moré, Benchmarking optimization software with performance profiles, Math. Program. 91(2) (2002) 201–213. Crossref, Web of ScienceGoogle Scholar
    • 10. L. Eyraud-Dubois, L. Marchal, O. Sinnen and F. Vivien, Parallel scheduling of task trees with limited memory, TOPC 2(2) (2015) 13. CrossrefGoogle Scholar
    • 11. M. Frigo, C. E. Leiserson, H. Prokop and S. Ramachandran, Cache-oblivious algorithms, Foundations of Computer Science, 1999. 40th Annual Symposium on, IEEE (IEEE Computer Society, New York, NY, 1999), pp. 285–297. CrossrefGoogle Scholar
    • 12. M. S. Hybertsen and S. G. Louie, Electron correlation in semiconductors and insulators: Band gaps and quasiparticle energies, Phys. Rev. B 34(8) (1986) 5390. Crossref, Web of ScienceGoogle Scholar
    • 13. M. Jacquelin, L. Marchal, Y. Robert and B. Ucar, On optimal tree traversals for sparse matrix factorization, Proc. 25th IEEE Int. Parallel and Distributed Processing Symposium (IPDPS’11) (IEEE Computer Society, Los Alamitos, CA, USA, 2011), pp. 556–567. CrossrefGoogle Scholar
    • 14. H. Jia-Wei and H. T. Kung, I/o complexity: The red-blue pebble game, Proc. 13th Ann. ACM Symp. Theory of Computing, STOC ’81 (Association for Computing Machinery, New York, NY, USA, 1981), p. 326–333. CrossrefGoogle Scholar
    • 15. C.-C. Lam, T. Rauber, G. Baumgartner, D. Cociorva and P. Sadayappan, Memory-optimal evaluation of expression trees involving large objects, Computer Languages, Systems & Structures 37(2) (2011) 63–75. CrossrefGoogle Scholar
    • 16. M. Lee, P. Michaud, J. S. Sim and D. Nyang, A simple proof of optimality for the MIN cache replacement policy, Inf. Process. Lett. 116(2) (2016) 168–170. Crossref, Web of ScienceGoogle Scholar
    • 17. T. J. Lee and G. E. Scuseria, Achieving chemical accuracy with coupled-cluster theory, Quantum Mechanical Electronic Structure Calculations with Chemical Accuracy, ed. S. R. Langhoff (Springer, New York, NY, 1995), pp. 47–108. CrossrefGoogle Scholar
    • 18. J. W. Liu, On the storage requirement in the out-of-core multifrontal method for sparse factorization, ACM Trans. Mathematical Software (TOMS) 12(3) (1986) 249–264. Crossref, Web of ScienceGoogle Scholar
    • 19. J. W. H. Liu, An application of generalized tree pebbling to sparse matrix factorization, SIAM J. Algebraic Discrete Methods 8(3) (1987) 375–395. Crossref, Web of ScienceGoogle Scholar
    • 20. J. W. H. Liu, The role of elimination trees in sparse factorization, SIAM J. Matrix Analysis and Applications 11(1) (1990) 134–172. Crossref, Web of ScienceGoogle Scholar
    • 21. E. Mäkinen, Generating random binary treesa survey, Inform. Sci. 115(1–4) (1999) 123–136. Crossref, Web of ScienceGoogle Scholar
    • 22. J. M. L. Martin, Benchmark Studies on Small Molecules, Encyclopedia of Computational Chemistry, Vol. 1 (Wiley, New York, NY, 1998), pp. 115–128. Google Scholar
    • 23. P. A. Papp and R. Wattenhofer, On the hardness of red-blue pebble games, Proc. 32nd ACM Symp. Parallelism in Algorithms and Architectures (Association for Computing Machinery, 2020), pp. 419–429. CrossrefGoogle Scholar
    • 24. E. Saule, H. M. Aktulga, C. Yang, E. G. Ng and Ü. V. Çatalyürek, An out-of-core task-based middleware for data-intensive scientific computing, Handbook on Data Centers, eds. S. U. Khan and A. Y. Zomaya (Springer, New York, NY, 2015), pp. 647–667. CrossrefGoogle Scholar
    • 25. R. Sethi and J. Ullman, The generation of optimal code for arithmetic expressions, J. ACM 17(4) (1970) 715–728. Crossref, Web of ScienceGoogle Scholar
    • 26. S. Toledo, A survey of out-of-core algorithms in numerical linear algebra, External Memory Algorithms, Proceedings of a DIMACS Workshop (American Mathematical Society, New Brunswick, New Jersey, USA, 1998), pp. 161–180. Google Scholar
    Remember to check out the Most Cited Articles!

    Check out these Handbooks in Computer Science