OPTIMIZING DATA DISTRIBUTION IN DESKTOP GRID PLATFORMS
Abstract
Current infrastructures for Volunteer Computing follow a centralized architecture for data distribution, creating a potential bottleneck when tasks require large input files or the central server has limited bandwidth. In this paper we propose two new data models for the Berkeley Open Infrastructure for Network Computing (BOINC): one based on the popular BitTorrent protocol and one based on a Content Delivery Network. While the latter remains at a theoretical level, for the former we developed a prototype that adds BitTorrent functionality for task distribution and conducted medium-scale tests of the environment. Our preliminary results indicate that the BitTorrent client had a negligible influence on the BOINC client's computation time. The BOINC server, however, showed unexpectedly low bandwidth output when seeding the file, as well as spikes in CPU usage. This paper discusses the tests that were performed and how they were evaluated, as well as some improvements that could be made in future research on both approaches.


