Parallel-Based Techniques for Managing and Analyzing the Performance on Semantic Graph
Abstract
In recent years, the rapid growth of generated data has driven the evolution of linked data. Modern data are globally distributed across semantically linked graphs. This distribution creates new demands for investigating and improving performance on semantic graphs. In this work, we analyze time latency as an important factor to be investigated and improved. We evaluate parallel computing on these distributed data in order to make better use of parallelism. We introduce a federation framework, based on a multi-threaded environment, that supports federated SPARQL queries. In our experiments, we show the feasibility and effectiveness of our model on a set of real-world queries over the real-world Linked Open Data cloud. A significant performance improvement was observed. Further, we highlight shortcomings that could open avenues for research on federated queries.
Keywords: Semantic web; distributed query processing; query federation; linked data; join methods.
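The multi-threaded federation idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' framework: the endpoint names, the sample bindings, and the helper functions (run_subquery, hash_join, federated_query) are all hypothetical, and in-memory lists stand in for remote SPARQL endpoints so that the example runs without network access. It only shows the general pattern of dispatching one sub-query per endpoint on its own thread and then joining the partial results on a shared variable.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for remote SPARQL endpoints (assumption:
# a real system would send sub-queries over HTTP instead).
ENDPOINTS = {
    "endpoint_a": [
        {"drug": "d1", "name": "Aspirin"},
        {"drug": "d2", "name": "Ibuprofen"},
    ],
    "endpoint_b": [
        {"drug": "d1", "target": "COX-1"},
        {"drug": "d3", "target": "ACE"},
    ],
}


def run_subquery(endpoint):
    """Simulate sending one sub-query to one endpoint; return its bindings."""
    return list(ENDPOINTS[endpoint])


def hash_join(left, right, key):
    """Join two lists of bindings on a shared variable using a hash index."""
    index = {}
    for row in left:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in right:
        for match in index.get(row[key], []):
            joined.append({**match, **row})
    return joined


def federated_query(endpoints, key):
    """Run one sub-query per endpoint in parallel, then join the results."""
    with ThreadPoolExecutor(max_workers=len(endpoints)) as pool:
        # pool.map returns results in the same order as the input endpoints.
        results = list(pool.map(run_subquery, endpoints))
    joined = results[0]
    for rows in results[1:]:
        joined = hash_join(joined, rows, key)
    return joined


result = federated_query(["endpoint_a", "endpoint_b"], "drug")
print(result)
```

With real endpoints, the time spent waiting on each remote response would overlap across threads, which is where a latency reduction of the kind the abstract reports could come from. The join strategy used here (a simple hash join) is one choice among the join methods named in the keywords.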


