posted on 2019-02-01, 00:00 authored by Michael J Lewis
Many RDF data systems can query different types of connected data structures across a scalable range of input sizes. Partitioning techniques, graph algorithms, and memory-based indexing schemes have been heavily researched and integrated into these systems in order to produce faster query results as data sizes grow and query types vary. This work focuses on two types of powerful clustered systems (top-tier in aggregate processing capacity and bandwidth capacity) to show conditional, definable time improvements covering dataset preprocessing and query retrieval. Two different algorithmic approaches are used to evaluate query retrieval. The first uses a distributed linked-data path indexing system to help answer queries. The second is graph exploration, which finds the linked data at query time according to the connected query patterns. Graph exploration is a common and effective approach used by a number of large-scale proprietary RDF systems. To implement and evaluate both approaches, a system called Mantona was developed. By generating a preprocessed cache-file, Mantona also makes it possible to evaluate performance based on the contents of the cache-file and the type of query retrieval algorithm used. This dissertation includes a review of effective RDF query systems and shows the implementation and ramifications of creating a cache-file dataset, from which the Mantona experiments are conducted over varied processor sizes and query types.
Advisor
Johnson, Andrew
Chair
Johnson, Andrew
Department
Computer Science
Degree Grantor
University of Illinois at Chicago
Degree Level
Doctoral
Committee Member
Buy, Ugo
Kshemkalyani, Ajay
Leigh, Jason
Vishwanath, Venkatram