Posted on 2020-08-01, 00:00, authored by Booma Sowkarthiga Balasubramani
Geospatial data integration involves combining two or more geospatial datasets to facilitate analysis, reasoning, querying, and data visualization. The availability of data is not in itself sufficient for data integration to reach its full potential. The primary reasons are that data come from disparate sources and are fundamentally heterogeneous. Prominent challenges in integrating geospatial data include differences in format, representation, context, structure, events, data models, spatio-temporal resolution, data collection and storage techniques, and the relationships between various system properties in a given region. Also, as with any real-world data, geospatial data are dirty. That is, they are often erroneous, incomplete, and inconsistent, leading to uncertainty. All these factors affect a data integration system, resulting in imprecise results when the data are analyzed. Therefore, in designing a reliable data integration system, one has to address the problems of dealing with dirty data, as well as the heterogeneity and uncertainty that come with the data. This dissertation aims at developing geospatial data integration techniques that address several types of heterogeneity in data. First, we introduce a context-aware pre-processing technique that incorporates domain knowledge to resolve errors and inconsistencies in geospatial data prior to data integration. Second, we describe Semantic Web-based techniques that use ontologies to handle the integration of geospatial datasets that: (i) exhibit different forms of heterogeneity; (ii) are of different data formats and/or of different categories; and (iii) have different spatio-temporal resolutions. In addition, we present a tree-based spatial data structure to represent and index heterogeneous geospatial data, along with an algorithm for efficient processing of spatial queries that allows for some (bounded) uncertainty in the query results.
This dissertation describes the background, challenges, and approaches in integrating geospatial data in a seamless manner, and highlights the results of experiments covering a wide range of real-world scenarios.
Advisor
Cruz, Isabel F
Chair
Cruz, Isabel F
Department
Computer Science
Degree Grantor
University of Illinois at Chicago
Degree Level
Doctoral
Degree Name
PhD, Doctor of Philosophy
Committee Member
Sistla, Prasad A
DasGupta, Bhaskar
Derrible, Sybil
Trajcevski, Goce