
Deep Inside the Tables: Semantic Expressiveness of Semi-Structured Data

thesis
posted on 05.11.2016 by Venkat Raghavan Ganesh Sekar
The web contains a huge amount of semi-structured data in the form of tables and spreadsheets that is pertinent to statistical data analysis and visualization. Manually processing this tabular data is tedious because of its heterogeneity in structure, concepts, and metadata. Further, much of the information in these tables lacks explicit metadata, which makes it difficult to understand the table semantics; that understanding is critical for processing the data automatically and for supporting data integration. In this thesis, we (a) study semi-structured (tabular) data on the web in depth; (b) discuss the complexities of processing it; (c) propose automatic methods to abstract its semantics by annotating various features inside the tables; and (d) introduce algorithms to construct a semantic graph by resolving different levels of heterogeneity. We evaluate our approach on a set of highly complex tables retrieved from different domains and discuss the impact of our work in practical scenarios and in the field of the Semantic Web.

History

Advisor

Cruz, Isabel F.

Department

Computer Science

Degree Grantor

University of Illinois at Chicago

Degree Level

Masters

Committee Member

Ziebart, Brian; Palmonari, Matteo

Submitted date

2014-08

Language

en

Issue date

2014-10-28
