Deep Inside the Tables: Semantic Expressiveness of Semi-Structured Data

posted on 05.11.2016, 00:00 authored by Venkat Raghavan Ganesh Sekar
The web contains a huge amount of semi-structured data in the form of tables and spreadsheets that are pertinent to statistical data analysis and visualization. Manually processing this tabular data is tedious because of its heterogeneity in structure, concepts, and metadata. Further, much of the information in these tables lacks explicit metadata, which makes it difficult to understand the table semantics; understanding those semantics is critical for processing the data automatically and for supporting data integration. In this thesis, we (a) study semi-structured (tabular) data on the web in depth; (b) discuss the complexities of processing it; (c) propose automatic methods to abstract its semantics by annotating various features inside the tables; and (d) introduce algorithms to construct a semantic graph by resolving different levels of heterogeneity. We evaluate our approach on a set of highly complex tables retrieved from different domains, and we discuss the impact of our work in practical scenarios and in the field of the Semantic Web.



Cruz, Isabel F.


Computer Science

Degree Grantor

University of Illinois at Chicago

Committee Member

Ziebart, Brian; Palmonari, Matteo
