University of Illinois Chicago

Deep Inside the Tables: Semantic Expressiveness of Semi-Structured Data

thesis
Posted on 2016-11-05, 00:00; authored by Venkat Raghavan Ganesh Sekar
The web contains a huge amount of semi-structured data in the form of tables and spreadsheets that are pertinent to various kinds of statistical data analysis or visualization. Manual processing of this tabular data is tedious because of its heterogeneity in structure, concept, and metadata. Further, much of the information in these tables lacks explicit metadata, which makes it difficult to understand the table semantics; that understanding is critical for processing the data automatically and for supporting data integration. In this thesis, we (a) study semi-structured (tabular) data on the web in depth; (b) discuss the complexities of processing it; (c) propose automatic methods to abstract its semantics by annotating various features inside the tables; and (d) introduce algorithms to construct a semantic graph by resolving different levels of heterogeneity. We evaluate our approach on a set of highly complex tables retrieved from different domains and discuss the impact of our work in practical scenarios and in the field of the Semantic Web.

History

Advisor

Cruz, Isabel F.

Department

Computer Science

Degree Grantor

University of Illinois at Chicago

Degree Level

  • Masters

Committee Member

Ziebart, Brian; Palmonari, Matteo

Submitted date

2014-08

Language

  • en

Issue date

2014-10-28
