Show simple item record

dc.contributor.advisor | F.Cruz, Isabel | en_US
dc.contributor.author | Mirrezaei, Seyed Iman | en_US
dc.date.accessioned | 2018-02-08T21:31:40Z
dc.date.available | 2018-02-08T21:31:40Z
dc.date.created | 2017-12 | en_US
dc.date.issued | 2017-09-05 | en_US
dc.date.submitted | December 2017 | en_US
dc.identifier.uri | http://hdl.handle.net/10027/22249
dc.description.abstract | Discovering knowledge from textual sources and subsequently expanding the coverage of knowledge bases like DBpedia or Google’s Knowledge Graph currently requires either extensive manual work or carefully designed open information extractors. An open information extractor (OIE) captures triples from textual resources. Each triple consists of a subject, a predicate/property, and an object. Triples can be mediated via verbs, nouns, adjectives, or appositions. The research that we conducted in the area of OIE resulted in the development of two OIE systems, TRIPLEX and TRIPLEX-ST. We focus on further advancing OIE methods to support the expansion of spatio-temporal information in knowledge bases. TRIPLEX extracts triples from grammatical dependency relations involving noun phrases and modifiers that correspond to adjectives and appositions. TRIPLEX constructs templates that express noun-mediated triples during its automatic bootstrapping process, which finds sentences that express noun-mediated triples by leveraging Wikipedia. The templates express how noun-mediated triples occur in sentences and include rich linguistic annotations. Finally, the templates can be used to extract triples from previously unseen text. TRIPLEX-ST is a novel information extraction system that can capture spatio-temporal information from text. It extends current open-domain information extraction (OIE) systems in several dimensions, including the ability to extract facts associated with spatio-temporal contexts (i.e., spatio-temporal information that constrains the facts). The system uses Wikipedia sentences and triples in existing knowledge bases, such as YAGO, to automatically infer templates during a bootstrapping process. These templates include rich linguistic annotations, and they can be used to extract both facts associated with spatio-temporal contexts and spatio-temporal facts from previously unseen sentences.
TRIPLEX-ST also includes syntax-based sentence simplification methods, which contribute to improving extraction effectiveness. Our experiments show that TRIPLEX-ST outperforms a state-of-the-art OIE system on the extraction of spatio-temporal facts. We also show that our approach can accurately extract useful new information, in the form of triples connected to spatio-temporal contexts, using a large Wikipedia dataset. | en_US
dc.format.mimetype | application/pdf | en_US
dc.subject | Spatio-temporal text analysis | en_US
dc.subject | open information extraction | en_US
dc.subject | distantly supervised information extraction | en_US
dc.subject | text mining | en_US
dc.title | Advancing Open Information Extraction Methods to Enrich Knowledge Bases | en_US
dc.type | Thesis | en_US
thesis.degree.department | Computer Science | en_US
thesis.degree.grantor | University of Illinois at Chicago | en_US
thesis.degree.level | Doctoral | en_US
thesis.degree.name | PhD, Doctor of Philosophy | en_US
dc.contributor.committeeMember | Di Eugenio, Barbara | en_US
dc.contributor.committeeMember | Liu, Bing | en_US
dc.contributor.committeeMember | Ziebart, Brian | en_US
dc.contributor.committeeMember | Martins, Bruno | en_US
dc.type.material | text | en_US
dc.contributor.chair | F.Cruz, Isabel | en_US

