University of Illinois Chicago

Leveraging Succinct Data Structures for the Burrows-Wheeler Mapping of Short Sequence Reads on FPGA

thesis
posted on 2019-08-01, 00:00 authored by Guido Walter Di Donato
The advent of Next Generation Sequencing (NGS) produced an explosion in the amount of genomic data generated, which led to the birth and early development of personalized medicine. To boost research in this field, new bioinformatic tools are needed that can keep pace with NGS technologies. In this scenario, the aim of this thesis is the design and implementation of an efficient, easy-to-use short sequence mapper for use in various bioinformatic applications. At the core of the proposed tool is an efficient implementation of a succinct data structure, which compresses the genomic data while still supporting efficient queries on them. This work presents a comprehensive description of the data encoding scheme, together with a characterization of the proposed data structure in terms of memory utilization and execution time. The resulting sequence mapper is made available through an intuitive web application that offers high usability and a good user experience. Moreover, this thesis presents the design of an easily accessible hybrid sequence aligner that leverages the compression capability of the proposed data structure to fully exploit the highly parallel architecture of FPGAs. The presented software is validated in order to test the reliability of the results it produces. Finally, some considerations about future developments of this project are proposed.
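As a rough illustration of the kind of query a Burrows-Wheeler mapper answers, the sketch below counts exact occurrences of a short read in a reference using backward search over the Burrows-Wheeler transform. It is not the thesis's implementation: the function names (`bwt`, `build_fm_index`, `count_occurrences`) and the toy reference are hypothetical, and the occurrence table is stored uncompressed, whereas a succinct data structure would replace it with a compact rank-supporting encoding.

```python
from collections import Counter


def bwt(text):
    """Burrows-Wheeler transform via sorted rotations.

    Assumes `text` ends with a unique sentinel '$' that sorts before
    every other symbol. Quadratic in memory; fine for a toy example only.
    """
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rot[-1] for rot in rotations)


def build_fm_index(bwt_str):
    """Build the two tables used by backward search.

    c_table[ch]: number of symbols in the text strictly smaller than ch.
    occ[ch][i]:  number of occurrences of ch in bwt_str[:i].
    The occ table is kept uncompressed here (an illustrative assumption).
    """
    counts = Counter(bwt_str)
    c_table = {}
    total = 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]

    occ = {ch: [0] * (len(bwt_str) + 1) for ch in counts}
    for i, ch in enumerate(bwt_str):
        for sym in occ:
            occ[sym][i + 1] = occ[sym][i] + (1 if sym == ch else 0)
    return c_table, occ


def count_occurrences(pattern, c_table, occ, n):
    """Count exact matches of `pattern` by backward search.

    Maintains a half-open interval [lo, hi) of sorted-rotation ranks whose
    rotations start with the suffix of `pattern` processed so far.
    """
    lo, hi = 0, n
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


# Toy reference (hypothetical data), terminated by the sentinel.
reference = "ACGTACGTTA$"
transformed = bwt(reference)
c_table, occ = build_fm_index(transformed)
print(count_occurrences("ACG", c_table, occ, len(transformed)))
```

Each step of the search costs two rank lookups regardless of the reference length, which is why the memory layout of the rank structure largely determines how much of the index fits in fast on-chip or on-board memory.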

History

Advisor

Berger-Wolf, Tanya

Chair

Berger-Wolf, Tanya

Department

Bioengineering

Degree Grantor

University of Illinois at Chicago

Degree Level

  • Masters

Degree name

MS, Master of Science

Committee Member

DasGupta, Bhaskar; Santambrogio, Marco D.

Submitted date

August 2019

Thesis type

application/pdf

Language

  • en

Issue date

2019-08-23
