BANERJEE-DISSERTATION-2017.pdf (2.26 MB)

Cookie-Cutter: Achieving Defect/Fault Tolerance For Large-Scale Systems with Highly Unreliable Components

thesis
posted on 08.02.2018 by Soumya Banerjee
This work proposes generalized “cookie-cutter” defect- and fault-tolerance approaches for nano-scale systems. The systems under consideration include Parallel Prefix Adders (PPAs) and large-scale Many-Processor Systems. First, we show a systematic approach for designing defect-tolerant PPAs. It not only allows designers to select which adder to use, but also gives them the freedom to select the reliability-hardware trade-off point appropriate for the design. In addition, using the same systematic approach, we show how highly customizable Sparse PPAs can be designed. For the design of fault-tolerant Many-Processor Systems, we propose a novel 2-layered Router-Processing Element (Router-PE) model, which supports the repair of PE faults through a “chain of replacements”: the faulty PE is replaced by a PE in its neighborhood, which is in turn replaced by another PE nearby. This reconfiguration continues until a spare is reached. We show that such a repair methodology, combined with the model, provides a systematic design approach for Many-Processor Systems that facilitates simple, lightweight, on-the-fly repairs. The physical implementation of such a system does not require significant overhead in long interconnects to deliver reasonable reliability.
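The “chain of replacements” idea can be illustrated with a minimal sketch. This is not the thesis's Router-PE model: it assumes a hypothetical one-dimensional row of PEs in which each role simply shifts to the next healthy PE on the right until a spare absorbs the chain. The names `repair`, `SPARE`, and `FAULTY` are illustrative only.

```python
SPARE = "spare"    # hypothetical marker: healthy PE with no role assigned
FAULTY = "faulty"  # hypothetical marker: PE that has failed


def repair(roles, faulty):
    """Return a new role assignment after the PE at index `faulty` fails.

    roles[i] is the logical role hosted by physical PE i, or SPARE / FAULTY.
    Assumption: replacements only move rightward along a 1-D row.
    """
    roles = list(roles)
    if roles[faulty] in (SPARE, FAULTY):
        # No role lives here, so nothing needs to be relocated.
        roles[faulty] = FAULTY
        return roles

    # Build the chain: the failed PE, then each healthy neighbor in turn,
    # ending at the first spare encountered.
    chain = [faulty]
    for i in range(faulty + 1, len(roles)):
        if roles[i] == FAULTY:
            continue  # skip PEs that failed earlier
        chain.append(i)
        if roles[i] == SPARE:
            break
    else:
        raise RuntimeError("no spare reachable from the faulty PE")

    # Shift roles along the chain, starting from the spare end so that
    # no role is overwritten before it has been moved.
    for dst, src in zip(reversed(chain), reversed(chain[:-1])):
        roles[dst] = roles[src]
    roles[faulty] = FAULTY
    return roles


if __name__ == "__main__":
    print(repair(["A", "B", "C", SPARE], 1))
```

Each PE in the chain only takes over from its immediate healthy neighbor, which is why a repair of this kind needs only short, local connections rather than a direct link from every PE to every spare.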

History

Advisor

Rao, Wenjing

Chair

Rao, Wenjing

Department

Electrical and Computer Engineering

Degree Grantor

University of Illinois at Chicago

Degree Level

Doctoral

Committee Member

Zefran, Milos; Zhu, Zhichun; Trivedi, Amit; Lillis, John

Submitted date

December 2017

Issue date

08/09/2017
