A domain partition model approach to the online fault recovery of FPGA-based reconfigurable systems

L. H. Shang, M. Zhou, Y. Hu, E. F. Yang

Research output: Contribution to journal (Article)

3 Citations (Scopus)

Abstract

Field programmable gate arrays (FPGAs) are widely used in reliability-critical systems because they can be reconfigured. However, as device feature sizes shrink and die areas grow, modern FPGAs can be severely affected by errors induced by electromigration and radiation. To improve the reliability of FPGA-based reconfigurable systems, this paper proposes a permanent fault recovery approach based on a domain partition model. In the proposed approach, the FPGA recovers from a fault by reloading a suitable configuration from a pool of alternative configurations with overlaps. The overlaps are represented as a set of vectors in the domain partition model. To further enhance reliability, a procedure is presented in which the set of vectors is heuristically filtered so that the corresponding small overlaps can be merged into larger ones. Experimental results on several benchmark circuits demonstrate the effectiveness of the proposed approach. Compared with previous approaches, the proposed approach increases the mean time to failure (MTTF) by up to 18.87%.
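The recovery step described in the abstract, reloading a configuration from a pool of alternatives, can be illustrated with a toy model. This is only a sketch under stated assumptions and not the paper's domain partition model: the abstract does not say how configurations or overlaps are encoded, so here each configuration is assumed to be a set of hypothetical (row, column) resource coordinates, an overlap is assumed to be the set intersection of two configurations, and the names `overlap`, `select_configuration`, and `pool` are invented for illustration.

```python
from typing import Dict, FrozenSet, Optional, Set, Tuple

# Assumption: a resource is identified by a (row, column) coordinate.
Resource = Tuple[int, int]


def overlap(a: FrozenSet[Resource], b: FrozenSet[Resource]) -> FrozenSet[Resource]:
    """Resources used by both configurations (toy definition of an overlap)."""
    return a & b


def select_configuration(
    pool: Dict[str, FrozenSet[Resource]],
    faulty: Set[Resource],
) -> Optional[str]:
    """Return the first configuration (by name) that avoids every faulty resource.

    Returns None when every configuration in the pool touches a faulty resource.
    """
    for name in sorted(pool):
        if pool[name].isdisjoint(faulty):
            return name
    return None


# Hypothetical pool of three alternative configurations.
pool = {
    "cfg_a": frozenset({(0, 0), (0, 1), (1, 0)}),
    "cfg_b": frozenset({(0, 1), (1, 1), (1, 0)}),
    "cfg_c": frozenset({(2, 0), (2, 1), (1, 1)}),
}
```

In this toy pool, `select_configuration(pool, {(0, 1)})` returns `"cfg_c"`, because the faulty resource lies in the overlap of `cfg_a` and `cfg_b` and therefore rules out both. With faults at both `(0, 1)` and `(1, 1)`, no configuration remains and the function returns `None`.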
Original language: English
Pages (from-to): 290-299
Number of pages: 10
Journal: IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Volume: E94A
Issue number: 1
DOI: 10.1587/transfun.E94.A.290
Publication status: Published - Jan 2011

Fingerprint

Reconfigurable Systems
Field Programmable Gate Array
Field programmable gate arrays (FPGA)
Fault
Recovery
Partition
Overlap
Set of vectors
Electromigration
Configuration
Shrinking
Reconfiguration
Fault-tolerant
Model
Die
Radiation
Benchmark
Networks (circuits)
Alternatives
Experimental Results

Keywords

  • fault-tolerance
  • reconfigurable systems
  • FPGAs
  • fault-recovery
  • reliability
  • domain partition
  • model approach
  • online fault recovery

Cite this

@article{34d6087c77cb4ab99fe417d19507d9cf,
title = "A domain partition model approach to the online fault recovery of FPGA-based reconfigurable systems",
abstract = "Field programmable gate arrays (FPGAs) are widely used in reliability-critical systems due to their reconfiguration ability. However, with the shrinking device feature size and increasing die area, nowadays FPGAs can be deeply affected by the errors induced by electromigration and radiation. To improve the reliability of FPGA-based reconfigurable systems, a permanent fault recovery approach using a domain partition model is proposed in this paper. In the proposed approach, the fault-tolerant FPGA recovery from faults is realized by reloading a proper configuration from a pool of multiple alternative configurations with overlaps. The overlaps are presented as a set of vectors in the domain partition model. To enhance the reliability, a technical procedure is also presented in which the set of vectors are heuristically filtered so that the corresponding small overlaps can be merged into big ones. Experimental results are provided to demonstrate the effectiveness of the proposed approach through applying it to several benchmark circuits. Compared with previous approaches, the proposed approach increased MTTF by up to 18.87{\%}.",
keywords = "fault-tolerance, reconfigurable systems, FPGAs, fault-recovery, reliability, domain partition, model approach, online fault recovery",
author = "Shang, {L. H.} and Zhou, M. and Hu, Y. and Yang, {E. F.}",
year = "2011",
month = "1",
doi = "10.1587/transfun.E94.A.290",
language = "English",
volume = "E94A",
pages = "290--299",
journal = "IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences",
issn = "0916-8508",
number = "1",
}

TY - JOUR

T1 - A domain partition model approach to the online fault recovery of FPGA-based reconfigurable systems

AU - Shang, L. H.

AU - Zhou, M.

AU - Hu, Y.

AU - Yang, E. F.

PY - 2011/1

Y1 - 2011/1

AB - Field programmable gate arrays (FPGAs) are widely used in reliability-critical systems due to their reconfiguration ability. However, with the shrinking device feature size and increasing die area, nowadays FPGAs can be deeply affected by the errors induced by electromigration and radiation. To improve the reliability of FPGA-based reconfigurable systems, a permanent fault recovery approach using a domain partition model is proposed in this paper. In the proposed approach, the fault-tolerant FPGA recovery from faults is realized by reloading a proper configuration from a pool of multiple alternative configurations with overlaps. The overlaps are presented as a set of vectors in the domain partition model. To enhance the reliability, a technical procedure is also presented in which the set of vectors are heuristically filtered so that the corresponding small overlaps can be merged into big ones. Experimental results are provided to demonstrate the effectiveness of the proposed approach through applying it to several benchmark circuits. Compared with previous approaches, the proposed approach increased MTTF by up to 18.87%.

KW - fault-tolerance

KW - reconfigurable systems

KW - FPGAs

KW - fault-recovery

KW - reliability

KW - domain partition

KW - model approach

KW - online fault recovery

UR - http://www.scopus.com/inward/record.url?scp=78650961387&partnerID=8YFLogxK

U2 - 10.1587/transfun.E94.A.290

DO - 10.1587/transfun.E94.A.290

M3 - Article

VL - E94A

SP - 290

EP - 299

JO - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

JF - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

SN - 0916-8508

IS - 1

ER -