Computing Reviews
(N,K) concept fault tolerance
Krol T. IEEE Transactions on Computers 35(4): 339-350, 1986. Type: Article
Date Reviewed: Nov 1 1987

In this paper the author presents a new redundancy scheme, called (N,K) concept fault tolerance, which extends the well-known N-modular redundancy (NMR) scheme. The (N,K) concept is based on an (N,K) symbol-error-correcting code composed of K information symbols and N − K check symbols. When NMR is applied at the system level, the processor, the memory, and the bus are each replicated N-fold, and errors on the bus are corrected by majority voting. The (N,K) concept also uses N processors, N memory modules, and N buses, but the data words in memory and on the bus are K times smaller than those in the processors. Thus, in memory and on the bus, K out of N data words are used to carry information. In other words, an (N,1) symbol code is used in the processors and an (N,K) symbol code is used in memory and on the bus. Compared with NMR, the hardware cost of memory and bus in the (N,K) concept is K times smaller, although the scheme requires, as additional hardware, N different encoders that collectively transform the (N,1) code into an (N,K) code. Despite this large reduction of redundancy in memory and bus, an (N,K) concept fault-tolerant system still achieves a reliability improvement comparable to that of (N − K + 1)-modular redundancy.
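To make the comparison concrete, the following is a minimal sketch of majority voting as used in NMR, together with the storage arithmetic implied by the description above. The function names are illustrative and do not come from the paper, and the overhead figure assumes only what the review states: N modules, each holding words 1/K the size of a processor word.

```python
from collections import Counter
from fractions import Fraction


def majority_vote(copies):
    """Return the value held by a strict majority of the copies, else None."""
    value, count = Counter(copies).most_common(1)[0]
    return value if 2 * count > len(copies) else None


def storage_overhead(n, k):
    """Stored bits per information bit: N modules, each 1/K of a full word.

    Plain NMR is the special case k == 1, which gives an overhead of N.
    """
    return Fraction(n, k)


def comparable_nmr_order(n, k):
    """Order of the NMR scheme with comparable reliability, per the review."""
    return n - k + 1


if __name__ == "__main__":
    # One of three replicated words is corrupted; voting recovers the value.
    print(majority_vote([0x5A, 0x5A, 0x1F]))
    # A (4,2) system versus the 3-modular scheme it is compared with.
    print(storage_overhead(4, 2), storage_overhead(comparable_nmr_order(4, 2), 1))
```

Under these assumptions, a (4,2) system stores two bits per information bit, while the 3-modular redundancy it is compared with stores three; the price is a fourth processor and the N encoders.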

The paper also presents an efficient (4,2) symbol-error-correcting code that additionally corrects double bit errors. This eliminates the need for a separate bit-error-correcting code to protect the memory, which is more error prone. The author further shows how an (N,K) concept redundancy system can tolerate errors from external sources.
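The review does not reproduce the paper's (4,2) code, so the sketch below is not Krol's construction. It is a generic (4,2) code over 4-bit symbols, chosen here only to illustrate what correcting a single symbol error means when each of four modules holds one symbol. The field polynomial, the generator, and the brute-force decoder are all assumptions of this example, and it makes no claim about double bit-error correction.

```python
from itertools import product


def gf16_mul(a, b):
    """Multiply two 4-bit symbols in GF(16), reducing by x^4 + x + 1."""
    result = 0
    for _ in range(4):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= 0x13
    return result


def encode(a, b):
    """Map 2 information symbols to 4 code symbols, one per module."""
    return (a, b, a ^ b, a ^ gf16_mul(2, b))


# All 256 codewords, mapped back to their information symbols.
CODEBOOK = {encode(a, b): (a, b) for a, b in product(range(16), repeat=2)}


def decode(received):
    """Return the information symbols of the nearest codeword (brute force)."""
    def distance(codeword):
        return sum(x != y for x, y in zip(codeword, received))
    return CODEBOOK[min(CODEBOOK, key=distance)]


if __name__ == "__main__":
    word = list(encode(0xA, 0x3))
    word[2] ^= 0x7                      # module 2 delivers a wrong symbol
    print(decode(tuple(word)) == (0xA, 0x3))
```

Any two codewords of this illustrative code differ in at least three of the four symbols, so a word with one faulty symbol remains closer to the original codeword than to any other. A practical decoder would use syndromes rather than searching all 256 codewords.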

The (N,K) concept presented in this paper offers an alternative to NMR in the design of fault-tolerant systems. Which approach is better depends on the relative costs of processors, memory modules, and buses. The paper covers both practical and theoretical aspects of fault-tolerant systems. It is relatively hard to read owing to its broad coverage and its use of uncommon expressions. However, the idea is interesting and useful.

Reviewer: H. Y. H. Chuang
Review #: CR110631
Error Handling And Recovery (D.2.5 ... )
Redundant Design (B.3.4 ... )
Redundant Design (B.5.3 ... )
Reliability, Availability, And Serviceability (C.4 ... )
Coding Tools and Techniques (D.2.3 )
Special-Purpose And Application-Based Systems (C.3 )