PROPRIETARY MATERIAL. © 2007 The McGraw-Hill Companies, Inc. All rights reserved. No part of this PowerPoint slide may be displayed, reproduced or distributed in any form or by any means, without the prior written permission of the publisher, or used beyond the limited distribution to teachers and educators permitted by McGraw-Hill for their individual course preparation. If you are a student using this PowerPoint slide, you are using it without permission.
Chapter 18: Recovery and Fault Tolerance
Dhamdhere, Operating Systems: A Concept-Based Approach, 2nd ed.
Slide No: 1 Copyright © 2008
Introduction
• A fault may have several consequences
– It may damage the state of some data or processes, leading to
* Malfunctioning of a server
* Non-availability of resources and services
* Disruption of system operation
– Its consequences can be avoided using three approaches
* Recovery
Some data or processes are rolled back to previous states
* Fault tolerance
Provides uninterrupted operation of a system despite faults
* Resiliency
Reduces the cost of re-execution following a fault
– Recovery is the generic term for all three approaches
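The rollback idea behind recovery can be sketched in a few lines of Python. This is only an illustration, not the book's mechanism: the class `CheckpointedState`, its methods, and the bank-account data are all hypothetical names chosen for the example, and the state is assumed to fit in memory.

```python
import copy


class CheckpointedState:
    """Hypothetical helper: a working state plus its most recent checkpoint."""

    def __init__(self, initial):
        self.state = copy.deepcopy(initial)
        self._checkpoint = copy.deepcopy(initial)

    def checkpoint(self):
        # Record the current state as the one to return to after a fault.
        self._checkpoint = copy.deepcopy(self.state)

    def rollback(self):
        # Recovery: discard the possibly damaged state, restore the saved one.
        self.state = copy.deepcopy(self._checkpoint)


accounts = CheckpointedState({"A": 100, "B": 50})

# A complete transfer of 30 from A to B, followed by a checkpoint.
accounts.state["A"] -= 30
accounts.state["B"] += 30
accounts.checkpoint()

# A second transfer is interrupted by a fault after A has been debited
# but before B is credited, so the state is now inconsistent.
accounts.state["A"] -= 20

accounts.rollback()
print(accounts.state)  # {'A': 70, 'B': 80}
```

The interrupted transfer is lost and would have to be re-executed. Taking checkpoints more often would reduce how much work is repeated, at the price of more copying.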
Faults and failures
• A fault damages the state of a system
– We say it causes an error in the state of the system
– The error leads to unexpected behavior, which we call a failure
– Recovery restores the system to an error-free state
Recovery after a fault
• System operation is initiated in state S0 at time 0
• A fault occurs at time t1; a failure is detected at time ti, when the state is Si’
• Recovery puts the system into a new state Snew, which is error-free
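The timeline above can be played out in a small simulation. The sketch below rests on several assumptions that are not in the slides: the state is a dictionary with an invariant `total == 10 * count`, the fault overwrites `total`, errors are detected by checking the invariant every `check_every` steps, and Snew is taken to be the last state that passed the check. The function name `run` and its parameters are hypothetical.

```python
def run(steps, fault_at, check_every):
    """Return (detection time ti, damaged state Si', recovered state Snew)."""
    state = {"count": 0, "total": 0}   # S0 at time 0
    good = dict(state)                 # last state known to be error-free

    for t in range(1, steps + 1):
        state["count"] += 1
        state["total"] += 10

        if t == fault_at:
            # The fault at time t1 damages the state (an error).
            state["total"] = -1

        if t % check_every == 0:
            if state["total"] != 10 * state["count"]:
                # The failure is detected at time ti, in state Si'.
                # Recovery returns the last error-free state as Snew.
                return t, dict(state), dict(good)
            good = dict(state)

    return None, dict(state), dict(state)


detected_at, damaged, recovered = run(steps=10, fault_at=3, check_every=2)
print(detected_at)  # 4
print(damaged)      # {'count': 4, 'total': 9}
print(recovered)    # {'count': 2, 'total': 20}
```

Here the fault occurs at t1 = 3 but goes unnoticed until the next check at ti = 4, which is why the detected state Si’ differs from the state at the moment of the fault.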
Classes of faults
• A fault model describes properties of a fault
– A system fault is a system crash caused by a power outage or a component fault
* Amnesia fault
The system loses its state completely
* Fail-stop fault
The system stops operating when a fault occurs
This property permits an error in system state to be corrected
* Byzantine fault
A process suffering this fault behaves maliciously
* Storage fault
Bad block on a storage medium
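The practical difference between a fail-stop fault and a Byzantine fault can be seen in a toy contrast. Both classes below are hypothetical, and the Byzantine one is heavily simplified: real Byzantine behavior may be arbitrary, whereas this sketch only returns a wrong reply.

```python
class FailStopCounter:
    """Hypothetical component that halts as soon as a fault occurs."""

    def __init__(self):
        self.value = 0
        self.halted = False

    def inject_fault(self):
        # Fail-stop: stop operating rather than continue in a damaged state.
        self.halted = True

    def increment(self):
        if self.halted:
            raise RuntimeError("component has stopped")
        self.value += 1
        return self.value


class ByzantineCounter:
    """Hypothetical component that keeps replying, incorrectly, after a fault."""

    def __init__(self):
        self.value = 0
        self.faulty = False

    def inject_fault(self):
        self.faulty = True

    def increment(self):
        self.value += 1
        # After the fault, the reply no longer reflects the real state.
        return -self.value if self.faulty else self.value


fs = FailStopCounter()
fs.increment()
fs.inject_fault()
try:
    fs.increment()
    stopped = False
except RuntimeError:
    stopped = True

bz = ByzantineCounter()
bz.increment()
bz.inject_fault()
reply = bz.increment()

print(stopped, fs.value)  # True 1
print(reply, bz.value)    # -2 2
```

Because the fail-stop component refuses to run, its last value stays intact and can be inspected and corrected. The Byzantine component gives no such signal, so its callers cannot tell from the reply alone that anything is wrong.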
...