Classification of how processes and channels can fail, used to decide what an algorithm must tolerate. Three top-level classes: omission (didn’t do the thing), arbitrary/Byzantine (did the wrong thing), and timing (did the right thing at the wrong time).
The failures in processes and channels are presented using the following taxonomy:
Omission failures refer to cases where a process or a communication channel fails to perform the actions it is expected to perform.
Arbitrary (Byzantine) failures refer to any type of failure that can occur in a system. The omission and arbitrary failure classes are:
| Class of failure | Affects | Description |
|---|---|---|
| Fail-stop | Process | Process halts and remains halted. Other processes may detect this state |
| Crash | Process | Process halts and remains halted. Other processes may not be able to detect this state |
| Omission | Channel | A message inserted in an outgoing message buffer never arrives at the other end’s incoming message buffer |
| Send-omission | Process | Process attempts to send, but the message is not placed in its outgoing message buffer |
| Receive-omission | Process | Message is placed in the process's incoming message buffer, but the process does not receive it |
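| Arbitrary (Byzantine) | Process or channel | Process or channel exhibits arbitrary behaviour: it may send or transmit arbitrary messages at arbitrary times, commit omissions, or (for a process) stop or take an incorrect step |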
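
The three message-related omission classes differ only in where along the path (sender process, outgoing buffer, incoming buffer, receiver process) the message is lost. The following is a minimal sketch of that path, not a real networking API; the names `Omission` and `transmit` and the stage labels are hypothetical and chosen only for illustration.

```python
from collections import deque
from enum import Enum


class Omission(Enum):
    """Hypothetical labels for the message-related omission classes."""
    NONE = "none"
    SEND = "send-omission"        # message never reaches the outgoing buffer
    CHANNEL = "omission"          # message lost between the two buffers
    RECEIVE = "receive-omission"  # message reaches the incoming buffer only


def transmit(message, fault=Omission.NONE):
    """Move one message along sender -> outgoing -> incoming -> receiver.

    Returns (delivered, last_stage): delivered is the message or None,
    and last_stage names the last place the message was seen.
    """
    outgoing, incoming = deque(), deque()

    # Step 1: the sending process puts the message in its outgoing buffer.
    if fault is Omission.SEND:
        return None, "sender process"
    outgoing.append(message)

    # Step 2: the channel carries it to the receiver's incoming buffer.
    if fault is Omission.CHANNEL:
        return None, "outgoing buffer"
    incoming.append(outgoing.popleft())

    # Step 3: the receiving process takes it from its incoming buffer.
    if fault is Omission.RECEIVE:
        return None, "incoming buffer"
    return incoming.popleft(), "receiver process"
```

For instance, `transmit("m", Omission.RECEIVE)` reports that the message got as far as the incoming buffer but was never delivered, which is what separates a receive-omission from a channel omission.
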
Timing failures occur when the time limits set on process execution time, message delivery time, or clock drift rate are exceeded. They are particularly relevant to synchronous systems and less relevant to asynchronous systems, since the latter usually place no bounds, or less strict bounds, on timing.
| Class of failure | Affects | Description |
|---|---|---|
| Clock | Process | Process’s local clock exceeds the bounds on its rate of drift from real time |
| Performance | Process | Process exceeds the bounds on the interval between two steps |
| Performance | Channel | A message’s transmission takes longer than the stated bound |
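
Each row of the timing table amounts to comparing one measured quantity against a stated bound. A minimal sketch of that check follows; the names `TimingBounds` and `timing_failures`, the field names, and the choice of units are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimingBounds:
    """Hypothetical bounds a synchronous system might declare."""
    max_drift_rate: float     # seconds of clock error per second of real time
    max_step_interval: float  # seconds allowed between two process steps
    max_transmission: float   # seconds allowed for a message in transit


def timing_failures(bounds, drift_rate, step_interval, transmission_time):
    """Return the (class, affects) pairs whose bound is exceeded."""
    failures = []
    # Clock failure: the local clock drifts faster than permitted.
    if abs(drift_rate) > bounds.max_drift_rate:
        failures.append(("Clock", "Process"))
    # Performance failure (process): too long between two steps.
    if step_interval > bounds.max_step_interval:
        failures.append(("Performance", "Process"))
    # Performance failure (channel): transmission exceeds the stated bound.
    if transmission_time > bounds.max_transmission:
        failures.append(("Performance", "Channel"))
    return failures
```

In an asynchronous system no such bounds are declared, so there is nothing for this kind of check to compare against. That is why timing failures matter mainly in synchronous systems.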