Space Transportation System Clause Samples

Space Transportation System. NASA’s space shuttle has experienced several agreement failures caused by incorrect handling of Byzantine faults between its Multiplexer Demultiplexer (MDM) units and its General Purpose Computers (GPCs). These faults fall within the class that the shuttle developers called “non-universal I/O error”. The MDMs act as remote I/O concentrators for the GPCs. Data from the MDMs are transferred to the GPCs over data busses that are similar to MIL-STD-1553. The GPCs execute redundancy-management algorithms that include a Fault Detection, Isolation, and Recovery (FDIR) function with specific handling for the “non-universal I/O error” class of failure. However, these FDIR algorithms were not correctly designed to handle Byzantine faults. With four GPCs, the shuttle had sufficient redundancy to tolerate a Byzantine fault, had these FDIR algorithms been designed correctly.

In one of the earliest examples (some 25 years ago), this failure was triggered by a technician putting incorrect terminating resistors on the end of a data bus. Because the resistance of the terminating resistors did not match the characteristic impedance of the data bus, signals on the data bus were reflected off the resistors. These reflections caused a standing wave on the data bus. Two of the four GPCs happened to be connected to the data bus at nodes of the standing wave, and the other two GPCs were connected at anti-nodes of the standing wave (see Figure 8). Because of this, two of the GPCs disagreed with the other two GPCs. It was lucky that this irreconcilable 2:2 disagreement happened in the lab.

A more recent example of this problem came closer to causing a disaster. At 12:12 GMT on 13 May 2008, a space shuttle was loading its hypergolic fuel for mission Space Transportation System (STS)-124 when a 3-1 disagreement occurred among its GPCs (GPC 4 disagreed with the other GPCs). Three seconds later, the split became 2-1-1 (GPC 2 now disagreed with GPC 4 and the other two GPCs). This required that the launch be stopped. During the subsequent troubleshooting, the remaining two GPCs disagreed (1-1-1-1 split). See the reports given in [20] and [21]. This was a complete system disagreement. However, none of the GPCs was faulty. The fault was in the FA2 2 MDM. This fault was a crack in a diode. The photomicrographs in Figure 9 show two views of this diode, rotated 90 degrees. The dark wavy line pointed to by the red arrows is the cra...
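
The split notation used above (3-1, 2-1-1, 1-1-1-1, 2:2) can be illustrated with a minimal Python sketch. This is not the shuttle's FDIR logic; the function names, the GPC labels, and the letter values standing in for what each GPC believes it received are all hypothetical. The sketch assumes the classic bound for Byzantine agreement, n >= 3f + 1, which is consistent with the statement that four GPCs are enough to tolerate one Byzantine fault.

```python
from collections import Counter


def vote_split(readings):
    """Return the sizes of the agreeing groups, largest first.

    `readings` maps a (hypothetical) GPC label to the value that GPC
    believes it received from the MDM.
    """
    return sorted(Counter(readings.values()).values(), reverse=True)


def has_majority(readings):
    """True if one group of agreeing GPCs is a strict majority."""
    return vote_split(readings)[0] > len(readings) / 2


def max_byzantine_faults(n):
    """Largest f satisfying n >= 3f + 1 (classic Byzantine agreement bound)."""
    return (n - 1) // 3


# Illustrative values only: the letters stand for differing views of the same input.
split_3_1 = {"GPC1": "A", "GPC2": "A", "GPC3": "A", "GPC4": "B"}
split_2_1_1 = {"GPC1": "A", "GPC2": "C", "GPC3": "A", "GPC4": "B"}
split_1_1_1_1 = {"GPC1": "A", "GPC2": "C", "GPC3": "D", "GPC4": "B"}
split_2_2 = {"GPC1": "A", "GPC2": "A", "GPC3": "B", "GPC4": "B"}
```

In this sketch only the 3-1 case leaves a strict majority that a simple vote could follow. The 2-1-1, 1-1-1-1, and 2:2 cases leave no majority, which is why a single faulty source that presents different values to different receivers can defeat simple voting even though no GPC is itself faulty.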