We present a novel approach to testing fault-tolerant and real-time
protocol implementations. CESIUM, our testing environment, executes the
protocols in a centralized simulator of the distributed system. It simulates
the occurrence of inputs and the failure scenarios the protocols are
designed to tolerate, while automatically verifying that the required safety
and timeliness properties hold at all times during test experiments. Within
this framework, the human tester can define failure operations that simulate
every failure class studied in the literature. We apply our approach to two
fault-tolerant protocols typical in embedded systems. The results show that
CESIUM can pinpoint implementation errors that would be very difficult to
identify in a real system, and can also compute accurate performance
predictions that would be problematic to measure in the real embedded
platform without ad hoc hardware instrumentation.
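
To make the idea of centralized simulation with failure injection and
automatic property checking concrete, the following is a minimal sketch in
Python. It is not CESIUM's implementation or interface: the names Simulator,
run_experiment, and drop_window, the toy heartbeat protocol, and all timing
constants are hypothetical and chosen only for illustration. A single event
queue drives the simulated system, an omission failure drops messages sent
within a chosen time window, and a timeliness property is re-evaluated after
every event.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass(order=True)
class Event:
    time: float
    seq: int  # tie-breaker so simultaneous events keep insertion order
    action: Callable[[], None] = field(compare=False)


class Simulator:
    """Hypothetical centralized discrete-event simulator (illustration only)."""

    def __init__(self) -> None:
        self.now = 0.0
        self._queue: List[Event] = []
        self._seq = 0
        self._checks: List[Tuple[str, Callable[[], bool]]] = []
        self.violations: List[Tuple[float, str]] = []

    def schedule(self, delay: float, action: Callable[[], None]) -> None:
        heapq.heappush(self._queue, Event(self.now + delay, self._seq, action))
        self._seq += 1

    def add_check(self, name: str, predicate: Callable[[], bool]) -> None:
        self._checks.append((name, predicate))

    def run(self, until: float) -> None:
        while self._queue and self._queue[0].time <= until:
            event = heapq.heappop(self._queue)
            self.now = event.time
            event.action()
            # Re-evaluate every registered property after each event.
            for name, predicate in self._checks:
                if not predicate():
                    self.violations.append((self.now, name))


def run_experiment(drop_window: Optional[Tuple[float, float]] = None,
                   horizon: float = 100.0) -> List[Tuple[float, str]]:
    """Toy heartbeat protocol; all constants are assumed values."""
    period, latency, deadline = 10.0, 1.0, 25.0
    sim = Simulator()
    state = {"last_heard": 0.0}

    def deliver() -> None:
        state["last_heard"] = sim.now

    def send() -> None:
        # Failure operation: omit messages sent inside drop_window.
        dropped = (drop_window is not None
                   and drop_window[0] <= sim.now < drop_window[1])
        if not dropped:
            sim.schedule(latency, deliver)
        sim.schedule(period, send)

    # Timeliness property: a heartbeat was received within the deadline.
    sim.add_check("heartbeat_timeliness",
                  lambda: sim.now - state["last_heard"] <= deadline)
    sim.schedule(0.0, send)
    sim.run(until=horizon)
    return sim.violations
```

In this sketch, run_experiment() models a failure-free run, while
run_experiment(drop_window=(30.0, 60.0)) injects an omission failure and
returns the simulated times at which the timeliness property was violated.
Because the simulated clock is controlled by the simulator itself, such
violations are reproducible and observable without instrumenting a real
platform.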