Fault tolerance is an important design criterion for reliable and
robust video-on-demand systems. Conventional fault-tolerant designs
use either a primary-backup or an active-replication method to
provide system fault tolerance. However, these approaches suffer
from low utilization of the backup or
replication system. In this paper, we propose two playback-recovery
schemes for distributed video-on-demand systems: the forward
playback-recovery scheme and the backward playback-recovery scheme.
Unlike conventional fault-tolerant designs, our schemes use existing
playback resources to recover faulty playbacks without allocating new
resources, significantly reducing
recovery overhead. To use the schemes effectively, we developed a
distributed algorithm that determines the order and gap information
between the playbacks on the distributed video-on-demand servers, so
that the overhead of recovering from a server failure is minimized.
This algorithm achieves N - 1
fault-tolerant resiliency for N-server video-on-demand systems. In
addition, we present three server-recovery policies that guide
surviving servers in applying the proper scheme to recover faulty
playbacks, thus reducing overall recovery costs. Simulation results
show that the proposed recovery schemes are effective and useful in
designing fault-tolerant multiple-server video-on-demand systems.