Motivated by the needs of on-line optimization of real-world engineering systems, we study single sample path-based algorithms for Markov decision problems (MDPs). The sample path used in the algorithms can be obtained by observing the operation of a real system. We give a simple example to illustrate the advantages of the sample path-based approach over the traditional computation-based approach: matrix inversion is not required; some transition probabilities do not have to be known; storage space may be saved; and the actions can be updated for only a subset of the state space in each iteration. We study the effect of estimation errors and the convergence properties of the sample path-based approach. Finally, we propose a fast algorithm that updates the policy whenever the system reaches a particular set of states, and we prove that, under some conditions, the algorithm converges to the true optimal policy with probability one. The sample path-based approach may have important applications to the design and management of engineering systems, such as high-speed communication networks.
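To make the idea concrete, the following is a minimal illustrative sketch, in Python, of one sample path-based policy iteration step; it is not the algorithm proposed in the paper. It shows how performance potentials can be estimated from a single observed trajectory without matrix inversion, and how the policy can then be improved greedily for only a subset of states, so that only the transition rows for those states need to be known. The toy MDP, the truncation horizon, and the function names (estimate_potentials, improve_policy, simulate) are assumptions made purely for illustration.

```python
import random
from collections import defaultdict

def estimate_potentials(path, horizon=30):
    """Estimate relative potentials g(i) from one observed path of
    (state, cost) pairs by averaging truncated cost sums after each
    visit to state i.  No matrix inversion is required."""
    sums, visits = defaultdict(float), defaultdict(int)
    for t in range(len(path) - horizon):
        i = path[t][0]
        sums[i] += sum(c for _, c in path[t:t + horizon])
        visits[i] += 1
    return {i: sums[i] / visits[i] for i in sums}

def improve_policy(policy, potentials, P, cost, subset):
    """Greedy improvement restricted to `subset`; only the transition
    rows P[i][a] for states in the subset need to be known.
    The truncated sums differ from the true potentials by a common
    constant, which cancels in the comparison below because the
    transition probabilities of each action sum to one."""
    new_policy = dict(policy)
    for i in subset:
        def value(a):
            return cost[i][a] + sum(p * potentials.get(j, 0.0)
                                    for j, p in P[i][a].items())
        new_policy[i] = min(P[i], key=value)
    return new_policy

def simulate(policy, P, cost, steps=20000, seed=0):
    """Observe a sample path (state, incurred cost) under the current policy."""
    rng = random.Random(seed)
    path, s = [], 0
    for _ in range(steps):
        a = policy[s]
        path.append((s, cost[s][a]))
        s = rng.choices(list(P[s][a]), weights=P[s][a].values())[0]
    return path

# Toy 3-state example (assumed data, purely for illustration).
P = {  # P[state][action] = {next_state: probability}
    0: {'a': {0: 0.5, 1: 0.5}, 'b': {1: 0.9, 2: 0.1}},
    1: {'a': {0: 0.3, 2: 0.7}, 'b': {1: 0.5, 2: 0.5}},
    2: {'a': {0: 1.0},          'b': {0: 0.2, 2: 0.8}},
}
cost = {0: {'a': 1.0, 'b': 2.0},
        1: {'a': 3.0, 'b': 1.5},
        2: {'a': 0.5, 'b': 2.5}}

policy = {0: 'a', 1: 'a', 2: 'a'}
path = simulate(policy, P, cost)              # observe the system under the current policy
g = estimate_potentials(path)                 # potentials from the single sample path
policy = improve_policy(policy, g, P, cost, subset={0, 1})  # update part of the state space
print(policy)
```

In this sketch, repeating the simulate/estimate/improve cycle plays the role of a policy iteration; restricting `subset` to the states actually visited, or updating whenever the system reaches a particular set of states, corresponds to the flexibility discussed above.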