We consider the problem of monitoring an interactive device, such as an imp
lementation of a network protocol, in order to check whether its execution
is consistent with its specification. At first glance, it appears that a mo
nitor could simply follow the input-output trace of the device and check it
against the specification. However, if the monitor is able to observe inpu
ts and outputs only from a vantage point external to the device-as is typic
ally the case-the problem becomes surprisingly difficult. This is because e
vents may be buffered, and even lost, between the monitor and the device, i
n which case, even for a correctly running device, the trace observed at th
e monitor could be inconsistent with the specification.
In this paper, we formulate the problem of external monitoring as a languag
e recognition problem. Given a specification that accepts a certain languag
e of input-output sequences, we define another language that corresponds to
input-output sequences observable externally. We also give an algorithm to
check membership of a string in the derived language. It turns out that wi
thout any assumptions on the specification, this algorithm may take unbound
ed time and space. To address this problem, we define a series of propertie
s of device specifications or protocols that can be exploited to construct
efficient language recognizers at the monitor. We characterize these proper
ties and provide complexity bounds for monitoring in each case.
To illustrate our methodology we describe properties of the Internet Transm
ission Control Protocol (TCP), and identify features of the protocol that m
ake it challenging to monitor efficiently.