Software sensor design consists of building an estimate of some quantity of
interest. This estimate can be used either to replace a physical measurement
or to validate an existing one. This paper provides some general guidelines
for the design of software sensors based on empirical data. When the model
is a priori unknown, the problem can be stated in terms of non-parametric
regression or black-box modelling. Complexity control is the main difficulty
in this setting. A trade-off must be achieved between two antagonistic
goals: the model should not be too simple, and model identification should
not be too variable. We propose to address this issue with a penalization
algorithm, which also estimates the relevance of input features in the
identification process. A data-driven software sensor should also provide accuracy
and validity indexes of its prediction. We show how these indexes can be
estimated for complex non-parametric methods, such as neural networks. An
application in environmental monitoring, the design of an ammonia software
sensor, illustrates each step of the approach. (C) 1999 Elsevier Science B.V.
All rights reserved.
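
The trade-off between a model that is too simple and an identification that
is too variable can be illustrated with a minimal sketch. It is not the
penalization algorithm of this paper: a plain ridge (squared-norm) penalty is
swapped in as a stand-in, and it does not estimate input relevance. The
polynomial model, the synthetic noisy sine data, the hold-out split, the grid
of penalty values, and the function names (poly_features, solve, fit_ridge,
mse) are all assumptions made for illustration.

```python
import math
import random


def poly_features(x, degree):
    """Map a scalar input to [1, x, x^2, ..., x^degree]."""
    return [x ** k for k in range(degree + 1)]


def solve(a, b):
    """Solve the linear system a w = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (m[r][n] - s) / m[r][r]
    return w


def fit_ridge(xs, ys, degree, lam):
    """Minimise squared error + lam * ||w||^2, leaving the intercept unpenalised."""
    phi = [poly_features(x, degree) for x in xs]
    p = degree + 1
    gram = [[sum(row[i] * row[j] for row in phi) for j in range(p)]
            for i in range(p)]
    for i in range(1, p):
        gram[i][i] += lam  # the penalty only touches non-intercept weights
    rhs = [sum(row[i] * y for row, y in zip(phi, ys)) for i in range(p)]
    return solve(gram, rhs)


def mse(w, xs, ys):
    """Mean squared prediction error of weights w on the pairs (xs, ys)."""
    degree = len(w) - 1
    total = 0.0
    for x, y in zip(xs, ys):
        pred = sum(wk * fk for wk, fk in zip(w, poly_features(x, degree)))
        total += (pred - y) ** 2
    return total / len(ys)


# Hypothetical data: a noisy sine curve standing in for sensor measurements.
random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(60)]
ys = [math.sin(3.0 * x) + random.gauss(0.0, 0.2) for x in xs]
train_x, train_y = xs[:40], ys[:40]
valid_x, valid_y = xs[40:], ys[40:]

degree = 6
best = None
for lam in [1e-4, 1e-2, 1.0, 100.0]:
    w = fit_ridge(train_x, train_y, degree, lam)
    err = mse(w, valid_x, valid_y)
    print(f"lambda={lam:g}  train={mse(w, train_x, train_y):.4f}  valid={err:.4f}")
    if best is None or err < best[0]:
        best = (err, lam)
print("selected lambda:", best[1])
```

A small penalty leaves the fit flexible but sensitive to the particular
training sample, while a large penalty shrinks the fit towards a constant.
Selecting the penalty on held-out data is one simple way to settle the
trade-off; other selection criteria could be used in its place.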