One of the most fundamental tasks in pattern recognition is fitting a curve, such as a line segment, to a given set of data points. The conventional ordinary least-squares (OLS) method of fitting a line to a set of data points is notoriously unreliable when the data contain points drawn from two different populations: (i) randomly distributed points ("random noise"), and (ii) points correlated with the line itself (e.g., obtained by perturbing the line with zero-mean Gaussian noise). Points that lie far from the line (i.e., "outliers") usually belong to the random-noise population; since they contribute the most to the sum of squared distances, they skew the line estimate away from its correct position. In this paper we present an analytic method of separating the components of the mixture. Unlike previous methods, we derive a closed-form solution. Applying a variant of the method of moments (MoM) to the assumed mixture model yields an analytic estimate of the desired line. Finally, we provide experimental results obtained by our method. © 2001 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.
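The abstract does not reproduce the paper's derivation, so the following is only a minimal sketch of the underlying idea in Python, not the authors' method: OLS is pulled toward the uniform noise, while equating sample moments to the mixture model's moments yields the line parameters in closed form. The specific moment equations, the known outlier fraction `eps`, and the assumption that both components have x uniform on [0, 1] are illustrative simplifications of ours; the paper presumably estimates such nuisance parameters from additional moments.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Synthetic mixture: line-correlated inliers plus uniform "random noise" ---
a_true, b_true, sigma = 0.8, 0.1, 0.02   # true line y = a*x + b, Gaussian perturbation
eps = 0.4                                 # outlier fraction (assumed known in this sketch)
n = 2000
n_out = int(eps * n)

x_in = rng.uniform(0.0, 1.0, n - n_out)
y_in = a_true * x_in + b_true + rng.normal(0.0, sigma, x_in.size)
x_out = rng.uniform(0.0, 1.0, n_out)      # outliers uniform on the unit square
y_out = rng.uniform(0.0, 1.0, n_out)

x = np.concatenate([x_in, x_out])
y = np.concatenate([y_in, y_out])

# --- Ordinary least squares: skewed by the outliers ---
a_ols, b_ols = np.polyfit(x, y, 1)

# --- Method-of-moments estimate under the stated assumptions ---
# With x uniform on [0,1] for both components, the model moments are
#   E[y]  = (1 - eps) * (a/2 + b)     + eps * 1/2
#   E[xy] = (1 - eps) * (a/3 + b/2)   + eps * 1/4
# Equating them to the sample moments gives a 2x2 linear system in (a, b),
# i.e., a closed-form line estimate.
m1 = y.mean() - eps * 0.5
m2 = (x * y).mean() - eps * 0.25
A = (1.0 - eps) * np.array([[0.5, 1.0],
                            [1.0 / 3.0, 0.5]])
a_mom, b_mom = np.linalg.solve(A, np.array([m1, m2]))

print(f"true : a={a_true:.3f}  b={b_true:.3f}")
print(f"OLS  : a={a_ols:.3f}  b={b_ols:.3f}   (pulled toward the noise)")
print(f"MoM  : a={a_mom:.3f}  b={b_mom:.3f}")
```

Running the sketch shows the OLS slope and intercept biased toward the uniform background, while the moment-matched estimate stays near the true line; the key design point, as in the abstract, is that no iterative reweighting or sampling is needed, only solving the moment equations once.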