ITA
ENG

EFFICIENT RANDOMIZED ALGORITHMS FOR THE REPEATED MEDIAN LINE ESTIMATOR

Authors

MATOUSEK J MOUNT DM NETANYAHU NS

Citation

J. Matousek et al., EFFICIENT RANDOMIZED ALGORITHMS FOR THE REPEATED MEDIAN LINE ESTIMATOR, Algorithmica, 20(2), 1998, pp. 136-150

Citations number

Categorie Soggetti

Computer Sciences",Mathematics,Mathematics,"Computer Science Software Graphycs Programming

Journal title

Algorithmica → ACNP

ISSN journal

01784617

Volume

Issue

Year of publication

1998

Pages

136 - 150

Database

ISI

SICI code

0178-4617(1998)20:2<136:ERAFTR>2.0.ZU;2-S

Abstract

The problem of fitting a straight line to a finite collection of point s in the plane is an important problem in statistical estimation. Rece ntly there has been a great deal of interest is robust estimators, bec ause of their lack of sensitivity to outlying data points. The basic m easure of the robustness of an estimator is its breakdown point, that is, the fraction (up to 50%) of outlying data points that can corrupt the estimator, One problem with robust estimators is that achieving hi gh breakdown points (near 50%) has proved to be computationally demand ing. In this paper we present the best known theoretical algorithm and a practical subquadratic algorithm for computing a 50% breakdown poin t line estimator, the Siegel or repeated median line estimator. We fir st present an O(n log n) randomized expected-time algorithm, where la is the number of given points. This algorithm relies, however, on soph isticated data structures. We also present a very simple O(n log(2) n) randomized algorithm for this problem, which uses no complex data str uctures. We provide empirical evidence that, for many realistic input distributions, the running time of this second algorithm is actually O (n log n) expected time.