DATA-BASED CHOICE OF HISTOGRAM BIN WIDTH

Authors

WAND MP

Citation

M. P. Wand, DATA-BASED CHOICE OF HISTOGRAM BIN WIDTH, The American Statistician, 51(1), 1997, pp. 59-64

Subject Categories

Statistics & Probability

Journal title

The American Statistician

ISSN journal

0003-1305

Volume

51

Issue

1

Year of publication

1997

Pages

59 - 64

Database

ISI

SICI code

0003-1305(1997)51:1<59:DCOHBW>2.0.ZU;2-J

Abstract

The most important parameter of a histogram is the bin width because it controls the tradeoff between presenting a picture with too much detail ("undersmoothing") or too little detail ("oversmoothing") with respect to the true distribution. Despite this importance there has been surprisingly little research into estimation of the "optimal" bin width. Default bin widths in most common statistical packages are, at least for large samples, quite far from the optimal bin width. Rules proposed by, for example, Scott lead to better large sample performance of the histogram, but are not consistent themselves. In this paper we extend the bin width rules of Scott to those that achieve root-n rates of convergence to the L2-optimal bin width, thereby providing firm scientific justification for their use. Moreover, the proposed rules are simple, easy and fast to compute, and perform well in simulations.
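
For orientation, the baseline that the abstract attributes to Scott can be sketched in a few lines. The sketch below assumes the usual normal reference form of Scott's rule, h = (24 sqrt(pi))^(1/3) * s * n^(-1/3), roughly 3.49 * s * n^(-1/3), where s is the sample standard deviation. It does not implement the root-n consistent rules proposed in the paper, which this record does not spell out; the function names are illustrative only.

```python
import math
import statistics


def scott_bin_width(data):
    """Normal reference bin width in the style of Scott's rule.

    Assumption: h = (24 * sqrt(pi))**(1/3) * s * n**(-1/3), with s the
    sample standard deviation. This is the baseline rule only, not the
    plug-in rules proposed in the paper.
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    s = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
    if s == 0:
        raise ValueError("data have zero spread; bin width is undefined")
    constant = (24.0 * math.sqrt(math.pi)) ** (1.0 / 3.0)  # about 3.49
    return constant * s * n ** (-1.0 / 3.0)


def bin_count(data, width):
    """Number of equal-width bins needed to cover the range of the data."""
    span = max(data) - min(data)
    return max(1, math.ceil(span / width))


if __name__ == "__main__":
    sample = [0, 1, 2, 3, 4, 5, 6, 7]
    h = scott_bin_width(sample)
    print(f"bin width: {h:.3f}, bins: {bin_count(sample, h)}")
```

Because the width shrinks at the rate n^(-1/3), quadrupling or octupling the sample narrows the bins only gradually; an eightfold increase in n halves the width for the same spread.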