The most important parameter of a histogram is the bin width because
it controls the tradeoff between presenting a picture with too much
detail ("undersmoothing") and too little detail ("oversmoothing") with
respect to the true distribution. Despite this importance, there has
been surprisingly little research into estimation of the "optimal" bin
width. Default bin widths in most common statistical packages are, at
least for large samples, quite far from the optimal bin width. Rules
proposed by, for example, Scott lead to better large-sample
performance of the histogram, but are not themselves consistent. In
this paper we extend the bin width rules of Scott to rules that achieve
root-n rates of convergence to the L2-optimal bin width, thereby
providing firm scientific justification for their use. Moreover, the
proposed rules are simple, easy, and fast to compute, and perform well
in simulations.
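
For orientation, the kind of baseline rule referred to above can be sketched in a few lines. The sketch assumes that "the bin width rules of Scott" include the well-known normal reference rule h = 3.49 * s * n^(-1/3), where s is the sample standard deviation. It also uses Sturges' rule as a stand-in for a typical package default, which is an assumption, since the text names no specific default. The function names are illustrative only, and this is not the root-n rule proposed in the paper.

```python
import math
import statistics


def scott_bin_width(data):
    """Scott's normal reference rule: h = 3.49 * s * n^(-1/3).

    Assumed here to be representative of the "rules of Scott" in the text.
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    s = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
    return 3.49 * s * n ** (-1.0 / 3.0)


def sturges_bin_width(data):
    """Sturges-style default: split the range into ceil(log2 n) + 1 bins.

    Used only as a hypothetical example of a package default.
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    k = math.ceil(math.log2(n)) + 1  # number of bins
    return (max(data) - min(data)) / k
```

Under these assumptions, the number of Sturges bins grows like log n while the Scott rule's grows like n^(1/3), so for large samples the Sturges-style width is much wider and tends to oversmooth.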