The importance of fitting distributions to data for risk analysis continues
to grow as regulatory agencies, like the Environmental Protection Agency
(EPA), continue to shift from deterministic to probabilistic risk assessment
techniques. The use of Monte Carlo simulation as a tool for propagating
variability and uncertainty in risk requires specification of the risk
model's inputs in the form of distributions or tables of data. Several
software tools exist to support risk assessors in their efforts to develop
distributions. However, users must keep in mind that these tools do not
replace clear
thought about judgments that must be made in characterizing the information
from data. This overview introduces risk assessors to the statistical
concepts and physical reasons that support important judgments about
appropriate types of parametric distributions and goodness-of-fit. In the
context of
using data to improve risk assessment and ultimately risk management, this
paper discusses issues related to the nature of the data (representativeness,
quantity, and quality; correlation with space and time; and distinguishing
between variability and uncertainty for a set of data) and to matching data
and distributions appropriately. All data analysis (whether "Frequentist" or
"Bayesian" or oblivious to the distinction) requires the use of subjective
judgment. The paper offers an iterative process for developing distributions
from data to characterize variability and uncertainty for inputs to risk
models, a process that provides incentives for collecting better information
when the value of information exceeds its cost. Risk analysts need to focus
attention on characterizing the information appropriately for purposes of
the risk assessment (and risk management questions at hand), not on
characterization for its own sake.
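
As a rough illustration of the kind of workflow discussed above, the following minimal sketch fits a parametric distribution to a small data set, computes a goodness-of-fit statistic, and uses bootstrap resampling to express uncertainty in the fitted parameters. It is not the paper's procedure. The lognormal form, the simulated 30-point sample, the Kolmogorov-Smirnov distance, and all function names are assumptions chosen for illustration only. Note that standard Kolmogorov-Smirnov critical values do not apply directly when the parameters are estimated from the same data, which is one of the judgments a software tool will not make for the analyst.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def fit_lognormal(data):
    """Maximum-likelihood lognormal fit: mean and SD of the log-transformed data."""
    logs = [math.log(x) for x in data]
    return fmean(logs), pstdev(logs)


def ks_statistic(data, mu, sigma):
    """Largest gap between the empirical CDF and the fitted lognormal CDF."""
    dist = NormalDist(mu, sigma)
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = dist.cdf(math.log(x))
        # Compare the fitted CDF with the empirical step just above and below x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def bootstrap_params(data, n_boot=500, seed=0):
    """Refit on resampled data to describe uncertainty in the fitted parameters."""
    rng = random.Random(seed)
    n = len(data)
    return [fit_lognormal(rng.choices(data, k=n)) for _ in range(n_boot)]


def percentile(values, p):
    """Simple nearest-rank percentile, adequate for a sketch."""
    xs = sorted(values)
    return xs[round(p * (len(xs) - 1))]


# Hypothetical data: 30 simulated measurements standing in for a real data set.
rng = random.Random(42)
sample = [rng.lognormvariate(1.0, 0.5) for _ in range(30)]

mu_hat, sigma_hat = fit_lognormal(sample)          # variability between units
d_stat = ks_statistic(sample, mu_hat, sigma_hat)   # goodness-of-fit distance
boot = bootstrap_params(sample)                    # uncertainty in the fit
mu_lo = percentile([m for m, _ in boot], 0.025)
mu_hi = percentile([m for m, _ in boot], 0.975)

print(f"fitted log-mean {mu_hat:.3f}, log-SD {sigma_hat:.3f}, KS distance {d_stat:.3f}")
print(f"approximate 95% bootstrap interval for the log-mean: [{mu_lo:.3f}, {mu_hi:.3f}]")
```

In this sketch the fitted distribution describes variability, while the spread of the bootstrap refits describes uncertainty due to the small sample. A wide interval suggests that collecting more data could be worthwhile if its value exceeds its cost.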