Wa. Wheaton et al., MULTIPARAMETER LINEAR LEAST-SQUARES FITTING TO POISSON DATA ONE COUNTAT A TIME, The Astrophysical journal, 438(1), 1995, pp. 322-340
A standard problem in gamma-ray astronomy data analysis is the decompo
sition of a set of observed counts, described by Poisson statistics, a
ccording to a given multicomponent linear model, with underlying physi
cal count rates or fluxes which are to be estimated from the data. Des
pite its conceptual simplicity, the linear least-squares (LLSQ) method
for solving this problem has generally been limited to situations in
which the number n(i) of counts in each bin i is not too small, conven
tionally more than 5-30. It seems to be widely believed that the failu
re of the LLSQ method for small counts is due to the failure of the Po
isson distribution to be even approximately normal for small numbers.
The cause is more accurately the strong anticorrelation between the da
ta and the weights w(i) in the weighted LLSQ method when square-rootn(
i) instead of square-rootnBAR(i) is used to approximate the uncertaint
ies, sigma(i), in the data, where nBAR(i) = E[n(i)], the expected valu
e of n(i). We show in an appendix that, avoiding this approximation, t
he correct equations for the Poisson LLSQ (PLLSQ) problem are actually
identical to those for the maximum likelihood estimate using the exac
t Poisson distribution. Since weighted linear least-squares involves a
kind of weighted averaging, LLSQ estimators generally produce biased
results when the data and their weights are correlated. We describe a
class of weighted PLLSQ estimators which are linear functions of the o
bserved counts. Such PLLSQ estimators are unbiased independent of nBAR
(i), even when the average number of counts in an entire fit is much l
ess than one. Their variance is a minimum when the weights are calcula
ted from the true variances of the data, but in general these are not
accurately known. Fortunately, the variance of the estimate is a very
weak function of the weights near the optimum value, so for the PLLSQ
problem it is easy in practice to find weights that are virtually idea
l yet still completely unbiased. PLLSQ estimators which are linear in
the data also allow fitting multiple data sets by the calculation of o
nly a scalar product, without the need to repeat the accumulation and
solution of the LLSQ equations. Owing also to the linearity of the est
imates in the data, each count contributes to the answers independentl
y of every other count, so that the results for small bins are indepen
dent of the particular choice of binning. This property makes possible
PLLSQ methods which avoid binning the data altogether. Some alternati
ves to the approximation of the uncertainties in the data by the squar
e root of the observed counts are discussed. We apply the method to so
lve a problem in high-resolution gamma-ray spectroscopy for the JPL Hi
gh-Resolution Gamma-Ray Spectrometer flown on HEAO 3. Systematic error
in subtracting the strong, highly variable background encountered in
the low-energy gamma-ray region can be significantly reduced by closel
y pairing source and background data in short segments. Significant re
sults can be built up by weighted averaging of the net fluxes obtained
from the subtraction of many individual source/background pairs. Exte
nsion of the approach to complex situations, with multiple cosmic sour
ces and realistic background parameterizations, requires a means of ef
ficiently fitting to data from single scans in the narrow (almost-equa
l-to 1.2 keV, for HEAO 3) energy channels of a Ge spectrometer, where
the expected number of counts obtained per scan may be very low. Such
an analysis system is discussed and compared to the method previously
used.