We use partial likelihood (PL) theory to introduce a general probabilistic framework for the design and analysis of neural classifiers. The formulation allows the training samples used in the design to be correlated in time, and admits a wide range of neural network probability models, including recurrent structures. Using PL theory, we establish a fundamental information-theoretic connection: the equivalence of likelihood maximization and relative entropy minimization, obtained without the common assumptions of independent training samples and knowledge of the true distribution. We use this result to construct the information geometry of partial likelihood and to derive the information-geometric e- and m-projection (em) algorithm for class-conditional density modeling with finite normal mixtures. We demonstrate the successful application of the algorithm on a channel equalization example and give simulation results showing the efficiency of the scheme.