Analog techniques are desirable for hardware implementation of neural netwo
rks due to their numerous advantages such as small size, low power, and hig
h speed. However, these advantages are often offset by the difficulty in th
e training of analog neural network circuitry. In particular, training of t
he circuitry by software based on hardware models is impaired by statistica
l variations in the integrated circuit production process, resulting in per
formance degradation. In this paper, a new paradigm of noise injection duri
ng training for the reduction of this degradation is presented. The variati
ons at the outputs of analog neural network circuitry are modeled based on
the transistor-level mismatches occurring between identically designed tran
sistors. Those variations are used as additive noise during training to inc
rease the fault tolerance of the trained neural network. The results of thi
s paradigm are confirmed via numerical experiments and physical measurement
s and are shown to be superior to the case of adding random noise during tr
aining.
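
The idea of mismatch-derived noise injection can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Pelgrom-style area law for threshold-voltage mismatch as the source of the noise level, a hypothetical constant `gain` that maps that mismatch to a spread at each neuron output, and a small two-layer tanh network trained by plain gradient descent. The names `mismatch_sigma`, `train`, `a_vt_mv_um`, and `gain` are all illustrative.

```python
import numpy as np


def mismatch_sigma(a_vt_mv_um, width_um, length_um):
    """Assumed Pelgrom-style area law: std of threshold-voltage mismatch (mV)
    between two identically drawn transistors of size width x length (um)."""
    return a_vt_mv_um / np.sqrt(width_um * length_um)


def train(x, t, sigma_out, epochs=2000, lr=0.1, hidden=4, seed=0):
    """Train a two-layer tanh network, adding zero-mean Gaussian noise of
    std sigma_out to every neuron output on each pass (noise injection)."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, size=(x.shape[1], hidden))
    w2 = rng.normal(0.0, 0.5, size=(hidden, t.shape[1]))
    for _ in range(epochs):
        # Forward pass with additive noise at the hidden and output neurons.
        h_clean = np.tanh(x @ w1)
        h = h_clean + rng.normal(0.0, sigma_out, size=h_clean.shape)
        y_clean = np.tanh(h @ w2)
        y = y_clean + rng.normal(0.0, sigma_out, size=y_clean.shape)

        # Backward pass for a squared-error loss. The noise is additive, so
        # it passes gradients through unchanged; the tanh derivatives use
        # the noise-free activations.
        d_y = (y - t) * (1.0 - y_clean ** 2)
        d_h = (d_y @ w2.T) * (1.0 - h_clean ** 2)
        w2 -= lr * (h.T @ d_y) / len(x)
        w1 -= lr * (x.T @ d_h) / len(x)
    return w1, w2


if __name__ == "__main__":
    # XOR with a constant bias input; targets kept inside the tanh range.
    x = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
    t = np.array([[-0.8], [0.8], [0.8], [-0.8]])

    gain = 0.02  # hypothetical: output spread per mV of threshold mismatch
    sigma_out = gain * mismatch_sigma(a_vt_mv_um=5.0, width_um=4.0, length_um=1.0)

    w1, w2 = train(x, t, sigma_out)
    print("noise-free outputs:", np.tanh(np.tanh(x @ w1) @ w2).ravel())
```

In this sketch the noise level is tied to device geometry rather than chosen arbitrarily, which is the distinction the abstract draws between mismatch-based noise and generic random noise; a faithful version would replace the assumed `gain` with variations obtained from a transistor-level model of the actual circuit.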