Engineered bioreactors are useful tools for degrading wastes from crude oil
refining facilities. One such bioreactor forms part of the wastewater
remediation process used at a refinery in the San Francisco Bay Area. The flow
rate and chemical concentrations of the waste vary, and it is necessary to
be able to predict the efficiency of the reactor degradation process for
this varied input. The complex biological, physical, and chemical processes
of the reactor make deterministic modeling unsuitable. Therefore, predictive
modeling for this system was performed using a neural network model. A
predictive, time-series neural network model requires a complete data set.
Often, in the case of a large industrial facility, data are missing. Various
techniques can be used to reconstruct missing data, but comparisons of
techniques have not been performed for large-scale remediation processes. In
this manuscript, four techniques are used for reconstructing missing data to
examine which ones provide superior predictive capabilities. It was found
that the interpolated and moving average values methods provided the best
predictions. The mean and median replacement methods, commonly used in neural
network modeling, provided much poorer predictions. Another goal of this
study is to determine which water quality parameters are more accurately
predicted than others. In this study, pH was the most accurately predicted,
while ammonia and total phenolics concentrations were the least accurately
predicted.
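
The four reconstruction techniques named above (mean replacement, median
replacement, interpolated values, and moving average values) can be
illustrated with a minimal sketch. This is not the authors' implementation:
the abstract does not give window sizes, edge handling, or the exact
definition of each method, so the function names, the `half_window`
parameter, the choice of linear interpolation, and the sample pH values are
all assumptions made for illustration. Missing readings are represented as
`None`, and each series is assumed to contain at least one observed value.

```python
from statistics import mean, median


def fill_mean(series):
    """Replace every gap with the mean of the observed values."""
    fill = mean(x for x in series if x is not None)
    return [fill if x is None else x for x in series]


def fill_median(series):
    """Replace every gap with the median of the observed values."""
    fill = median(x for x in series if x is not None)
    return [fill if x is None else x for x in series]


def fill_interpolated(series):
    """Fill gaps by linear interpolation between the nearest observed
    neighbours. Assumption: gaps at either end copy the nearest observed
    value, since there is nothing to interpolate toward."""
    known = [i for i, x in enumerate(series) if x is not None]
    filled = list(series)
    for i, x in enumerate(series):
        if x is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            frac = (i - left) / (right - left)
            filled[i] = series[left] + frac * (series[right] - series[left])
    return filled


def fill_moving_average(series, half_window=2):
    """Fill gaps with the mean of observed values inside a window centred
    on the gap. Assumption: the window spans `half_window` samples on each
    side, falling back to the overall mean if the window has no data."""
    overall = mean(x for x in series if x is not None)
    filled = list(series)
    for i, x in enumerate(series):
        if x is None:
            lo, hi = max(0, i - half_window), i + half_window + 1
            nearby = [v for v in series[lo:hi] if v is not None]
            filled[i] = mean(nearby) if nearby else overall
    return filled


# Hypothetical pH readings with three missing samples.
ph = [7.0, 7.2, None, 7.6, None, None, 8.2]
for method in (fill_mean, fill_median, fill_interpolated, fill_moving_average):
    print(method.__name__, method(ph))
```

Mean and median replacement insert the same constant into every gap,
whereas the interpolated and moving average fills follow the local trend of
the series. That difference is one plausible reason the latter two would
suit a time-series model better, though the abstract itself reports only
the outcome.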