S. Zhang et al., Examination of numerical results from tangent linear and adjoint of discontinuous nonlinear models, Mon. Weather Rev., 129(11), 2001, pp. 2791-2804
The forward model solution and its functional (e.g., the cost function in
4DVAR) are discontinuous with respect to the model's control variables if
the model contains discontinuous physical processes that occur during the
assimilation window. In such a case, the tangent linear model (the
first-order approximation of a finite perturbation) is unable to represent
the sharp jumps of the nonlinear model solution. Likewise, the first-order
approximation provided by the adjoint model is unable to represent a finite
perturbation of the cost function when the perturbation introduced in the
control variables crosses discontinuous points. Using an idealized simple
model and the Arakawa-Schubert cumulus parameterization scheme, the authors
examined the behavior of a cost function and its gradient obtained by the
adjoint model with discontinuous model physics. Numerical results show that
a cost function involving discontinuous physical processes is zeroth-order
discontinuous, but piecewise differentiable. The maximum possible number of
discontinuity points of a cost function increases exponentially as
$2^{kn}$, where k is the total number of thresholds associated with on-off
switches, and n is the total number of time steps in the assimilation
window. A backward adjoint model integration with the proper forcings added
at various time steps, similar to the backward adjoint model integration
that provides the gradient of the cost function at a continuous point,
produces a one-sided gradient (called a subgradient and denoted as
$\nabla_s J$) at a discontinuous point. An accuracy check of the gradient
shows that the adjoint-calculated gradient is computed exactly on either
side of a discontinuous surface. While a cost function evaluated using a
small interval in the control variable space oscillates, the distribution of
the gradient calculated at the same resolution not only shows a rather
smooth variation, but is also consistent with the general convexity of the
original cost function. The gradients of discontinuous cost functions are
observed to be roughly smooth because the adjoint integration correctly
computes the one-sided gradient on either side of a discontinuous surface.
This implies that, although $(\nabla_s J)^T \delta x$ may not approximate
$\delta J = J(x + \delta x) - J(x)$ well near the discontinuous surface,
the subgradient calculated by the adjoint of discontinuous physics may still
provide useful information for finding the search directions in a
minimization procedure. While not eliminating the possible need for a
nondifferentiable optimization algorithm for 4DVAR with discontinuous
physics, the consistency between the adjoint-computed gradient and the
convexity of the cost function may explain why a differentiable
limited-memory quasi-Newton algorithm still worked well in many 4DVAR
experiments that use a diabatic assimilation model with discontinuous
physics.
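
The one-sided gradient described in the abstract can be illustrated with a minimal sketch. This is not the authors' idealized model or the Arakawa-Schubert scheme: it is a hypothetical scalar toy model with a single threshold switch, and every name and parameter value (`forward`, `cost_and_gradient`, `X_CRIT`, `ALPHA`, `JUMP`, and so on) is an assumption made for illustration. The adjoint is integrated backward with the switch states frozen as recorded on the forward trajectory, and the observation misfit is added as forcing at each time step.

```python
import numpy as np

# Hypothetical toy parameters (assumptions for illustration only).
DT, FORCING = 0.1, 1.0      # time step and constant forcing
X_CRIT = 1.0                # threshold of the on-off switch
ALPHA, JUMP = 0.5, 0.3      # adjustment strength and fixed jump when "on"
N_STEPS = 20                # time steps in the assimilation window


def forward(x0):
    """Integrate the toy model; return the trajectory and the switch states."""
    xs, switches = [x0], []
    x = x0
    for _ in range(N_STEPS):
        on = x > X_CRIT                       # discontinuous on-off switch
        x_next = x + DT * FORCING
        if on:
            # Adjustment with a finite jump: the state is discontinuous
            # in x at x == X_CRIT.
            x_next -= ALPHA * (x - X_CRIT) + JUMP
        switches.append(on)
        xs.append(x_next)
        x = x_next
    return np.array(xs), switches


def cost_and_gradient(x0, obs):
    """Cost J = 0.5 * sum_t (x_t - obs_t)^2 and its one-sided gradient dJ/dx0."""
    xs, switches = forward(x0)
    resid = xs - obs
    cost = 0.5 * np.sum(resid ** 2)
    # Backward adjoint integration; switch states are held as on the
    # forward trajectory, and the misfit is added as forcing at each step.
    lam = resid[-1]
    for t in range(N_STEPS - 1, -1, -1):
        jac = 1.0 - ALPHA if switches[t] else 1.0   # d x_{t+1} / d x_t
        lam = jac * lam + resid[t]
    return cost, lam


# Synthetic observations from a "truth" run (the initial value is an assumption).
OBS, _ = forward(0.75)

# Evaluate on both sides of a point where the switch timing changes.
for x0 in (0.4999, 0.5001):
    J, g = cost_and_gradient(x0, OBS)
    print(f"x0={x0:.4f}  J={J:.6f}  one-sided gradient={g:.6f}")
```

In this toy setup the timing of the first switch activation changes as `x0` crosses 0.5, so the trajectory (and hence the cost) jumps there, while the adjoint still returns a well-defined gradient on each side. A finite-difference check that stays on one side of the jump should agree with the adjoint value.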