Arrays are a common and important class of data in many applications.
Arrays can model data such as digital images, digital video, scientific and
experimental data, matrices, and finite element grids. Although array
manipulations are diverse and domain-specific, they often exhibit structural
regularities. This paper describes an algorithm called sub-pushdown to trace data
lineage in such array computations. Lineage tracing is a type of data-flow
analysis that relates each part of a result array to the parts of the
argument (base) arrays that bear on that part of the result. Sub-pushdown
can be used to trace data lineage in array-manipulating computations
expressed in the Array Manipulation Language (AML) that was introduced
previously. Sub-pushdown has several useful features. First, the lineage computation
is expressed as an AML query. Second, it is not necessary to evaluate the
AML lineage query to compute the array data lineage. Third, sub-pushdown
never gives false-negative answers. Sub-pushdown has been implemented as part
of the ArrayDB prototype array database system that we have built.
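
To make the notion of array lineage concrete, the following Python sketch traces lineage through a toy two-step computation. It is an illustration only: it is neither AML nor the sub-pushdown algorithm, and the operators and helper names (block_sum, subsample, trace, contributing_cells) are hypothetical, chosen for this example. It shows how a set of result cells can be mapped back to base cells one operator at a time, and how the "no false negatives" property can be checked by brute force on a small input.

```python
# Toy illustration only: not AML and not the sub-pushdown algorithm.
# All function names here are hypothetical, chosen for this example.

def block_sum(base, width):
    """result[i] = sum of base[i*width : (i+1)*width]."""
    return [sum(base[i:i + width])
            for i in range(0, len(base) - width + 1, width)]

def block_sum_lineage(result_cells, width):
    """Base cells that feed the given result cells of block_sum."""
    return {i * width + k for i in result_cells for k in range(width)}

def subsample(base, step):
    """Keep every step-th cell, starting at index 0."""
    return base[::step]

def subsample_lineage(result_cells, step):
    """Base cells that feed the given result cells of subsample."""
    return {i * step for i in result_cells}

def pipeline(base):
    """The computation whose lineage we trace."""
    return subsample(block_sum(base, 2), 3)

def trace(result_cells):
    """Map result cells back to base cells, last operator first."""
    intermediate = subsample_lineage(result_cells, 3)
    return block_sum_lineage(intermediate, 2)

def contributing_cells(base, cell):
    """Brute-force check: base cells whose change alters result[cell]."""
    reference = pipeline(base)[cell]
    hits = set()
    for j in range(len(base)):
        bumped = list(base)
        bumped[j] += 1
        if pipeline(bumped)[cell] != reference:
            hits.add(j)
    return hits

base = list(range(12))
print(pipeline(base))        # [1, 13]
print(sorted(trace({1})))    # [6, 7]
# No false negatives: every contributing cell appears in the traced lineage.
print(contributing_cells(base, 1) <= trace({1}))  # True
```

Note that trace never evaluates the computation itself; it works only on index sets, which loosely mirrors the idea of deriving lineage without running the full query. A traced lineage that is a superset of the contributing cells would still pass the final check, which is what "no false negatives" permits.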