G. Levy et al., Statistical, physical, and computational aspects of massive data analysis and assimilation in atmospheric applications, Journal of Computational and Graphical Statistics, 8(3), 1999, pp. 559-574
The goals and procedures of the most data-intensive operations in
atmospheric sciences, including data assimilation and fusion, are
introduced. We explore specific problems that result from the expansion in
observing systems from conventional to satellite-borne and the
corresponding transition from small, medium, and large datasets to massive
datasets. The satellite data, their volumes, heterogeneity, and structure
are described in specific examples. We illustrate that the atmospheric data
analysis and assimilation procedures and the satellite data pose unique
problems that do not exist in other applications and are not easily
addressed by existing methods and tools. Existing solutions are presented
and their performance with massive datasets is critically evaluated. We
conclude that since the problems are interdisciplinary, a comprehensive
solution must be interdisciplinary as well. We note that components of such
a solution already exist in the statistical, atmospheric, and computational
sciences, but that in isolation they often fail to scale up to the massive
data challenge. The prospects of synthesizing an interdisciplinary solution
that will scale up to the massive data challenge are thus promising.