Decision support systems form the core of business IT infrastructures because they let companies translate business information into tangible and lucrative results. Collecting, maintaining, and analyzing large amounts of data, however, involves expensive technical challenges that require organizational commitment.
Many commercial tools are available for each of the three major data warehousing tasks: populating the data warehouse from independent operational databases, storing and managing the data, and analyzing the data to make intelligent business decisions.
Data cleaning relates to heterogeneous data integration, a problem studied for many years. More work must be done to develop domain-independent tools that solve the data cleaning problems associated with data warehouse development.
Most data mining research has focused on developing algorithms for building more accurate models or building models faster. However, data preparation and mining model deployment present several engaging problems that relate specifically to achieving better synergy between database systems and data mining technology.