On some techniques for streaming data: a case study of internet packet headers

Citation
Wegman, Edward J. and Marchette, David J., On some techniques for streaming data: a case study of internet packet headers, Journal of Computational and Graphical Statistics, 12(4), 2003, pp. 893-914
Journal ISSN
1061-8600
Volume
12
Issue
4
Year of publication
2003
Pages
893 - 914
Database
ACNP
Abstract
We consider the implications of streaming data for data analysis and data mining. Streaming data are becoming widely available from a variety of sources. In our case we consider the implications arising from Internet traffic data. By implication, streaming data are unlikely to be time homogeneous, so standard statistical and data mining procedures do not necessarily apply. Because it is essentially impossible to store streaming data, we consider recursive algorithms, algorithms that are adaptive and discount the past, and algorithms that create finite pseudo-samples. We also suggest some evolutionary graphics procedures that are suitable for streaming data. We begin with a discussion of Internet traffic in order to give the reader some sense of the time and data scale and visual resolution needed for such problems.
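
Illustrative note (not part of the original abstract): the idea of a recursive algorithm that adapts and discounts the past can be sketched with an exponentially weighted running mean and variance. This is a generic textbook estimator, not necessarily the one used in the paper. The class name `DiscountedMoments` and the parameter `alpha` are assumptions introduced here for illustration only.

```python
class DiscountedMoments:
    """Recursive mean/variance estimates that discount the past.

    Hypothetical helper for illustration; `alpha` is an assumed
    discount weight in (0, 1]. Larger values forget the past faster.
    """

    def __init__(self, alpha=0.05):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.mean = None  # no observation seen yet
        self.var = 0.0

    def update(self, x):
        """Fold one new observation into the estimates using O(1) memory."""
        if self.mean is None:
            # The first observation initializes the mean; variance stays 0.
            self.mean = float(x)
            return self.mean, self.var
        delta = x - self.mean
        # Move the mean a fraction alpha of the way toward the new value.
        self.mean += self.alpha * delta
        # Standard incremental form of the exponentially weighted variance.
        self.var = (1.0 - self.alpha) * (self.var + self.alpha * delta * delta)
        return self.mean, self.var


if __name__ == "__main__":
    # Made-up packet sizes in bytes, used only to exercise the update rule.
    est = DiscountedMoments(alpha=0.1)
    for size in [60, 1500, 60, 576, 1500, 40]:
        mean, var = est.update(size)
    print(mean, var)
```

Each update touches only the current observation and two stored numbers, so the stream itself never needs to be stored, and older observations lose influence geometrically, which suits data that are not time homogeneous.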