The rapid growth of the Internet and support for interoperability protocols
has increased the number of Web accessible sources, WebSources. Current wr
apper mediator architectures need to be extended with a wrapper cost model
(WCM) for WebSources that can estimate the response time (delays) to access
sources as well as other relevant statistics. In this paper, we present a
Web prediction tool (WebPT), a tool that is based on learning using query f
eedback from WebSources. The WebPT uses dimensions time of day, day, and qu
antity of data, to learn response times from a particular WebSource, and to
predict the expected response time (delay) for some query. Experiment data
was collected from several sources, and those dimensions that were signifi
cant in estimating the response time were determined. We then trained the W
ebPT on the collected data, to use the three dimensions mentioned above, an
d to predict the response time, as well as a confidence in the prediction.
We describe the WebPT learning algorithms, and report on the WebPT learning
for WebSources. Our research shows that we can improve the quality of lear
ning by tuning the WebPT features, e.g., training the WebPT using a logarit
hm of the input training data; including significant dimensions in the WebP
T; or changing the ordering of dimensions. A comparison of the WebPT with m
ore traditional neural network (NN) learning has been performed, and we bri
efly report on the comparison. We then demonstrate how the WebPT prediction
of delay may be used by a scrambling enabled optimizer. A scrambling algor
ithm identifies some critical points of delay, where it makes a decision to
scramble (modify) a plan, to attempt to hide the expected delay by computi
ng some other part of the plan that is unaffected by the delay. We explore
the space of real delay at a WebSource, versus the WebPT prediction of this
delay, with respect to critical points of delay in specific plans. We iden
tify those cases where WebPT overestimation or underestimation of the real
delay results in a penalty in the scrambling enabled optimizer, and those c
ases where there is no penalty. Using the experimental data and WebPT learn
ing, we test how good the WebPT is in minimizing these penalties.