DOE-NET - Optimal design of performance measurement experiments for complex, large-scale networks

Network measurement may have social, engineering or commercial motivations. Typical social motivations include the need to understand popular usage and to answer questions about social-networking sites; other possibilities include use as part of legal evidence to illustrate the (lack of) impact of regulation upon the popularity of peer-to-peer networks. The engineering application of measurement is an integral part of the network optimisation process, both to provide a baseline of pre-optimised performance and to quantify improvements. Another area of application is as part of measurement-based algorithms for controlling network access. Commercially, measurement is critical to guaranteeing Service Level Agreements (SLAs) between network and service providers and their customers. These SLAs provide enforceable upper bounds on packet-level performance; for example, they state that mean delay, delay variation (sometimes called "jitter") and packet loss probability will not exceed specified values when measured over an agreed period.
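As a minimal illustration of what such an SLA check involves, the sketch below computes mean delay, jitter and loss probability over one measurement window and compares them against bounds. The thresholds, the synthetic delay samples and the choice of standard deviation as the jitter measure are illustrative assumptions, not taken from any actual SLA.

```python
import numpy as np

# Minimal sketch of an SLA compliance check over one measurement window.
# Thresholds and the jitter definition (std of one-way delay) are assumptions.
SLA = {"mean_delay_ms": 50.0, "jitter_ms": 10.0, "loss_prob": 0.01}

sent = 10_000                                   # probes sent in the window
delays_ms = np.random.default_rng(1).gamma(shape=4.0, scale=8.0, size=9_950)
lost = sent - len(delays_ms)                    # probes with no matching reply

measured = {
    "mean_delay_ms": delays_ms.mean(),
    "jitter_ms": delays_ms.std(),
    "loss_prob": lost / sent,
}
for metric, bound in SLA.items():
    status = "OK" if measured[metric] <= bound else "VIOLATED"
    print(f"{metric}: {measured[metric]:.3f} (bound {bound}) -> {status}")
```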

However, a critical problem is that network traffic and topologies are highly variable, and this variability makes accurate measurement difficult. As a result, measurement can be prone to very large errors when estimating end-to-end delay (both mean delay and jitter) and packet loss rates. Indeed, the optimal measurement of packet-level performance is a challenging open problem in engineering mathematics.
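The following Monte Carlo sketch is an illustration only, not the project's method: it shows how heavy-tailed, bursty delays can make a sample-mean estimate from a modest probe budget highly unreliable. The Pareto delay model and its parameter values are assumptions chosen purely to exhibit the effect.

```python
import numpy as np

# Illustrative sketch: with heavy-tailed delays, a modest number of probes
# can give a very noisy estimate of the true mean delay.
rng = np.random.default_rng(0)
tail_shape, scale = 1.5, 10.0            # heavy tail => high variability
n_probes, n_trials = 100, 1000

samples = scale * (1 + rng.pareto(tail_shape, size=(n_trials, n_probes)))
estimates = samples.mean(axis=1)         # sample-mean delay per measurement run
true_mean = scale * tail_shape / (tail_shape - 1)

rel_err = np.abs(estimates - true_mean) / true_mean
print(f"median relative error: {np.median(rel_err):.1%}, "
      f"worst of {n_trials} runs: {rel_err.max():.1%}")
```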

Current measurement methods are not designed to extract the maximum information from the minimum data set. The crucial step in this project is to view all network measurements as numerical experiments in which random processes are sampled, with the sampling constrained by the resources available, e.g. bandwidth. Viewed this way, the Statistical Design of Experiments (DOE) can be applied to network measurement experiments.
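To make the idea concrete, the sketch below treats the choice of probe settings as a classical design problem: given a hypothetical quadratic delay-versus-load model and a fixed probe budget (standing in for the bandwidth constraint), it greedily selects design points that maximise the determinant of the information matrix (a D-optimality criterion). The model, candidate loads and budget are all illustrative assumptions, not the project's design method.

```python
import numpy as np

# Candidate probe settings: offered load levels at which probes could be sent.
# The quadratic delay-vs-load model behind this design matrix is an assumption.
loads = np.linspace(0.1, 0.9, 17)
X = np.column_stack([np.ones_like(loads), loads, loads**2])

budget = 6  # number of probe experiments we can afford (bandwidth constraint)

# Greedy D-optimal selection: repeatedly add the candidate that most increases
# det(X_s^T X_s), i.e. most shrinks the parameter-estimate confidence region.
selected = []
M = 1e-6 * np.eye(X.shape[1])  # small ridge so the determinant is defined early on
for _ in range(budget):
    gains = [np.linalg.det(M + np.outer(x, x)) for x in X]
    best = int(np.argmax(gains))
    selected.append(best)
    M = M + np.outer(X[best], X[best])

print("probe at loads:", np.round(loads[selected], 2))
```

Repeated design points are permitted here, as is usual for exact designs under a small budget.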

DOE techniques have been applied very successfully in linear and static environments, mainly in biological and some industrial contexts. Most work on DOE has assumed static processes, or has dealt only with the static aspects of the processes, but network traffic and topologies are highly variable and nonlinear. The first work on DOE for models that are solutions of nonlinear differential equations was in the field of chemical kinetics, co-authored by a member of the Statistics Group at Queen Mary. Subsequent work at Queen Mary has further developed DOE for nonlinear models, or for nonlinear functions of the parameters in a linear model. Although the models involved are fairly small-scale compared with those arising in networks (typically a single input variable and no feedback), this work provides a good starting point.
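In the spirit of that kinetics work, the sketch below finds a locally D-optimal sampling time for a simple one-parameter decay model y(t) = exp(-theta t): the Fisher information at a candidate time is proportional to the squared sensitivity of the response to theta, and the most informative single observation falls at t = 1/theta. The model and the prior guess for theta are illustrative assumptions, not the models studied at Queen Mary.

```python
import numpy as np

# Locally D-optimal design for a one-parameter nonlinear model:
# y(t) = exp(-theta * t) + noise (first-order decay).
theta0 = 0.5                                  # prior guess for theta (assumed)
t_candidates = np.linspace(0.1, 10.0, 200)    # feasible sampling times

# Sensitivity of the response to theta; for a single parameter the Fisher
# information at a design point is proportional to this sensitivity squared.
sensitivity = -t_candidates * np.exp(-theta0 * t_candidates)
information = sensitivity ** 2

t_star = t_candidates[np.argmax(information)]
print(f"most informative sampling time ~ {t_star:.2f} "
      f"(theory: 1/theta0 = {1/theta0:.2f})")
```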

In parallel with the DOE thread at Queen Mary (Statistics Group and Networks Group), the University of Cambridge (Computer Lab) brings experience in techniques for packet (or packet-flow) classification. Such techniques for classifying network traffic have previously used features derived from streams of packets. These feature sets are often huge (200+ features), ranging in complexity from Fourier transforms and quartile statistics to the mean and variance of packet inter-arrival times and the number of TCP SACK packets. Classification accuracy is often good, but at the price of complexity and cost. In this project that experience with lightweight application-classification schemes is re-oriented towards learning which traffic characteristics are critical in their influence on delay and loss performance. This approach focuses on actual experimental observations, which is preferable to relying on simple queueing models whose built-in (and limiting) distributions are chosen mainly to make the resulting system of equations solvable.
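As an indication of the kind of lightweight features involved, the sketch below computes a handful of per-flow statistics (inter-arrival time mean and variance, packet-size quartiles) from packet timestamps and sizes. The function name and the synthetic example flow are illustrative, and a real feature set would be far larger (the 200+ features mentioned above).

```python
import numpy as np

def flow_features(timestamps, sizes):
    """Per-flow features of the kind used in traffic classification:
    inter-arrival time statistics and packet-size quartiles.
    (A small, illustrative subset of a realistic feature set.)"""
    iat = np.diff(np.sort(np.asarray(timestamps, dtype=float)))
    sizes = np.asarray(sizes, dtype=float)
    q1, q2, q3 = np.percentile(sizes, [25, 50, 75])
    return {
        "iat_mean": iat.mean(),
        "iat_var": iat.var(),
        "size_q1": q1, "size_median": q2, "size_q3": q3,
        "n_packets": len(sizes),
    }

# Example flow: timestamps in seconds, packet sizes in bytes (synthetic values).
print(flow_features([0.00, 0.01, 0.05, 0.06, 0.30], [60, 1500, 1500, 60, 1500]))
```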

The combination of DOE and machine learning promises a real step towards solving the problem of the optimal measurement of packet level performance.