Paper title

Likelihood-based inference for modelling packet transit from thinned flow summaries

Authors

Prosha A. Rahman, Boris Beranger, Matthew Roughan, Scott A. Sisson

Abstract

The substantial growth of network traffic speed and volume presents practical challenges to network data analysis. Packet thinning and flow aggregation protocols such as NetFlow reduce the size of datasets by providing structured data summaries, but this in turn impedes statistical inference. Methods which aim to model patterns of traffic propagation typically do not account for the packet thinning and summarisation process in the analysis, and are often simplistic, e.g. method-of-moments. As a result, they can be of limited practical use. We introduce a likelihood-based analysis which fully incorporates packet thinning and NetFlow summarisation into the analysis. As a result, inferences can be made for models on the level of individual packets while only observing thinned flow summary information. We establish consistency of the resulting maximum likelihood estimator, derive bounds on the volume of traffic which should be observed to achieve required levels of estimator accuracy, and identify an ideal family of models. The robust performance of the estimator is examined through simulated analyses and an application to a publicly available trace dataset containing over 36 million packets over a 1-minute period.
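As a rough illustration of the data-reduction step the abstract refers to, the following Python sketch thins a packet stream and aggregates the surviving packets into per-flow summaries. It is not the authors' method or the NetFlow protocol itself: the function name `thin_and_summarise`, the record fields, and the choice of independent Bernoulli sampling with probability `p` are all assumptions made for illustration, and real exporters also split flows using timeouts, which is omitted here.

```python
import random


def thin_and_summarise(packets, p, seed=0):
    """Thin a packet stream and summarise what survives, per flow.

    packets: iterable of (flow_id, timestamp, size_bytes) tuples.
    p: probability of keeping each packet (assumed independent Bernoulli sampling).
    Returns a dict mapping flow_id to a summary of the *sampled* packets only.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    flows = {}
    for flow_id, ts, size in packets:
        # Thinning step: drop the packet with probability 1 - p.
        if rng.random() >= p:
            continue
        rec = flows.get(flow_id)
        if rec is None:
            # First sampled packet of this flow opens a new summary record.
            flows[flow_id] = {"packets": 1, "bytes": size, "first": ts, "last": ts}
        else:
            # Later sampled packets only update the aggregate fields.
            rec["packets"] += 1
            rec["bytes"] += size
            rec["first"] = min(rec["first"], ts)
            rec["last"] = max(rec["last"], ts)
    return flows


if __name__ == "__main__":
    # Hypothetical toy trace: two flows, three packets.
    trace = [("a", 0.0, 100), ("a", 0.5, 200), ("b", 0.2, 50)]
    print(thin_and_summarise(trace, p=0.5))
```

Only the per-flow counts, byte totals, and first and last timestamps of sampled packets are retained, and flows with no sampled packets vanish from the output entirely. This loss of packet-level detail is what makes inference from such summaries non-trivial.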
