Paper Title
FlowDNS: Correlating Netflow and DNS Streams at Scale
Authors
Abstract
Knowing customers' interests, e.g. which Video-on-Demand (VoD) or social network services they use, helps telecommunication companies plan their networks better, enhancing performance exactly where those interests lie, and offer customers relevant commercial packages. However, with the increasing deployment of CDNs by different services, identifying and attributing traffic based on network-layer information alone becomes a challenge: if multiple services use the same CDN provider, they cannot be easily distinguished based on IP prefixes alone. Therefore, it is crucial to go beyond pure network-layer information for traffic attribution. In this work, we leverage real-time DNS responses gathered by the clients' default DNS resolvers. By correlating these DNS responses with network-layer headers, we are able to translate CDN-hosted domains to the actual services they belong to. We design a correlation system for this purpose and deploy it at a large European ISP. With our system, we can correlate an average of 81.7% of the traffic with the corresponding services, without any loss on our live data streams. Our correlation results also show that 0.5% of the daily traffic contains malformed, spamming, or phishing domain names. Moreover, ISPs can correlate the results with their BGP information to find more details about the origin and destination of the traffic. We plan to publish our correlation software for other researchers or network operators to use.
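The correlation idea described in the abstract can be illustrated with a minimal Python sketch. It is not the FlowDNS implementation: the record layout, the names `DnsRecord`, `DnsCache`, and `attribute`, the TTL-based expiry, and the CNAME depth limit are all assumptions made for illustration. The sketch stores DNS responses keyed by the answered IP address, follows CNAME aliases back to the name the client originally queried, and then labels each flow by its server IP address.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class DnsRecord:
    """One answer from a DNS response (hypothetical layout)."""
    ts: float     # time the resolver sent the answer, in seconds
    query: str    # name that was queried
    rtype: str    # "A", "AAAA" or "CNAME"
    answer: str   # IP address (A/AAAA) or canonical name (CNAME)
    ttl: int      # validity of the answer, in seconds


class DnsCache:
    """Maps IP addresses back to the names clients originally asked for."""

    def __init__(self) -> None:
        self.ip_to_name: Dict[str, Tuple[str, float]] = {}
        self.cname_to_alias: Dict[str, Tuple[str, float]] = {}

    def add(self, rec: DnsRecord) -> None:
        expires = rec.ts + rec.ttl
        if rec.rtype in ("A", "AAAA"):
            self.ip_to_name[rec.answer] = (rec.query, expires)
        elif rec.rtype == "CNAME":
            # Stored in reverse: canonical name -> alias that pointed to it.
            self.cname_to_alias[rec.answer] = (rec.query, expires)

    @staticmethod
    def _lookup(table: Dict[str, Tuple[str, float]], key: str,
                now: float) -> Optional[str]:
        entry = table.get(key)
        if entry is None or entry[1] < now:
            return None  # unknown or expired
        return entry[0]

    def resolve(self, ip: str, now: float, max_depth: int = 8) -> Optional[str]:
        """Return the service-facing name for an IP, or None if unknown."""
        name = self._lookup(self.ip_to_name, ip, now)
        if name is None:
            return None
        # Walk the CNAME chain backwards; max_depth guards against loops.
        for _ in range(max_depth):
            alias = self._lookup(self.cname_to_alias, name, now)
            if alias is None:
                break
            name = alias
        return name


def attribute(flows: Iterable[Tuple[float, str, int]],
              cache: DnsCache) -> Tuple[Dict[str, int], float]:
    """Sum bytes per name and return the share of bytes that was matched.

    Each flow is (timestamp, server_ip, byte_count), a simplified stand-in
    for a NetFlow record.
    """
    per_name: Dict[str, int] = {}
    total = matched = 0
    for ts, server_ip, nbytes in flows:
        total += nbytes
        name = cache.resolve(server_ip, ts)
        if name is not None:
            matched += nbytes
            per_name[name] = per_name.get(name, 0) + nbytes
    share = matched / total if total else 0.0
    return per_name, share
```

For example, after adding a CNAME record from `video.example.com` to `edge.cdn.example.net` and an A record from `edge.cdn.example.net` to `192.0.2.10`, a flow from `192.0.2.10` is attributed to `video.example.com` rather than to the CDN host name. A system running on live streams at ISP scale would additionally need bounded memory and concurrent ingestion, which this sketch leaves out.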