Paper Title
On the Applicability of PEBS based Online Memory Access Tracking for Heterogeneous Memory Management at Scale
Paper Authors
Paper Abstract
Operating systems have historically had to manage only a single type of memory device. The imminent availability of heterogeneous memory devices based on emerging memory technologies challenges the classic single-memory model and opens a new spectrum of possibilities for memory management. Transparent data movement between different memory devices, driven by the access patterns of applications, is a desired feature for making optimal use of such devices and for hiding the complexity of memory management from the end user. However, capturing the memory access patterns of an application at runtime comes at a cost, which is particularly challenging for large-scale parallel applications that may be sensitive to system noise. In this work, we focus on the access pattern profiling phase prior to the actual memory relocation. We study the feasibility of using Intel's Processor Event-Based Sampling (PEBS) feature to record memory accesses by sampling at runtime, and we evaluate its overhead at scale. We have implemented a custom PEBS driver in the IHK/McKernel lightweight multi-kernel operating system, one of whose advantages is minimal system interference due to the lightweight kernel's simple design compared to other OS kernels such as Linux. We present the PEBS overhead for a set of scientific applications and show the access patterns identified in noise-sensitive HPC applications. Our results show that clear access patterns can be captured with 10% overhead in the worst case and 1% in the best case when running on up to 128k CPU cores (2,048 Intel Xeon Phi Knights Landing nodes). We conclude that online memory access profiling using PEBS at large scale is promising for memory management in heterogeneous memory environments.
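To make the profiling idea in the abstract concrete, the following is a minimal sketch of how sampled memory addresses might be reduced to a per-page access pattern. It is not the IHK/McKernel PEBS driver and does not read any hardware buffer: the sample list is synthetic, the 4 KiB page size is an assumption, and the names `page_histogram` and `hottest_pages` are hypothetical, introduced only for illustration.

```python
from collections import Counter

# Assumed page size in bytes; real systems may also use larger pages.
PAGE_SIZE = 4096


def page_histogram(samples, page_size=PAGE_SIZE):
    """Count sampled accesses per page.

    `samples` is an iterable of (timestamp, virtual_address) pairs, a
    stand-in for what a sampling mechanism might record (hypothetical format).
    """
    counts = Counter()
    for _timestamp, address in samples:
        # Integer division maps an address to its page number.
        counts[address // page_size] += 1
    return counts


def hottest_pages(samples, top_n=2, page_size=PAGE_SIZE):
    """Return the `top_n` most frequently sampled pages as (page, count)."""
    return page_histogram(samples, page_size).most_common(top_n)


# Synthetic samples for illustration only, not measured data.
samples = [(0, 0x1000), (1, 0x1008), (2, 0x2000), (3, 0x1FF8), (4, 0x5000)]

if __name__ == "__main__":
    for page, count in hottest_pages(samples):
        print(f"page {page}: {count} sampled accesses")
```

A page that dominates such a histogram would be a natural candidate for placement in faster memory, which is the kind of decision the profiling phase is meant to inform. Because only a fraction of accesses are sampled, the counts are estimates of relative hotness rather than exact access totals.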