Paper Title

CLR-DRAM: A Low-Cost DRAM Architecture Enabling Dynamic Capacity-Latency Trade-Off

Authors

Haocong Luo, Taha Shahroodi, Hasan Hassan, Minesh Patel, Abdullah Giray Yaglikci, Lois Orosa, Jisung Park, Onur Mutlu

Abstract

DRAM is the prevalent main memory technology, but its long access latency can limit the performance of many workloads. Although prior works provide DRAM designs that reduce DRAM access latency, their reduced storage capacities hinder the performance of workloads that need large memory capacity. Because the capacity-latency trade-off is fixed at design time, previous works cannot achieve maximum performance under very different and dynamic workload demands. This paper proposes Capacity-Latency-Reconfigurable DRAM (CLR-DRAM), a new DRAM architecture that enables a dynamic capacity-latency trade-off at low cost. CLR-DRAM allows any DRAM row to be dynamically reconfigured to switch between two operating modes: 1) max-capacity mode, where every DRAM cell operates individually to achieve approximately the same storage density as a density-optimized commodity DRAM chip, and 2) high-performance mode, where two adjacent DRAM cells in a DRAM row and their sense amplifiers are coupled to operate as a single low-latency logical cell driven by a single logical sense amplifier. We implement CLR-DRAM by adding isolation transistors in each DRAM subarray. Our evaluations show that, with four-core multiprogrammed workloads, CLR-DRAM improves system performance by 18.6% and reduces DRAM energy consumption by 29.7% on average. We believe that CLR-DRAM opens new research directions for a system to adapt to the diverse and dynamically changing memory capacity and access latency demands of workloads.
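
The capacity side of the trade-off described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the names `RowMode`, `Row`, and `usable_bits` and the row size are hypothetical, and the only thing taken from the abstract is that a row in high-performance mode couples two adjacent cells into one logical cell, so such a row holds half as many logical bits.

```python
from dataclasses import dataclass
from enum import Enum


class RowMode(Enum):
    # The two per-row operating modes named in the abstract.
    MAX_CAPACITY = "max-capacity"
    HIGH_PERFORMANCE = "high-performance"


@dataclass
class Row:
    cells: int  # physical cells in the row (illustrative figure, an assumption)
    mode: RowMode = RowMode.MAX_CAPACITY

    def logical_bits(self) -> int:
        # In high-performance mode two adjacent cells act as one logical cell,
        # so the row stores half as many bits; otherwise one bit per cell.
        if self.mode is RowMode.HIGH_PERFORMANCE:
            return self.cells // 2
        return self.cells


def usable_bits(rows: list[Row]) -> int:
    # Total logical capacity across rows with mixed modes.
    return sum(row.logical_bits() for row in rows)


rows = [Row(cells=8192) for _ in range(4)]
rows[0].mode = RowMode.HIGH_PERFORMANCE  # reconfigure one row at run time
print(usable_bits(rows))
```

Because the mode is a per-row attribute here, the sketch mirrors the row-granularity reconfiguration the abstract describes; it says nothing about latency, which depends on circuit-level details not given in the abstract.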
