METICULOUS: An FPGA-based Main Memory Emulator for System Software Studies
Main Authors | , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 07.09.2023 |
Subjects | |
Summary: | Due to the scaling problem of the DRAM technology, non-volatile memory
devices, which are based on a different principle of operation than DRAM, are now
being intensively developed to expand the main memory of computers.
Disaggregated memory is also drawing attention as an emerging technology to
scale up the main memory. Although system software studies need to discuss
management mechanisms for the new main memory designs incorporating such
emerging memory systems, there are no feasible memory emulation mechanisms that
efficiently work for large-scale, privileged programs such as operating systems
and hypervisors. In this paper, we propose an FPGA-based main memory emulator
for system software studies on new main memory systems. It can emulate the main
memory incorporating multiple memory regions with different performance
characteristics. For the address region of each memory device, it emulates the
latencies, bandwidths and bit-flip error rates of read/write operations,
respectively. The emulator is implemented in the hardware module of an
off-the-shelf FPGA System-on-Chip board. Any privileged/unprivileged software
programs running on its powerful 64-bit CPU cores can access emulated main
memory devices at a practical speed through exactly the same interface as
normal DRAM main memory. We confirmed that the emulator transparently worked
for CPU cores and successfully changed the performance of a memory region
according to given emulation parameters; for example, the latencies measured by
CPU cores were exactly proportional to the latencies inserted by the emulator,
with a minimum overhead of approximately 240 ns. As a preliminary use
case, we confirmed that the emulator allows us to change the bandwidth limit
and the inserted latency individually for unmodified software programs, making
discussions on latency sensitivity much easier. |
---|---|
DOI: | 10.48550/arxiv.2309.06565 |
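
The per-region parameters described in the summary (latency, bandwidth, bit-flip error rate) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the emulator's hardware design: the `Region` type, the function names, and the additive timing formula are hypothetical and chosen for illustration. The only figure taken from the summary is the approximate 240 ns minimum overhead.

```python
from dataclasses import dataclass
import random

# Approximate minimum overhead reported in the summary.
BASE_OVERHEAD_NS = 240.0


@dataclass(frozen=True)
class Region:
    """Hypothetical emulation parameters for one address region."""
    start: int                     # first byte address (inclusive)
    end: int                       # end byte address (exclusive)
    inserted_latency_ns: float     # extra delay added to each access
    bandwidth_bytes_per_ns: float  # throughput cap (1 byte/ns = 1 GB/s)
    bit_flip_rate: float           # probability that a single bit flips


def find_region(regions, addr):
    """Return the region whose address range contains addr."""
    for region in regions:
        if region.start <= addr < region.end:
            return region
    raise ValueError(f"address {addr:#x} is not in any emulated region")


def modeled_access_ns(region, nbytes):
    """Toy timing model (an assumption): overhead + inserted latency + transfer time."""
    transfer_ns = nbytes / region.bandwidth_bytes_per_ns
    return BASE_OVERHEAD_NS + region.inserted_latency_ns + transfer_ns


def inject_bit_flips(data, rate, rng):
    """Flip each bit of data independently with probability rate."""
    out = bytearray(data)
    for i in range(len(out)):
        for bit in range(8):
            if rng.random() < rate:
                out[i] ^= 1 << bit
    return bytes(out)
```

For example, with a hypothetical slow region defined as `Region(0x1000, 0x2000, 500.0, 1.0, 0.0)`, `modeled_access_ns(region, 64)` returns `804.0`: 240 ns of overhead, 500 ns of inserted latency, and 64 ns of transfer time. Keeping latency and bandwidth as separate fields mirrors the summary's point that the two can be changed individually.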