TMC: Near-Optimal Resource Allocation for Tiered-Memory Systems

Appeared in Symposium on Cloud Computing (SoCC).

Abstract

Main memory dominates data center server cost, and hence data center operators are exploring alternative technologies such as CXL-attached and persistent memory to reduce cost without jeopardizing performance. Multiple tiers of memory, however, bring new challenges, such as selecting the appropriate memory configuration for a given workload mix. In particular, we observe that inefficient configurations increase cost by up to 2.6× for clients, and resource stranding increases cost by 2.2× for cloud operators. To address this challenge, we introduce TMC, a system for recommending cloud configurations according to workload characteristics and the dynamic resource utilization of a cluster. Whereas prior work relied on extensive simulation or costly machine learning techniques, incurring significant search costs, our approach profiles applications to reveal internal properties that lead to fast and accurate performance estimations. Our novel configuration-selection algorithm incorporates a new heuristic, packing penalty, to ensure that recommended configurations also achieve good resource efficiency. Our experiments demonstrate that TMC reduces the search cost by up to 4× over the state-of-the-art while improving resource utilization by up to 17% compared to a naive policy that requests optimal tiered memory allocations in isolation.

Publication date:
November 2023

Authors:
Yuanjiang Ni
Pankaj Mehra
Ethan L. Miller
Heiner Litz

Projects:
Storage Class Memories
CXL SIG (Disaggregated Memory)
Prediction and Grouping
Adaptive Caching

Available media

Full paper text: PDF

Bibtex entry

@inproceedings{TMC,
  author       = {Yuanjiang Ni and Pankaj Mehra and Ethan L. Miller and Heiner Litz},
  title        = {{TMC}: Near-Optimal Resource Allocation for Tiered-Memory Systems},
  booktitle    = {Symposium on Cloud Computing (SoCC)},
  month        = nov,
  year         = {2023},
}

Last modified 1 Mar 2024