Limiting Lamport Exposure to Distant Failures in Globally-Managed Distributed Systems

Bibliographic Details
Published in arXiv.org
Main Authors Băsescu, Cristina; Fragkouli, Georgia; Enis, Ceyhun Alp; Nowlan, Michael F; Faleiro, Jose M; Bosson, Gaylor; Cong, Kelong; Borsò-Tan, Pierluca; Estrada-Galiñanes, Vero; Ford, Bryan
Format Paper
Language English
Published Ithaca Cornell University Library, arXiv.org 15.07.2022
Summary: Globalized computing infrastructures offer the convenience and elasticity of globally managed objects and services, but lack the resilience to distant failures that localized infrastructures such as private clouds provide. Providing both global management and resilience to distant failures, however, poses a fundamental problem for configuration services: How to discover a possibly migratory, strongly-consistent service/object in a globalized infrastructure without dependencies on globalized state? Limix is the first metadata configuration service that addresses this problem. With Limix, global strongly-consistent data-plane services and objects are insulated from remote gray failures by ensuring that the definitive, strongly-consistent metadata for any object is always confined to the same region as the object itself. Limix guarantees availability bounds: any user can continue accessing any strongly consistent object that matters to the user located at distance \(\Delta\) away, insulated from failures outside a small multiple of \(\Delta\). We built a Limix metadata service based on CockroachDB. Our experiments on Internet-like networks and on AWS, using realistic trace-driven workloads, show that Limix enables global management and significantly improves availability over the state-of-the-art.
ISSN: 2331-8422