Jiffy: Elastic Far-Memory for Stateful Serverless Analytics

Abstract

Stateful serverless analytics can be enabled using a remote memory system for inter-task communication, and for storing and exchanging intermediate data. However, existing systems allocate memory resources at job granularity—jobs specify their memory demands at the time of the submission; and, the system allocates memory equal to the job’s demand for the entirety of its lifetime. This leads to resource underutilization and/or performance degradation when intermediate data sizes vary during job execution.This paper presents Jiffy, an elastic far-memory system for stateful serverless analytics that meets the instantaneous memory demand of a job at seconds timescales. Jiffy efficiently multiplexes memory capacity across concurrently running jobs, reducing the overheads of reads and writes to slower persistent storage, resulting in 1.6 – 2.5× improvements in job execution time over production workloads. Jiffy implementation currently runs on Amazon EC2, enables a wide variety of distributed programming models including MapReduce, Dryad, StreamScope, and Piccolo, and natively supports a large class of analytics applications on AWS Lambda.

Publication
Proceedings of the Seventeenth European Conference on Computer Systems(EuroSys)
Yupeng Tang
Yupeng Tang
Second-year PhD student @ Yale University

My research interests include distributed systems, cloud computing and networking.