Posts

Showing posts from April, 2018

tldr: Chord

Part of the tldr series, which summarizes papers I'm currently reading. Inspired by the morning paper.

Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications [2001], Stoica et al.

Introduction

The motivation for this paper is distributed peer-to-peer applications, which can't efficiently locate the node that stores a particular data item. Specifically, Chord supports one operation: given a key, it maps that key to a node. It uses consistent hashing to balance load, and Chord nodes only need routing information for a few other nodes.

Motivation & Context

Chord is a library linked to client and server applications that:
- Provides a function, lookup(key) --> IP address of the node with the key
- Notifies the application on a node of changes in its set of keys

The above figure shows an example of an application using Chord. Chord addresses the following issues:
- Load balance: Chord is basically a distributed hash function that evenly spr…
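
To make the consistent-hashing idea concrete, here's a minimal, centralized sketch (my own, not the paper's code): node addresses and keys hash onto the same identifier circle, and a key belongs to its successor, the first node at or after it clockwise. Real Chord finds the successor in O(log N) hops via per-node finger tables rather than a global sorted list.

```python
# Toy consistent hashing in the style of Chord (illustrative only).
import hashlib
from bisect import bisect_left

def chord_id(name: str) -> int:
    """Hash a node address or key onto the 160-bit identifier circle."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        # Sort node IDs so we can binary-search for a key's successor.
        self.ids = sorted(chord_id(n) for n in nodes)
        self.node_of = {chord_id(n): n for n in nodes}

    def lookup(self, key: str) -> str:
        """Map a key to the first node at or after it on the circle."""
        i = bisect_left(self.ids, chord_id(key))
        return self.node_of[self.ids[i % len(self.ids)]]  # wrap around

ring = Ring(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print(ring.lookup("some-key"))  # IP of the node responsible for the key
```

Adding or removing a node only reassigns the keys between it and its neighbor on the circle, which is why consistent hashing balances load with little churn.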

tldr: Bayou

Part of the tldr series, which summarizes papers I'm currently reading. Inspired by the morning paper.

Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System [1995], Terry et al.

Introduction

Bayou is motivated by providing a storage system for applications that are:
- Collaborative (conflicting, concurrent activity)
- Weakly connected (mobile computers with poor network connectivity)
- Non-real-time

For example, it can support shared calendars. In fact, Bayou only requires occasional, pairwise communication between computers. Furthermore, it follows a weakly consistent, replicated data model. Finally, it implements domain-specific, automatic conflict resolution, which allows the system to remain available.

System Overview

Applications (represented in the above figure as clients) interact with servers through the Bayou API. Although clients interact with a single server at a time, they can switch between servers. Each server h…
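
In Bayou, every write carries an application-supplied dependency check (does the write conflict?) and a merge procedure (how to resolve it automatically). Here's a toy sketch of that idea for the shared-calendar example; the function below collapses both pieces into one place and is my own invention, not Bayou's API.

```python
# Toy Bayou-style write for a calendar: check for a conflict, and if one
# exists, resolve it automatically with application-specific logic.

def reserve(db, room, slot, alternates):
    taken = db.setdefault(room, set())
    if slot not in taken:        # dependency check: is the slot still free?
        taken.add(slot)          # no conflict: apply the write as requested
        return slot
    for alt in alternates:       # merge procedure: try the alternate slots
        if alt not in taken:
            taken.add(alt)
            return alt
    return None                  # unresolved; Bayou would log this conflict

db = {}
print(reserve(db, "room-a", "10:00", ["11:00"]))  # -> '10:00'
print(reserve(db, "room-a", "10:00", ["11:00"]))  # conflict -> '11:00'
```

Because resolution is automatic and domain-specific, replicas can keep accepting writes while disconnected and still converge once they exchange logs.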

tldr: Memcached

Part of the tldr series, which summarizes papers I'm currently reading. Inspired by the morning paper.

Scaling Memcache at Facebook [2013], Nishtala et al.

Introduction

Although this paper does not introduce new ideas, it describes memcache, a real-world, large-scale, successful system deployed at Facebook. Specifically, Facebook uses memcache in two ways:
- Query cache, to lighten the read load on databases (see the figure above).
- Generic cache. For example, engineers can store pre-computed results from ML algorithms in memcached.

The system's workload was dominated by fetching data, which steered its designers toward caching. Additionally, reads came from a range of sources (like MySQL databases, HDFS installations, and backend services), so the caching strategy had to be flexible.

Memcached

The authors describe how Facebook used memcached, an in-memory key-value store running on a single server, to build a distributed key-value store. (Differ…
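
The query cache works as a demand-filled look-aside cache: read from memcache first, fall back to the database on a miss, and on writes update the database and then delete (rather than update) the cached value. A minimal sketch, with plain dicts standing in for memcached and the backing store:

```python
database = {"user:1": "alice"}   # stand-in for MySQL / backend services
cache = {}                       # stand-in for a memcached client

def read(key):
    value = cache.get(key)       # 1. try the cache first
    if value is None:            # 2. on a miss, hit the database
        value = database.get(key)
        cache[key] = value       # 3. fill the cache for later reads
    return value

def write(key, value):
    database[key] = value        # update the authoritative store first
    cache.pop(key, None)         # then invalidate by deleting the key

print(read("user:1"))            # miss -> fills cache -> 'alice'
write("user:1", "bob")           # deletes the cached entry
print(read("user:1"))            # miss again -> 'bob'
```

The paper notes that deletes are preferred over updates because deletes are idempotent, which matters when concurrent updates race.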

tldr: Frangipani

Part of the tldr series, which summarizes papers I'm currently reading. Inspired by the morning paper.

Frangipani: A Scalable Distributed File System [1997], Thekkath et al.

Introduction

Frangipani is a scalable distributed file system that manages disks on multiple machines. Its designers focused on making it simple and easy to use, and it achieves good performance via caching. Frangipani is built on top of Petal, a distributed storage system that provides virtual disks, each with a 2^64-byte address space. The above figure shows that Frangipani servers run on top of a shared Petal virtual disk and coordinate their actions with locks. More servers can be added to scale up.

System Overview

With Frangipani, all user programs see the same files, even if they are running on different machines. Furthermore, their views are coherent: changes to a file or directory on one machine are immediately visible to all other machines. (More on this in a later section.) Chang…
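
Coherence falls out of a simple locking rule: a server may cache data only while it holds the covering lock, and must write dirty data back to Petal before giving the lock up. A toy single-process sketch of that invariant (all names invented; the real lock service is distributed):

```python
petal = {}   # stand-in for the shared Petal virtual disk
locks = {}   # stand-in for the lock service: block -> owning server

class Server:
    def __init__(self, name):
        self.name = name
        self.dirty = {}              # blocks cached under locks we hold

    def write(self, block, data):
        owner = locks.setdefault(block, self.name)
        assert owner == self.name, "must hold the lock to cache a write"
        self.dirty[block] = data     # cached; not yet visible to others

    def release(self, block):
        petal[block] = self.dirty.pop(block)   # flush dirty data first...
        del locks[block]                       # ...then give up the lock

a, b = Server("A"), Server("B")
a.write("inode-7", "metadata-v1")
a.release("inode-7")          # B's next read of inode-7 sees A's change
print(petal["inode-7"])       # -> 'metadata-v1'
```

Since the next lock holder always reads from shared storage, it sees every change the previous holder made, which is what makes the views coherent.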

tldr: Parameter Server

Part of the tldr series, which summarizes papers I'm currently reading. Inspired by the morning paper.

Scaling Distributed Machine Learning with the Parameter Server [2014], Li et al.

Introduction

The motivation for this paper is efficiently solving large-scale machine learning problems by distributing the work over worker nodes. Fault tolerance is also a major goal: training tasks are often run in the cloud, which means having to deal with unreliable machines. Before going into parameter servers, here's a quick primer on machine learning in relation to systems. The goal of ML can be thought of as finding a model, which is just a function approximation. For example, the function could be f(user profile), equal to the likelihood distribution of that user clicking on an ad. Machine learning happens in two parts:
- Training. Exposing the model to training data so the model can improve. This is an iterative process.
- Inference. Testing th…
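
The core interaction during training is that workers pull the current parameters, compute gradients on their shard of the data, and push updates back to the server. A toy synchronous, single-process sketch (the real system shards parameters across many server nodes and runs asynchronously):

```python
import numpy as np

class ParameterServer:
    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)    # the shared model parameters
        self.lr = lr

    def pull(self):
        return self.w.copy()      # workers fetch the current weights

    def push(self, grad):
        self.w -= self.lr * grad  # apply a worker's gradient update

def worker_step(server, X, y):
    w = server.pull()
    grad = X.T @ (X @ w - y) / len(y)   # least-squares gradient on this shard
    server.push(grad)

rng = np.random.default_rng(0)
X, y = rng.normal(size=(100, 3)), rng.normal(size=100)
ps = ParameterServer(dim=3)
for shard in np.array_split(np.arange(100), 4):  # 4 "workers", one shard each
    worker_step(ps, X[shard], y[shard])
print(ps.w)
```

Keeping the parameters behind a narrow pull/push interface is what lets the system hide replication and failure handling from the workers.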

tldr: Naiad

Part of the tldr series, which summarizes papers I'm currently reading. Inspired by the morning paper.

Naiad: A Timely Dataflow System [2013], Murray et al.

Introduction

Naiad is a distributed system for big data computation. It's motivated by computing on real-time, streaming, incremental data. (In contrast, Spark handles iterative computation over data that's reused but unchanged, so Spark would have to start over if new data is added.) Naiad's goal is to achieve high performance through parallelism.

Timely Dataflow

Naiad is based on a new computing model that the authors call timely dataflow. It's a directed graph with stateful vertices that send and receive messages along the edges. Furthermore, there are input vertices, which receive a sequence of messages from an external producer, and output vertices, which send a sequence of messages to an external consumer. Finally, there may be nested cycles, called loop contexts. The figure below shows an example of…
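
A timely dataflow vertex reacts to two callbacks: one when a message arrives, and one when the system guarantees that no more messages with a given logical timestamp remain in flight. A toy sketch in the spirit of the paper's OnRecv/OnNotify interface (the driver loop below is my own stand-in for the scheduler):

```python
from collections import defaultdict

class CountByTime:
    """Stateful vertex: buffer per-timestamp counts, emit on completion."""
    def __init__(self):
        self.counts = defaultdict(int)

    def on_recv(self, edge, message, time):
        self.counts[time] += 1     # accumulate state for this timestamp

    def on_notify(self, time):
        # Guarantee: no more messages bearing `time` will ever arrive,
        # so it's now safe to emit the final count.
        print(f"time {time}: count = {self.counts.pop(time)}")

v = CountByTime()
for t, msg in [(1, "a"), (1, "b"), (2, "c")]:
    v.on_recv("input", msg, t)
v.on_notify(1)   # -> time 1: count = 2
v.on_notify(2)   # -> time 2: count = 1
```

That completion guarantee is what lets Naiad mix low-latency streaming with batch-style aggregation that needs to know a round has finished.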

tldr: Spark

Part of the tldr series, which summarizes papers I'm currently reading. Inspired by the morning paper.

Spark (Resilient Distributed Datasets) [2012], Zaharia et al.

Introduction

This paper is motivated by two real-world use cases where existing cluster computing frameworks (e.g. MapReduce) are inefficient:
- Iterative ML/graph algorithms
- Interactive data mining (multiple queries on the same subset of data)

Specifically, these use cases reuse intermediate results across several computations. A framework like MapReduce is inefficient here because it has to write data to an external storage system in order to reuse it. The authors instead introduce data structures called resilient distributed datasets (RDDs), which are designed to efficiently persist data in memory for many kinds of applications. They implement RDDs in a system called Spark. Additionally, they present an interface based on coarse-grained transformations (e.g. map or filter).

Resilient Distrib…
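
Two ideas from this excerpt fit in a few lines: transformations are coarse-grained and lazy (they record lineage, not results), and a lost partition can be recomputed by replaying that lineage. A toy RDD of my own, nothing like Spark's real implementation:

```python
class RDD:
    """Records its lineage (parent + transformation) instead of results."""
    def __init__(self, parent=None, fn=None, data=None):
        self.parent, self.fn, self.data = parent, fn, data

    def map(self, f):
        return RDD(parent=self, fn=lambda part: [f(x) for x in part])

    def filter(self, p):
        return RDD(parent=self, fn=lambda part: [x for x in part if p(x)])

    def collect(self):
        # Nothing has run until now: walk the lineage chain and compute.
        # The same walk is how a lost partition would be rebuilt.
        if self.parent is None:
            return list(self.data)
        return self.fn(self.parent.collect())

nums = RDD(data=range(10))
evens_sq = nums.map(lambda x: x * x).filter(lambda x: x % 2 == 0)
print(evens_sq.collect())   # [0, 4, 16, 36, 64]
```

Because each transformation applies one function to a whole dataset, the lineage log stays small, which is what makes recovery cheap compared to replicating the data itself.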