Abstracts:

Bridging Concurrency and Parallelism

Although they are closely related, the research areas of concurrency and parallelism have remained largely disjoint: their venues for publication and research foci differ. Whereas research on parallelism primarily concerns performance, research on concurrency primarily concerns correctness and liveness (e.g., deadlock freedom, lock-freedom, wait-freedom). In contrast, practical applications willingly employ concurrency and parallelism together: they exploit parallelism for performance and concurrency for interaction (e.g., games and servers). We are interested in developing a theory for such concurrent and parallel systems. In this talk, I briefly describe two results to this end. The first result extends a classic, graph-based model of parallel computation to support responsive interaction. The second result extends the SNZI data structure, a concurrent data structure for relaxed counters, to support O(1) time/work operations under a model of parallelism.

(Joint work with Naama Ben-David, Bob Harper, and Stefan Muller.)

Data Structures of the Future: Concurrent, Optimistic, and Relaxed

A central computing trend over the last decade has been the need to process increasingly larger amounts of data as efficiently as possible. This development is challenging both software and hardware design, and is altering the way data structures and algorithms are constructed, implemented, and deployed.

In this talk, I will present some examples of such new data structure design ideas and implementations. In particular, I will discuss some inherent limitations of parallelizing classic data structures, and then focus on approaches to circumvent these limitations. The main approach I will discuss is to relax the software semantics, to allow for approximation, randomization, or both. Time permitting, I will also cover results showing that these approaches can improve real-world performance, and touch upon some of the major open questions in the area.
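One concrete flavor of such semantic relaxation is a relaxed counter. The sketch below is my own illustration (the class name and sharding scheme are illustrative assumptions, not code from the talk): increments land on randomly chosen shards, trading exact instantaneous reads for lower contention under concurrency.

```python
import random

class ShardedCounter:
    """Illustrative relaxed counter: each increment touches a single,
    randomly chosen shard, so concurrent threads would rarely contend
    on the same location. A read sums the shards and, under concurrency,
    is only an approximation of the count at any single instant."""

    def __init__(self, num_shards=8):
        self.shards = [0] * num_shards

    def increment(self):
        i = random.randrange(len(self.shards))  # touch just one shard
        self.shards[i] += 1

    def read(self):
        return sum(self.shards)  # scan all shards; may be stale

c = ShardedCounter()
for _ in range(1000):
    c.increment()
print(c.read())  # 1000: exact here because this run is single-threaded
```

In a truly concurrent setting each shard would live on its own cache line and be updated with an atomic fetch-and-add; the relaxation is that a read racing with increments may miss some of them.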

Anti-Persistence on Persistent Storage: History-Independent Sparse Tables and Dictionaries

We present history-independent data structures designed for external storage. A data structure is history independent (HI) if it is impossible to deduce, by examining the bit representation of the data structure, any information that is not already available through its API.

The main result in this talk is a history-independent sparse table, also called a packed-memory array. We also present two history-independent alternatives to the B-tree.

(Joint work with Jonathan W. Berry, Rob Johnson, Thomas M. Kroeger, Samuel McCauley, Cynthia A. Phillips, Bertrand Simon, Shikha Singh, and David Zage.)

De-randomizing online algorithms: an incomplete tale of two problems

The surprising results of Karp, Vazirani and Vazirani [1990] and (respectively) Buchbinder et al. [2012] are examples where conceptually simple randomized online algorithms provide provably better approximations than the corresponding deterministic counterparts for online bipartite matching and (respectively) for unconstrained non-monotone submodular maximization and (submodular) max-sat. We show that some seemingly strong extensions of the deterministic online computation model can at best match the performance of these conceptually simple but surprising algorithms.

New lower bounds for the CONGEST model

I will present a new technique that we developed for constructing sparse graphs that allow proving strong lower bounds for the CONGEST model. In particular, this allows us to prove near-linear lower bounds for computing distances, and near-quadratic lower bounds for optimization problems. The latter are the first super-linear lower bounds for natural graph problems in this model. Finally, I will show a simple linear lower bound for exact weighted all-pairs-shortest-paths (APSP), which implies a separation between the weighted and unweighted cases, and I will also formally prove that the standard Alice-Bob framework is incapable of providing a super-linear lower bound for this problem.

(The talk is based on joint papers with Amir Abboud, Seri Khoury, and Ami Paz. No prior knowledge in distributed computing will be assumed.)

TBA

Computability and Complexity of Shared Memory Operations

The consensus number of a shared object is the maximum number of processes for which wait-free consensus can be solved using copies of this object and registers. It has been used to classify the computational power of shared objects. Surprisingly, we show that combining the operations of two different objects can significantly increase the consensus number.

For randomized and obstruction-free consensus, only registers (supporting read and write) are sufficient. However, it was recently shown that at least n-1 registers are necessary to solve randomized or obstruction-free consensus among n processes. We show that for other collections of operations, different numbers of shared locations are necessary and sufficient.

(Joint work with Rati Gelashvili, Nir Shavit and Leqi Zhu)

All-Pairs 2-Reachability in O(n^ω log n) Time

In the 2-reachability problem we are given a directed graph G and we wish to determine if there are two (edge- or vertex-) disjoint paths from u to v, for a given pair of vertices u and v. In this talk, we present an algorithm that computes 2-reachability information for all pairs of vertices in O(n^ω log n) time, where n is the number of vertices and ω is the matrix multiplication exponent. Hence, we show that the running time of all-pairs 2-reachability is only within a log factor of transitive closure. Moreover, our algorithm produces a witness (i.e., a separating edge or a separating vertex) for all pairs of vertices where 2-reachability does not hold. By processing these witnesses, we can compute all the edge- and vertex-dominator trees of G in O(n^2) additional time, which in turn enables us to answer various connectivity queries in O(1) time. For instance, we can test in constant time if there is a path from u to v avoiding an edge e, for any pair of query vertices u and v and any query edge e, or if there is a path from u to v avoiding a vertex w, for any query vertices u, v, and w.

(Joint work with Daniel Graf, Giuseppe F. Italiano, Nikos Parotsidis, and Przemysław Uznański)
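For intuition, deciding 2-reachability for a single pair amounts to asking whether two units of unit-capacity flow fit from u to v. The sketch below is an illustrative per-pair baseline for the edge-disjoint case (two augmenting-path searches, i.e., Ford-Fulkerson to flow value 2), not the all-pairs O(n^ω log n) algorithm of the talk; the vertex-disjoint case would additionally require the standard vertex-splitting transformation.

```python
from collections import defaultdict

def two_edge_disjoint(graph, u, v):
    """Return True iff there are two edge-disjoint paths from u to v.
    graph: dict mapping a vertex to an iterable of its out-neighbours."""
    # Residual capacities: 1 on each original edge, 0 on reverse edges.
    cap = defaultdict(int)
    adj = defaultdict(set)
    for x, ys in graph.items():
        for y in ys:
            cap[(x, y)] += 1
            adj[x].add(y)
            adj[y].add(x)  # reverse arcs live in the residual graph

    def augment():
        # DFS for one augmenting path in the residual graph.
        stack, parent = [u], {u: None}
        while stack:
            x = stack.pop()
            if x == v:
                # Push one unit of flow along the path found.
                while parent[x] is not None:
                    p = parent[x]
                    cap[(p, x)] -= 1
                    cap[(x, p)] += 1
                    x = p
                return True
            for y in adj[x]:
                if y not in parent and cap[(x, y)] > 0:
                    parent[y] = x
                    stack.append(y)
        return False

    return augment() and augment()

g = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d']}
print(two_edge_disjoint(g, 'a', 'd'))  # True: a->b->d and a->c->d
```

Running this for every pair costs a BFS/DFS factor per pair, which is exactly the kind of overhead the all-pairs algorithm avoids.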

Multi-streaming Multi-Pattern Matching

The multi-stream computational model generalizes the classic streaming model. In this model there are several streams of data, where each stream has its own stream memory as well as access to a read-only shared memory storage that is shared among all streams. We design an algorithm for the dictionary matching problem in the multi-stream model, where the goal is to preprocess a dictionary D={P_1,P_2,...,P_d} of d=|D| patterns (strings over an alphabet Sigma) so that, given multiple streaming texts, we can quickly report occurrences of patterns from D in each one of the texts as soon as they appear. Our new algorithm uses O(d log m) words of read-only shared memory, O(log m · log d) words of stream memory, and spends O(log m + log d · log log d) time per input character.

(Joint work with Tsvi Kopelowitz and Ely Porat)
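The classic single-stream tool for dictionary matching is the Aho-Corasick automaton. The textbook sketch below (a standard construction, not the multi-stream algorithm itself) illustrates the split the model formalizes: the automaton tables can sit in read-only shared memory, while each stream's private memory is just its current state.

```python
from collections import deque

def build_automaton(patterns):
    """Build a standard Aho-Corasick automaton (goto/fail/output tables)."""
    goto, fail, out = [{}], [0], [[]]
    for p in patterns:
        node = 0
        for ch in p:
            if ch not in goto[node]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(p)
    q = deque(goto[0].values())
    while q:  # BFS to fill failure links and merge output sets
        node = q.popleft()
        for ch, nxt in goto[node].items():
            q.append(nxt)
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] += out[fail[nxt]]
    return goto, fail, out

def feed(state, ch, goto, fail, out):
    """Advance one stream by a single character; the per-stream memory
    is just the integer `state`."""
    while state and ch not in goto[state]:
        state = fail[state]
    state = goto[state].get(ch, 0)
    return state, out[state]

goto, fail, out = build_automaton(["he", "she", "his", "hers"])
state, found = 0, set()
for ch in "ushers":
    state, matches = feed(state, ch, goto, fail, out)
    found.update(matches)
print(sorted(found))  # ['he', 'hers', 'she']
```

The multi-stream algorithm of the talk cannot afford a state per automaton node in stream memory; achieving O(log m · log d) words per stream is exactly what makes it nontrivial.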

Lost in Translation: Production Code Efficiency

When a research prototype is re-implemented by software engineers, one often observes a one-to-two order of magnitude drop in performance. This holds even if both implementations use the same language. The gap is somewhat wider when one goes from a traditional language (e.g., C++) to a more sophisticated language (e.g., Java).

The main cause of this phenomenon is the interpretation by software engineers of what they learn in school. Theoretical computer scientists ignore constant factors for the sake of machine-independent analysis. Programming language faculty often focus on compilers that automatically handle low-level OS and architectural issues such as memory management. Software engineering professors emphasize specification and re-usability. Many software engineers thus learn to ignore constant factors, rely on compilers for low-level efficiency, and use general-purpose primitives for re-usability. This is tempting, as one then has to worry about fewer issues when coding and learn fewer primitives.

However, in practice constant factors do matter, compilers do not always take advantage of computer architecture features, and general-purpose primitives may be less efficient than ones tailored to the task at hand. Ignoring these issues can lead to a significant loss of computational efficiency and increased memory consumption. Power consumption also increases significantly.

In this talk we give several examples of inefficient program fragments and discuss them. These examples show that software engineers need to pay attention to low-level details when choosing data structures and primitives, and avoid some inefficient coding practices.

Simulating Branching Programs with Edit Distance and Friends or: A Polylog Shaved is a Lower Bound Made

Backurs and Indyk (STOC'15) recently proved that the Edit Distance of two sequences of equal length cannot be computed in strongly subquadratic time under the Strong Exponential Time Hypothesis (SETH). The result was extended by follow-up works to similar problems such as finding the Longest Common Subsequence (LCS).

SETH is a very strong assumption: it asserts that even linear size CNF formulas cannot be analyzed for satisfiability with an exponential speedup over exhaustive search. We consider much safer assumptions, e.g. that such a speedup is impossible for SAT on much more expressive representations, like subexponential-size NC circuits.

Our main result is a reduction from SAT on Branching Programs to fundamental problems like Edit Distance and LCS. Truly subquadratic algorithms for these problems therefore have consequences that we consider to be far more remarkable than merely faster CNF SAT algorithms. An interesting feature of our work is that even mildly subquadratic algorithms imply new circuit lower bounds. For example, we show that if we can shave an arbitrarily large polylog factor from the complexity of Edit Distance then NEXP does not have non-uniform NC1 circuits.

(Joint work with Amir Abboud, Virginia Vassilevska Williams, and Ryan Williams.)
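For reference, the quadratic-time algorithm whose near-optimality these reductions address is the textbook edit-distance dynamic program, included here (in a standard space-saving form) only to ground the discussion:

```python
def edit_distance(a, b):
    """Textbook O(|a|*|b|)-time dynamic program for edit distance,
    keeping only the previous row (O(min side) space if you swap args)."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))  # distances from the empty prefix of a
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cur[j] = min(
                prev[j] + 1,                            # delete a[i-1]
                cur[j - 1] + 1,                         # insert b[j-1]
                prev[j - 1] + (a[i - 1] != b[j - 1]),   # substitute/match
            )
        prev = cur
    return prev[n]

print(edit_distance("kitten", "sitting"))  # 3
```

Shaving even an arbitrarily large polylog factor off this inner double loop would, by the result above, imply new circuit lower bounds.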

TBA

Decremental Single-Source Reachability in Planar Digraphs

We show a new algorithm for the decremental single-source reachability problem in directed planar graphs. It processes any sequence of edge deletions in O(n log² n log log n) total time and explicitly maintains the set of vertices reachable from a fixed source vertex. Hence, if all edges are eventually deleted, the amortized time of processing each edge deletion is only O(log² n log log n), which improves upon a previously known O(√n) solution. We also show an algorithm for decremental maintenance of strongly connected components in directed planar graphs with the same total update time. These results constitute the first almost optimal (up to polylogarithmic factors) algorithms for both problems. To the best of our knowledge, these are the first dynamic algorithms with polylogarithmic update times on general directed planar graphs for non-trivial reachability-type problems, for which only polynomial bounds are known in general graphs.

(Joint work with Adam Karczmarz, Jakub Lacki and Piotr Sankowski.)

Voronoi diagrams on planar graphs, and computing the diameter in deterministic $\tilde{O}(n^{5/3})$ time

We present a deterministic algorithm that computes the diameter of a directed planar graph with real arc lengths in $\tilde{O}(n^{5/3})$ time.\footnote{The $\tilde O$ notation hides polylogarithmic factors.} This improves the recent breakthrough result of Cabello (SODA'17), both by improving the running time (from $\tilde{O}(n^{11/6})$), and by using a deterministic algorithm. It is in fact the first truly subquadratic {\em deterministic} algorithm for this problem. Our algorithm follows the general high-level approach of Cabello, but differs in the way it is implemented. Our main technical contribution is an explicit and efficient construction of additively weighted Voronoi diagrams on planar graphs (under the assumption that all Voronoi sites lie on a constant number of faces). The idea of using Voronoi diagrams is also borrowed from Cabello, except that he uses abstract Voronoi diagrams (as a black box), which makes the implementation more involved, more expensive, and randomized. Our explicit construction avoids these issues, and leads to the improved (and deterministic) running time.

(joint work with Pawel Gawrychowski, Shay Mozes, Micha Sharir, and Oren Weimann)

Approximate Single-Source Shortest Paths in Distributed Networks

In this talk, I present recent advances on computing approximate solutions to the fundamental single-source shortest paths problem in the CONGEST model of distributed computing. I will outline two orthogonal frameworks for obtaining almost tight algorithms in undirected graphs. The first framework is combinatorial and revolves around the task of efficiently constructing a sparse hop set to speed up standard algorithms by reducing the hop diameter of the graph. The second framework leverages techniques from continuous optimization and actually addresses a more general problem: computing an approximate solution to the shortest transshipment problem. While the second framework gives superior running time bounds, improvements in the first framework would potentially transfer directly to the PRAM model.

(Joint work with Ruben Becker, Monika Henzinger, Andreas Karrenbauer, Christoph Lenzen, and Danupon Nanongkai.)

Challenges in Distributed Shortest Paths Algorithms

This talk focuses on the current challenges in computing distances and shortest paths in the distributed (CONGEST) setting, in particular exact algorithms and the directed case. I will survey recent progress, and try to point out connections to dynamic algorithms as well as spots where major new ideas are needed. This talk assumes no prior knowledge of distributed computing.

Strong Connectivity in Directed Graphs under Failures, with Applications

Let G be a directed graph (digraph) with m edges and n vertices, and let G \ e (resp., G \ v) be the digraph obtained after deleting edge e (resp., vertex v) from G. We show how to compute in O(m + n) worst-case time:

• The total number of strongly connected components in G \ e (resp., G \ v), for all edges e (resp., for all vertices v) in G.
• The size of the largest and of the smallest strongly connected components in G \ e (resp., G \ v), for all edges e (resp., for all vertices v) in G.
Let G be strongly connected. We say that edge e (resp., vertex v) separates two vertices x and y, if x and y are no longer strongly connected in G \ e (resp., G \ v). We also show how to build in O(m+n) time O(n)-space data structures that can answer in optimal time the following basic connectivity queries on digraphs:
• Report in O(n) worst-case time all the strongly connected components of G \ e (resp., G \ v), for a query edge e (resp., vertex v).
• Test whether an edge or a vertex separates two query vertices in O(1) worst-case time.
• Report all edges (resp., vertices) that separate two query vertices in optimal worst-case time, i.e., in time O(k), where k is the number of separating edges (resp., separating vertices). (For k = 0, the time is O(1)).
All our bounds are tight and are obtained with a common algorithmic framework, based on a novel compact representation of the decompositions induced by 1-edge and 1-vertex cuts in digraphs, which might be of independent interest. With the help of our data structures we can design efficient algorithms for several other connectivity problems on digraphs and we can also obtain in linear time a strongly connected spanning subgraph of G with O(n) edges that maintains the 1-connectivity cuts of G and the decompositions induced by those cuts.

(Joint work with Loukas Georgiadis and Giuseppe F. Italiano)
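To appreciate the O(m+n) bound, compare a naive baseline for the first query type: recomputing strongly connected components from scratch for each deleted edge costs O(m(m+n)) total. A sketch of that baseline (my own illustration, assuming edges are given as distinct tuples):

```python
def scc_count(n, edges):
    """Number of strongly connected components (iterative Kosaraju)."""
    adj, radj = [[] for _ in range(n)], [[] for _ in range(n)]
    for x, y in edges:
        adj[x].append(y)
        radj[y].append(x)
    order, seen = [], [False] * n
    for s in range(n):                 # first pass: record finish order
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, 0)]
        while stack:
            v, i = stack.pop()
            if i < len(adj[v]):
                stack.append((v, i + 1))
                w = adj[v][i]
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, 0))
            else:
                order.append(v)
    comps, seen = 0, [False] * n
    for s in reversed(order):          # second pass: sweep reverse graph
        if seen[s]:
            continue
        comps += 1
        stack, seen[s] = [s], True
        while stack:
            v = stack.pop()
            for w in radj[v]:
                if not seen[w]:
                    seen[w] = True
                    stack.append(w)
    return comps

def scc_counts_after_deletions(n, edges):
    """Naive baseline for the first bullet above: recompute SCCs of
    G \\ e for every edge e, O(m(m+n)) in total, versus the talk's
    O(m+n) worst-case time for all edges at once."""
    return {e: scc_count(n, [f for f in edges if f != e]) for e in edges}

edges = [(0, 1), (1, 2), (2, 0), (1, 3)]
print(scc_count(4, edges))                           # 2: {0,1,2} and {3}
print(scc_counts_after_deletions(4, edges)[(2, 0)])  # 4: the cycle breaks
```

The gap between this quadratic-in-m baseline and a single linear-time pass is what the common framework in the talk closes.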

Fault tolerant subgraph for single source reachability: generic and optimal

Let G=(V,E) be an n-vertex, m-edge directed graph, and let s∈V be any designated source vertex. We address the problem of single-source reachability (SSR) from s in the presence of failures of vertices/edges. We show that for every k≥1, there is a subgraph H of G with at most 2^k n edges that preserves the reachability from s even after the failure of any k edges. Formally, given a set F of k edges, a vertex u∈V is reachable from s in G∖F if and only if u is reachable from s in H∖F. We call H a k-Fault Tolerant Reachability Subgraph (k-FTRS). We also prove a matching lower bound of Ω(2^k n) for such subgraphs. Our results extend to vertex failures without any extra overhead.

The general construction of k-FTRS is interesting from several different perspectives. From the Graph theory perspective it reveals a separation between SSR and single-source shortest paths (SSSP) in directed graphs. More specifically, in the case of SSSP in weighted directed graphs, there is a lower bound of Ω(m) even for a single edge failure. In the case of unweighted graphs there is a lower bound of Ω(n^{3/2}) edges, again even for a single edge failure; there is also a matching upper bound, but nothing is known for two or more failures in directed graphs.

From the Algorithms perspective it implies fault tolerant solutions to other interesting problems, namely, (i) verifying whether the strong connectivity of a graph is preserved after k edge or vertex failures, and (ii) computing a dominator tree of a graph after k failures.

From the perspective of Techniques, it makes interesting use of the concept of the farthest min-cut, which was introduced already by Ford and Fulkerson in their pioneering work on flows and cuts. We show that there is a close relationship between the farthest min-cut and the k-FTRS. We believe that our new technique is of independent interest.

Interactive communication for large networks

A group of n users want to run a distributed protocol π over a network where communication occurs via private point-to-point channels. Unfortunately, an adversary, who knows π, is able to maliciously flip bits on the channels. Can we efficiently simulate π in the presence of such an adversary?

We show that this is possible, even when L, the number of bits sent in π, and T, the number of bits flipped by the adversary, are not known in advance. In particular, we show how to create a robust version of π that 1) fails with probability at most ε, for any ε>0; and 2) sends Õ(L + T) bits, where the Õ notation hides a log(n(L+T)/ε) term multiplying L.

Additionally, we show how to improve this result when the average message size α is not constant. In particular, we give an algorithm that sends O(L(1 + (1/α) log(n(L+T)/ε)) + T) bits. This algorithm is adaptive in that it does not require a priori knowledge of α. We note that if α is log(n(L+T)/ε), then this improved algorithm sends only O(L+T) bits.

Dynamic Spanning Forest with Worst-Case Update Time

We will discuss recent algorithms for dynamically maintaining a spanning forest of a graph undergoing edge insertions and deletions. We provide the "first polynomial improvement" over the long-standing O(√n) bound of [Frederickson STOC'83, Eppstein et al. FOCS'92] for the dynamic spanning forest problem. (Independently, Wulff-Nilsen shows the first polynomial improvement for the dynamic minimum spanning forest problem.)

We will cover some relevant techniques on expanders: 1) an algorithm for decomposing a graph into a collection of expanders in near-linear time, and 2) an algorithm for "repairing" the expansion property of an expander after deleting some edges. These techniques can be of independent interest.

Concurrent Disjoint Set Union

The disjoint set union problem is a classical problem in data structures with a simple and efficient sequential solution that has a notoriously complicated analysis. One application is to find strongly connected components in huge, implicitly defined graphs arising in model checking. In this application, the use of multiprocessors has the potential to produce significant speedups. We explore this possibility. We devise and analyze concurrent versions of standard sequential algorithms that use single and double compare-and-swap primitives for synchronization, making them wait-free. We obtain work bounds that grow logarithmically with the number of processors, suggesting the possibility of significant speedup in practice.

(Joint work with Siddhartha Jayanti, an undergraduate at Princeton.)
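To make the synchronization pattern concrete, here is a sketch of CAS-based root linking with path halving. This is my own single-threaded simulation of compare-and-swap, not the algorithms analyzed in the talk; a real implementation would use hardware CAS (e.g., Java's AtomicIntegerArray.compareAndSet), and the talk's versions also use double compare-and-swap.

```python
class ConcurrentDSU:
    """Sketch of union-find in the lock-free style: roots are linked and
    paths are halved via compare-and-swap, so a failed CAS simply means
    another thread made progress and we retry."""

    def __init__(self, n):
        self.parent = list(range(n))

    def _cas(self, i, expected, new):
        # Stand-in for an atomic compare-and-swap on parent[i].
        if self.parent[i] == expected:
            self.parent[i] = new
            return True
        return False

    def find(self, x):
        # Path halving: CAS each visited node to point at its grandparent.
        while self.parent[x] != x:
            p = self.parent[x]
            self._cas(x, p, self.parent[p])
            x = self.parent[x]
        return x

    def union(self, x, y):
        # Retry loop: link the larger-indexed root under the smaller one
        # (a simple deterministic tie-break standing in for rank or
        # randomized priorities).
        while True:
            rx, ry = self.find(x), self.find(y)
            if rx == ry:
                return
            if rx > ry:
                rx, ry = ry, rx
            if self._cas(ry, ry, rx):  # link only if ry is still a root
                return

d = ConcurrentDSU(6)
d.union(0, 1); d.union(1, 2); d.union(4, 5)
print(d.find(2) == d.find(0), d.find(3) == d.find(4))  # True False
```

The CAS in union succeeds only while ry is still a root, which is what prevents two concurrent unions from corrupting the forest.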

TBA

Route Planning for Zone-Based Routing

Zone-Based Routing (ZBR) is a framework for high-volume parcel delivery. During a preprocessing step, the geography serviced by a delivery station is split into fixed “zones,” each with bounded (and small) expected demand. Each zone corresponds to a sort point (bag) at the station. The preprocessing step also organizes zones along a fixed order (the “master trace”) through the geography, with sort points within the station laid out accordingly. Shipments are sorted into these sort points as they arrive at the station, typically through the night. Once all shipments have been received, a route planner splits the trace into routes, each consisting of a contiguous subsequence of zones. The splits vary from day to day depending on demand, and take into account vehicle capacities and shift times. The ZBR framework produces routes that are dense and consistent, seamlessly accommodates large fluctuations in demand, and scales well by permitting sortation to start before the precise set of packages is known. This work describes an efficient routing algorithm for ZBR, with a focus on its performance and scalability. We show that, even though the problem is NP-hard, it is fixed-parameter tractable with the number of shipments per zone as the parameter. Leveraging this, and through careful engineering, we can find the optimal solution in most realistic circumstances, solving instances with tens of thousands of shipments in seconds.

(Joint work with Daniel Fleischman, Andrii Gomilko, Anicham Kumarasamy, Mauricio G. C. Resende, and Michael N. Rice.)
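A heavily simplified sketch of the splitting step (my own illustration under stated assumptions: only a vehicle-capacity constraint, minimizing the number of routes; the production planner also models shift times and richer costs):

```python
def split_trace(demands, capacity):
    """Split the master trace (a fixed sequence of per-zone demands) into
    the fewest contiguous routes whose total demand fits the vehicle
    capacity. Returns a list of (start, end) zone-index pairs covering
    [start, end), or None if some single zone exceeds capacity."""
    n = len(demands)
    INF = float('inf')
    best = [0] + [INF] * n        # best[i]: fewest routes covering zones [0, i)
    choice = [None] * (n + 1)     # choice[i]: start of the last route
    for i in range(1, n + 1):
        load = 0
        for j in range(i - 1, -1, -1):   # candidate last route: zones [j, i)
            load += demands[j]
            if load > capacity:
                break
            if best[j] + 1 < best[i]:
                best[i] = best[j] + 1
                choice[i] = j
    if best[n] == INF:
        return None
    routes, i = [], n
    while i > 0:                  # walk choices back to recover the split
        routes.append((choice[i], i))
        i = choice[i]
    return routes[::-1]

print(split_trace([1, 1, 1], 2))  # [(0, 2), (2, 3)]
```

Because routes must be contiguous along the trace, the search space is splits of a sequence rather than arbitrary partitions, which is what keeps this dynamic program quadratic rather than exponential.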

Fully-Dynamic Minimum Spanning Forest with Improved Worst-Case Update Time

We give a Las Vegas data structure which maintains a minimum spanning forest in an n-vertex edge-weighted dynamic graph undergoing updates consisting of any mixture of edge insertions and deletions. Each update is supported in O(n^{1/2-c}) expected worst-case time for some constant c > 0, and this worst-case bound holds with probability at least 1 - n^{-d}, where d is a constant that can be made arbitrarily large. This is the first data structure achieving an improvement over the O(√n) deterministic worst-case update time of Eppstein et al., a bound that has been standing for nearly 25 years. In fact, it was previously not even known how to maintain a spanning forest of an unweighted graph in worst-case time polynomially faster than Θ(√n). Our result is achieved by first giving a reduction from fully-dynamic to decremental minimum spanning forest preserving worst-case update time up to logarithmic factors. Then decremental minimum spanning forest is solved using several novel techniques, one of which involves keeping track of low-conductance cuts in a dynamic graph. An immediate corollary of our result is the first Las Vegas data structure for fully-dynamic connectivity where each update is handled in worst-case time polynomially faster than Θ(√n) w.h.p.; this data structure has O(1) worst-case query time.

(To appear at STOC 2017)

TBA

Selection in heaps and row-sorted matrices using soft heaps

We obtain simpler optimal algorithms for selecting the k-th largest item from a heap-ordered tree, and from a collection of sorted lists, using soft heaps. Our results match, and in some ways slightly improve on, the classical results of Frederickson (1993) and of Frederickson and Johnson (1982).

(Joint work with Haim Kaplan, Laszlo Kozma and Or Zamir)
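For context, the standard selection routine over a binary min-heap uses an auxiliary heap over candidate nodes and runs in O(k log k) time, as sketched below; Frederickson's 1993 algorithm improves this to O(k), and the soft-heap approach of the talk simplifies such results.

```python
import heapq

def kth_smallest_in_heap(heap, k):
    """Select the k-th smallest element of an array-based binary min-heap
    (1 <= k <= len(heap)) without modifying it. An auxiliary heap holds a
    frontier of candidate nodes: after popping the k-1 smallest candidates
    and pushing their children, the auxiliary minimum is the answer."""
    aux = [(heap[0], 0)]               # (value, index) pairs; root first
    for _ in range(k - 1):
        val, i = heapq.heappop(aux)    # confirm the next smallest element
        for child in (2 * i + 1, 2 * i + 2):
            if child < len(heap):      # its children become candidates
                heapq.heappush(aux, (heap[child], child))
    return aux[0][0]

h = [7, 2, 9, 4, 1, 8, 3]
heapq.heapify(h)
print(kth_smallest_in_heap(h, 3))  # 3
```

The auxiliary heap never holds more than 2k-1 entries, giving the O(k log k) bound independent of the heap's size.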