# TOC Seminar '14-'15

## The Independent Set problem on Degree-d Graphs

Anupam Gupta, CMU
Monday, May 11, 2015 - 1:00pm to 2:30pm
Pierce 213

The independent set problem on graphs with maximum degree d is known to be Omega(d/log^2 d) hard to approximate, assuming the unique games conjecture. However, the best approximation algorithm was worse by about an Omega(log d) factor. In a recent breakthrough, Bansal showed how to use few rounds of the SA+ hierarchy to estimate the size of the optimal independent set to within O~(d/log^2 d), essentially closing the gap. Some questions remained: could we find such an IS? And did we really need the SA+ lifting step?

In this talk, we show two results. Firstly, the standard SDP, based on the Lovasz Theta function, gives a O~(d/log^{3/2} d) approximation without using any lift/project steps. Secondly, using SA+, one can convert Bansal's algorithm for IS size estimation into an approximation algorithm. Both results, just like Bansal's, are based on Ramsay- theoretic results of independent interest.

This is based on joint work with Nikhil Bansal (TU Eindhoven) and Guru Guruganesh (CMU).

## Dynamic Maximum Matching and Related Problems

Shay Solomon, Weizmann Institute of Science
Monday, May 4, 2015 - 1:00pm to 2:30pm
Pierce 213

Graph matching is one of the most well-studied problems in combinatorial optimization, with applications ranging from scheduling and object recognition to numerical analysis and computational chemistry. Nevertheless, until recently very little was unknown about this problem in real-life dynamic networks, which aim to model the constantly changing physical world.

In the first part of the talk we will discuss our work on the dynamic maximum matching problem. In the second part of the talk we will highlight some of our work on a few related problems in both centralized and distributed systems.

## Multidimensional $\epsilon$-Approximate Agreement and Computability in Byzantine Systems

Hammurabi Mendes, Brown University
Monday, April 27, 2015 - 1:00pm to 2:30pm
Pierce 213

This talk is divided in two parts. In the first part, we discuss a problem called multidimensional $\epsilon$-approximate agreement. Consider a distributed system with asynchronous communication and Byzantine (i.e., arbitrary) failures. Each process inputs a value in $\mathbb{R}^d$ with $d \ge 1$, and all non-faulty processes must finish with values where: (1) outputs lie within a distance $\epsilon$ of each other; and (2) outputs are in the convex hull of non-faulty process inputs. This problem generalizes the traditional $\epsilon$-approximate agreement of 1983/1986, and has implications to computability for more general Byzantine tasks.

In the second part, we characterize computability in Byzantine, asynchronous systems by using tools adapted from combinatorial topology. Tasks are formalized with a pair of combinatorial structures called simplicial complexes, one for non-faulty process inputs (the input complex), and another for non-faulty process outputs (the output complex). A map between the input complex and the output complex defines task semantics. We see how a Byzantine asynchronous task is solvable if and only if a "dual" asynchronous, crash-failure task is solvable as well. We are then able to characterize computability for Byzantine asynchronous tasks with a concise, topology-based language.

## Recent progress in structure of large treewidth graphs and some applications

Chandra Chekuri, University of Illinois, Urbana-Champaign
Monday, April 13, 2015 - 1:00pm to 2:30pm
Pierce 213

The seminal work of Robertson and Seymour on graph minors developed and utilized several important properties of tree decompositions and treewidth. Treewidth has since become a fundamental tool for structural and algorithmic results on graphs. One of the key results of Robertson and Seymour is the Excluded Grid Theorem which states that there is an integer valued function f such that that every graph G with treewidth at least f(k) contains a k x k grid as a minor.

In this talk we will discuss some recent developments on the structure of graphs with large treewidth. In particular, Julia Chuzhoy and the author showed that f can chosen to be a polynomial, improving previous bounds that were at least exponential. We will discuss this and related structural results and some applications to approximation algorithms for routing, fixed parameter tractability and Erdos-Posa type theorems.

## Secretary Problems with Non-Uniform Arrival Order

Bobby Kleinberg, Cornell and MSR New England
Monday, March 30, 2015 - 1:00pm to 2:30pm
Pierce 213

For a number of problems in the theory of online algorithms, the assumption that elements arrive in uniformly random order enables the design of algorithms with much better performance guarantees than under worst-case assumptions. The quientessential example of this phenomenon is the secretary problem, in which an algorithm attempts to stop a sequence at the moment it observes the element of maximum value. As is well known, if the sequence is presented in uniformly random order there is an algorithm that succeeds with probability 1/e, whereas no non-trivial performance guarantee is possible if the elements arrive in worst-case order.

In many applications of online algorithms, it is reasonable to assume there is some randomness in the input sequence, but unreasonable to assume that the arrival ordering is uniformly random. This work initiates an investigation into relaxations of the random-ordering hypothesis, by focusing on the secretary problem and asking what performance guarantees one can prove under relaxed assumptions. Along the way to answering this question we will encounter some tools, such as coding theory and approximation theory, not normally associated with the analysis of online algorithms.

## Relational verification of Differential Privacy and Mechanism Design...for a theory audience

Marco Gaboardi, University of Dundee and Visiting Scholar at Harvard CRCS
Monday, March 9, 2015 - 1:00pm to 2:30pm
Pierce 213

Programming language research has developed a wide collection of techniques useful for reasoning about different correctness properties of programs. Some of these techniques can be tailored to formally verify differential privacy, and mechanism design properties like bayesian incentive compatibility. In this talk I will introduce the basic ingredient of these techniques by emphasizing the reasoning principles they capture and why they are useful for reasoning about differential privacy and mechanism design. I will also discuss the limitations of this approach and the motivations for further works in this area.

## Higher lower bounds from the 3SUM conjecture

Tsvi Kopelowitz, University of Michigan
Monday, March 2, 2015 - 1:00pm to 2:30pm
Pierce 213

The 3SUM hardness conjecture has proven to be a valuable and popular tool for proving conditional lower bounds on the complexities of dynamic data structures and graph problems. This line of work was initiated by Patrascu [STOC 2010] and has received a lot of recent attention. Most of these lower bounds are based on reductions from 3SUM to a special set intersection problem introduced by Patrascu, which we call Patrascu's Problem. However, the framework introduced by Patrascu that reduces 3SUM to Patrascu's Problem suffers from some limitations, which in turn produce polynomial gaps between the achievable lower bounds via this framework and the known upper bounds.

We address these issues by providing a tighter and more versatile framework for proving 3SUM lower bounds via a new reduction to Patrascu's Problem. Furthermore, our framework does not become weaker if 3SUM can be solved in truly subquadratic time, and provides some immediate higher conditional lower bounds for several problems, including for set intersection data-structures. For some problems, the new higher lower bounds meet known upper bounds, giving evidence to the optimality of such algorithms.

During the talk, we will discuss this new framework, and show some new (optimal) lower bounds conditioned on the 3SUM hardness conjecture. In particular, we will demonstrate how some old and new triangle listing algorithms are optimal for any graph density, and prove a conditional lower bound for incremental Maximum Cardinality Matching which introduces new techniques for obtaining amortized lower bounds.

## For-all Sparse Recovery in Near-Optimal Time

Yi Li, Harvard
Monday, January 26, 2015 - 1:00pm to 2:30pm
Pierce 213

An approximate sparse recovery system in $\ell_1$ norm consists of parameters $k$, $\epsilon$, $N$, an $m$-by-$N$ measurement $\Phi$, and a recovery algorithm, $\mathcal{R}$. Given a vector, $\mb{x}$, the system approximates $x$ by $\widehat{\mb{x}} = \mathcal{R}(\Phi\mb{x})$, which must satisfy $\|\widehat{\mb{x}}-\mb{x}\|_1 \leq (1+\epsilon)\|\mb{x}-\mb{x}_k\|_1$. We consider the for all'' model, in which a single matrix $\Phi$, possibly constructed'' non-explicitly using the probabilistic method, is used for all signals $\mb{x}$.

The best existing sublinear algorithm uses $O(\epsilon^{-3} k\log(N/k))$ measurements and runs in time $O(k^{1-\alpha}N^\alpha)$ for any constant $\alpha > 0$. In this paper, we improve the number of measurements to $O(\epsilon^{-2} k \log(N/k))$, matching the best existing upper bound (attained by super-linear algorithms), and the runtime to $O(k^{1+\beta}\poly(\log N,1/\epsilon))$, with a modest restriction that $\epsilon \leq (\log k/\log N)^{\gamma}$, for any constants $\beta,\gamma > 0$. When $k\leq \log^c N$ for some $c>0$, the runtime is reduced to $O(k\poly(N,1/\epsilon))$.

## Private Information Retrieval with 2-Servers and sub-polynomial communication

Zeev Dvir, Princeton
Monday, December 15, 2014 - 1:00pm to 2:30pm
Pierce 213

A 2-server Private Information Retrieval (PIR) scheme allows a user to retrieve the i'th bit of an n-bit database replicated among two servers (which do not communicate) while not revealing any information about i to either server. The privacy of the user is information theoretic and does not rely on any cryptographic assumptions. In this work we construct a new 2-server PIR scheme with total communication cost sub-polynomial in n. This improves over the currently known 2-server protocols which require n^{1/3} communication and matches the communication cost of known 3-server PIR schemes. Our improvement comes from reducing the number of servers in existing protocols, based on Matching Vector Codes, from 3 or 4 servers to 2. This is achieved by viewing these protocols in an algebraic way (using polynomial interpolation) and extending them using partial derivatives.

Joint work with Sivakanth Gopi (Princeton).

## Approximating the best Nash Equilibrium in n^{o(log n)}-time breaks ETH

Omri Weinstein, Princeton
Monday, December 8, 2014 - 1:00pm to 2:30pm
Pierce 213

The celebrated PPAD hardness result for finding an exact Nash equilibrium in a two-player game initiated a quest for finding *approximate* Nash equilibria efficiently, and is one of the major open questions in algorithmic game theory. We study the computational complexity of finding an \eps-approximate Nash equilibrium with good social welfare. Hazan and Krauthgamer and subsequent improvements showed that finding an epsilon-approximate Nash equilibrium with good social welfare in a two player game and many variants of this problem is at least as hard as finding a planted clique of size O(log n) in the random graph g(n,1/2). We show that any polynomial time algorithm that finds an eps-approximate Nash equilibrium with good social welfare refutes (the worst-case) Exponential Time Hypothesis by Impagliazzo and Paturi. Specifically, it would imply a 2^O(\sqrt{n}) algorithm for SAT. Our lower bound matches the quasi-polynomial time algorithm by Lipton, Markakis and Mehta for solving the problem. Our key tool is a reduction from the PCP machinery to finding Nash equilibrium via free games, the framework introduced in the recent work by Aaronson, Impagliazzo and Moshkovitz. Techniques developed in the process may be useful for replacing planted clique hardness with ETH-hardness in other applications.

Joint work with Mark Braverman and Young Kun Ko.

## Randomized Symmetry Breaking and the Constructive Lovász Local Lemma

Seth Pettie, University of Michigan
Monday, December 1, 2014 - 1:00pm to 2:30pm
Pierce 213

Symmetry breaking problems pervade every area of distributed computing. The devices in a distributed system are often assumed to be initially undifferentiated, except for having distinct IDs. In order to accomplish basic tasks they must break this initial symmetry. In distributed graph algorithms (where the communications network and the input graph are identical) some symmetry breaking tasks include computing maximal matchings, vertex- and edge-colorings, and maximal independent sets (MIS). Running time is measured in rounds of communication.

In this talk I'll present two general randomized methods for solving symmetry breaking problems. The first can be applied to "decomposable" problems (such as MIS or maximal matching). The second can be applied to a broader class of problems; it uses an efficient distributed version of the constructive Lovász Local Lemma.

Joint work with Hsin-Hao Su, Kai-Min Chung, Leonid Barenboim, Michael Elkin, and Johannes Schneider. Preliminary results appeared in FOCS 2012 and PODC 2014.

## Fast algorithms for optimization of submodular functions

Monday, November 24, 2014 - 1:00pm to 2:30pm
Pierce 213

Much progress has been made on problems involving optimization of submodular functions under various constraints. However, the resulting algorithms, in particular the ones based on the "multilinear relaxation", are often quite slow. In this talk, I will discuss some recent efforts on making these algorithms faster and more practical.

We design near-linear and near-quadratic algorithms for maximization of submodular functions under several common constraints. The techniques that we use include ground set sparsification, a coarse variant of the continuous greedy algorithm, and multiplicative weight updates. Some of the new algorithms have been implemented and showed improvements on instances arising in active set selection for nonparametric learning and exemplar-based clustering.

Based on recent works with

1. C. Chekuri and T.S. Jayram
2. B. Mirzasoleiman, A. Badanidiyuru, A. Karbasi and A. Krause

## Exponential Separation of Information and Communication

Gillat Kol, IAS
Monday, November 17, 2014 - 1:00pm to 2:30pm
Pierce 213

In profoundly influential works, Shannon and Huffman show that if Alice wants to send a message X to Bob, it's sufficient for her to send roughly H(X) bits (in expectation), where H denotes Shannon's entropy function. In other words, the message x can be compressed to roughly H(X) bits, the information content of the message. Can one prove similar results in the interactive setting, where Alice and Bob engage in an interactive communication protocol?

We show the first gap between communication complexity and information complexity, by giving an explicit example of a partial boolean function with information complexity O(k), and distributional communication complexity > 2^k. This shows that a communication protocol cannot always be compressed to its internal information, answering (the standard formulation of) the above question in the negative. By a result of Braverman, our example gives the largest possible gap.

By a result of Braverman and Rao, our example gives the first gap between communication complexity and amortized communication complexity, implying that strong direct sum does not hold for distributional communication complexity, answering a long standing open problem.

## Graph Sparsification in the Streaming Model

Christopher Musco, MIT
Monday, November 10, 2014 - 1:00pm to 2:30pm
Pierce 213

Streaming algorithms have received significant attention for their power in processing dynamically changing data under space constraints. As graph datasets grow (e.g. social networks) and graphs are increasingly used to model non-linked datasets (e.g. spectral clustering), a rich subfield has developed around streaming algorithms specifically designed for processing graphs.

One powerful tool for compressing the storage required for a graph is "graph sparsification". It is possible to eliminate most of the edges from a dense graph while still maintaining much of the graph's structural information. This talk will focus on computing graph sparsifiers in the streaming model. Specifically, we consider how to dynamically maintain a graph sparsifier as edges are continually inserted into and deleted from a graph.

I will review graph sparsification and prior work on streaming algorithms for the problem. Then I'll discuss our recent result, which is the first algorithm for computing spectral graph sparsifiers from a stream of edge insertions and deletions in essentially optimal space. The result can be viewed as a sparse-recovery type algorithm for graph data, extending a powerful technique introduced by Ahn, Guha, and McGregor.

(Joint work with Michael Kapralov, Yin Tat Lee, Cameron Musco, and Aaron Sidford. See: http://arxiv.org/abs/1407.1289)

## Learning Halfspaces with Noise

Pranjal Awasthi, Princeton
Monday, November 3, 2014 - 1:00pm to 2:30pm
Pierce 213

We study the problem of learning halfspaces in the malicious noise model of Valiant. In this model, an adversary can corrupt an η fraction of both the label part and the feature part of an example. We design a polynomial-time algorithm for learning halfspaces in R^d under the uniform distribution with near optimal noise tolerance.

Our results also imply the first active learning algorithm for learning halfspaces that can handle malicious noise.

Joint work with Nina Balcan and Phil Long.

## Constant-time Testing and Learning Algorithms for Image Properties

Monday, October 27, 2014 - 1:00pm to 2:30pm
Pierce 213

We initiate a systematic study of sublinear-time algorithms for image analysis that have access only to labeled random samples from the input. Most previous sublinear-time algorithms for image analysis were query-based, that is, they could query pixels of their choice. We consider algorithms with two types of input access: sample-based algorithms that draw independently random pixels, and block-sample-based algorithms that draw pixels from independently random square blocks of the image. We investigate three basic properties of black-and-white images: being a half-plane, convexity and connectedness. For the first two properties, all our algorithms are sample-based, and for connectedness they are block-sample-based. All algorithms we present have low sample complexity that depends only on the error parameter, but not on the input size.

We design algorithms that approximate the distance to the three properties within a small additive error or, equivalently, tolerant testers for being a half-plane, convexity and connectedness. Tolerant testers for these properties, even with query access to the image, were not investigated previously. Tolerance is important in image processing applications because it allows algorithms to be robust to noise in the image. We also give (non-tolerant) testers for convexity and connectedness with better complexity than implied by our distance approximation algorithms. Our testers are faster than previously known query-based testers.

To obtain our algorithms for convexity, we design two fast proper PAC learners of convex sets in two dimensions that work under the uniform distributions: non-agnostic and agnostic..

(Joint work with Piotr Berman and Meiram Murzabulatov)

## Path-Finding Methods for Linear Programming

Aaron Sidford, MIT
Monday, October 6, 2014 - 1:00pm to 2:30pm
Pierce 213

In this talk I will present a new algorithm for solving linear programs. Given a linear program with n variables, m > n constraints, and bit complexity L, our algorithm runs in Õ(sqrt(n) L) iterations each consisting of solving Õ(1) linear systems and additional nearly linear time computation. Our method improves upon the convergence rate of previous state-of-the-art linear programming methods which required solving either Õ(sqrt(m)L) linear systems [R88] or consisted of Õ((mn)^(1/4)) steps of more expensive linear algebra [VA93].

Interestingly, our algorithm not only nearly matches the convergence rate of the universal barrier of Nesterov and Nemirovskii [NN94], but in the special case of the linear programming formulation of various flow problems our methods converge at a rate faster than that predicted by any self-concordant barrier. In particular, we achieve a running time of Õ(|E| sqrt(|V| log^2 U) for solving the maximum flow problem on a directed graph with |E| edges, |V| vertices, and capacity ratio U, thereby improving upon the previous fastest running time for solving this problem when |E| > Ω(|V|^epsilon) for any constant epsilon.

This talk will assume little exposure to linear programming algorithms, convex optimization, or graph theory and will require no previous experience with the universal barrier or self-concordance.

This talk reflects joint work with Yin Tat Lee. See http://arxiv.org/abs/1312.6677 and http://arxiv.org/abs/1312.6713.

## The Coordinated-Attack Problem Revisited

Eli Gafni, UCLA
Monday, September 29, 2014 - 1:00pm to 2:30pm
Pierce 213

The Coordinated Attack problem was posed and proved impossible in 1975. It held the promise of creating a new kind of reasoning about computing: the field of distributed algorithms.

It did not pan out that way. Only old-timers still remember this problem. There was not much to learn from it, since unlike the impossibility result of Fischer, Lynch and Patterson (FLP) for asynchronous agreement with a single fault that followed in 1983, straightforward variants of the underlying problem turned out to be trivial and uninteresting. In contrast, with FLP, the model of just a single fault still allowed for lesser coordination mechanisms than agreement that were still non-trivial. Indeed, the discovery a decade later of the connection between algebraic topology and distributed algorithms can be traced back to the FLP result. Thus FLP, rather than the Coordinated Attack, delivered on the original promise.

In this talk, Dr. Gafni will show a simple tweak to the Coordinated Attack problem that allows for some coordination. This possible coordination leads directly and simply to the FLP impossibility result, and the subsequent connection between distributed computing and algebraic topology.

## Local Reductions

Emanuele Viola, Northeastern University and Visiting Scholar at Harvard
Monday, September 22, 2014 - 1:00pm to 2:30pm
Pierce 213

We reduce non-deterministic time T > 2^n to a 3SAT instance phi of size |phi| = T polylog T such that there is an explicit circuit C that on input an index i of log |phi| bits outputs the i-th clause, and each output bit of C depends on O(1) inputs bits. The previous best result was C in NC^1. Even in the simpler setting of |phi| = poly(T) the previous best result was C in AC^0.

We also somewhat optimize the complexity of PCP reductions.

As an application, we tighten Williams' connection between satisfiability (or derandomization) and circuit lower bounds. The original connection employed previous reductions as black boxes. If one instead employs the reductions above as black boxes then the connection is more direct.

Based on joint works with Hamid Jahanjou and Eric Miles, and with Eli Ben-Sasson.