# TOC Seminar '17-'18

## "Quantum Supremacy" and the Complexity of Random Circuit Sampling

Bill Fefferman  - UC Berkeley

Monday, May 14, 2018 - 1:00pm to 2:30pm

Maxwell-Dworkin 119

A critical goal for the field of quantum computing is quantum supremacy -- a demonstration of a quantum computation that is prohibitively hard for classical computers. Besides dispelling any skepticism about the experimental viability of quantum computers, quantum supremacy also provides a test of quantum theory in the realm of high complexity. A leading near-term candidate, put forth by the Google/UCSB team is sampling from the probability distributions of randomly chosen quantum circuits, called Random Circuit Sampling (RCS).

While RCS was defined with experimental realization in mind, we give complexity-theoretic evidence of classical hardness of RCS, placing it on par with the best theoretical proposals for supremacy. Specifically, we show that RCS satisfies an average-case hardness condition -- computing output probabilities of typical quantum circuits is as hard as computing them in the worst-case, and therefore #P-hard. Our reduction exploits the polynomial structure in the output amplitudes of random quantum circuits, enabled by the Feynman path integral. We also describe a new verification measure which in some formal sense maximizes the information gained from experimental samples.

## Sampling-based Approximation Algorithms for Data Analysis using Rademacher Averages

Matteo Riondato  - Two Sigma Investments

Monday, May 7, 2018 - 1:00pm to 2:30pm

Maxwell-Dworkin 223

A powerful way to speed up the exploratory analysis of a large dataset is to analyze a random sample of it fitting in the main memory of the machine. How large should this sample be to guarantee that the results of the exploration are accurate? Previous approaches used classic tail inequalities and the union bound, resulting in very loose trade-offs between sample size and accuracy.  In recent works we showed how to use Rademacher averages, fundamental concepts from statistical learning theory, as a key component of sampling-based approximation algorithms for important tasks in data analysis. Rademacher averages were long considered to be only of theoretical interest, and only for the study of machine learning algorithms, and our results instead are witnesses to their wide applicability and practical impact.

In this talk, I will first introduce the key concepts and results involving Rademacher averages, and then show how we used them for sampling-based algorithms for frequent pattern mining and for estimating centrality values in large graphs.

## Towards faster and optimal algorithms for submodular maximization

Alina Ene  - Boston University

Monday, April 16, 2018 - 1:00pm to 2:30pm

Pierce 213

The core mathematical problem underpinning many of the applications of submodularity is the meta problem of maximizing a submodular objective function subject to some constraints. Constrained submodular maximization has received considerable attention over the years, leading to the development of approximation algorithms for a rich class of constraints and numerous applications in machine learning, data mining, and beyond.

Despite this remarkable progress, several fundamental challenges remain, both in terms of understanding the approximability of central problems and improving the running times. In this talk, we highlight recent progress on both fronts that leverage both continuous and discrete optimization. Our algorithms use fast iterative algorithms and data structures to obtain nearly-optimal approximation guarantees in nearly-linear time for important classes of constraints, such as matroid and knapsack constraints.

This talk is based on joint work with Huy Nguyen (Northeastern University).

## Hitting Sets with Near-Optimal Error for Read-Once Branching Programs

Gil Cohen  - Princeton University

Monday, April 2, 2018 - 1:00pm to 2:30pm

Pierce 213

In this talk, we present an explicit construction of a hitting set for read-once branching programs that has near-optimal dependence on the error parameter. This yields the first improvement over the hitting set induced from Nisan's seminar work (Combinatorica'92).

Joint work with Mark Braverman and Sumegha Garg.

## Nearly Work-Efficient Parallel Algorithm for Digraph Reachability

Jeremy Fineman  - Georgetown University

Monday, March 26, 2018 - 1:00pm to 2:30pm

Pierce 213

One of the simplest problems on directed graphs is that of identifying the set of vertices reachable from a designated source vertex. This problem can be solved easily sequentially by performing a graph search, but efficient parallel algorithms have eluded researchers for decades. For sparse high-diameter graphs in particular, there is no previously known parallel algorithm with both nearly linear work and nontrivial parallelism. This talk presents the first such algorithm. Specifically, this talk presents a randomized parallel algorithm for digraph reachability with work O(m*polylog(n)) and span O(n^{2/3}*polylog(n)), with high probability, where n is the number of vertices and m is the number of arcs.

The main technical contribution is an efficient Monte Carlo algorithm that, through the addition of O(n*polylog(n)) shortcuts, reduces the diameter of the graph to O(n^{2/3}*polylog(n)) with high probability. While there are existing algorithms that achieve similar guarantees, they are not efficient; the sequential algorithms have running times much worse than linear. This talk presents a surprisingly simple sequential algorithm with running time O(m log^2(n)) that achieves the stated diameter reduction. Parallelizing that algorithm yields the main result, but doing so involves overcoming several additional challenges.

## The Distance Oracle Hierarchy

Greg Bodwin  - MIT

Monday, February 26, 2018 - 1:00pm to 2:30pm

Pierce 213

A lot of well-studied problems in CS Theory are about making “sketches” of graphs that occupy much less space than the graph itself, but where the shortest path distances of the graph can still be approximately recovered from the sketch.  For example, in the literature on Spanners, we seek a sparse subgraph whose distance metric approximates that of the original graph.  In Emulator literature, we relax the requirement that the approximating graph is a subgraph.  Most generally, in Distance Oracles, the sketch can be an arbitrary data structure, so long as it can approximately answer queries about the pairwise distance between nodes in the original graph.

Research on these objects typically focuses on optimizing the worst-case tradeoff between the quality of the approximation and the amount of space that the sketch occupies.  In this talk, we will survey a recent leap in understanding about this tradeoff, overturning the conventional wisdom on the problem.  Specifically, the tradeoff is not smooth, but rather it follows a new discrete hierarchy in which the quality of the approximation that can be obtained jumps considerably at certain predictable thresholds.  The proof is graph-theoretic and relies on building large families of graphs with large discrepancies in their metrics.

## A Constant-factor Approximation Algorithm for the Asymmetric Traveling Salesman Problem

Ola Svensson  - EPFL

Monday, February 12, 2018 - 1:00pm to 2:30pm

Pierce 213

We give a constant-factor approximation algorithm for the asymmetric traveling salesman problem. Our approximation guarantee is analyzed with respect to the standard LP relaxation, and thus our result confirms the conjectured constant integrality gap of that relaxation.

Our techniques build upon the constant-factor approximation algorithm for the special case of node-weighted metrics. Specifically, we give a generic reduction to structured instances that resemble but are more general than those arising from node-weighted metrics. For those instances, we then solve Local-Connectivity ATSP, a problem known to be equivalent (in terms of constant-factor approximation) to the asymmetric traveling salesman problem.

This is joint work with Jakub Tarnawski and László Végh.

## k-server and metrical task systems via multiscale entropic regularization

Sebastien Bubeck - Microsoft Research

Monday, January 29, 2018 - 1:00pm to 2:30pm

Pierce 213

The k-server problem and its generalization to metrical task systems are classical problems in online computation. Both problems have a long-standing conjecture about the optimal competitive ratio in arbitrary metric spaces, namely O(log(k)) for k-server and O(log(n)) for MTS. In this talk I will describe a new approach to these questions based on the mirror descent strategy, a well-known convex optimization tool that has been used extensively in the last decade for online learning, but which has not been so prevalent (yet) in online algorithms. In addition to the simplicity of the framework, it also improves the state of the art bound for hierarchically separated trees to respectively O(log^2(k)) and O(log(n)) (this implies the same bound up to an additional log(n) factor for arbitrary metric spaces).

Joint work with Michael B. Cohen, James R. Lee, Yin Tat Lee, and Aleksander Madry.

## Constructive Discrepancy Minimization with Hereditary L2 Guarantees

Kasper Green Larsen - Aarhus University

Monday, November 27, 2017 - 1:00pm to 2:30pm

Maxwell-Dworkin 221

In discrepancy minimization problems, we are given a family of sets F={S1,...,Sm}, with each SiF a subset of some universe U={u1,...,un} of n elements. The goal is to find a coloring χ : U→{-1,+1} of the elements of U such that each set SF is colored as evenly as possible. Two classic measures of discrepancy are -discrepancy defined as disc(F,χ) : =maxSFuiSχ(ui)| and2-discrepancy defined as disc2(F,χ) : =((1/|F|)ΣSFuiSχ(ui))2)1/2. Breakthrough work by Bansal [FOCS'10] gave a polynomial time algorithm, based on rounding an SDP, for finding a coloring χ such that disc(F,χ)=O(lgn·herdisc(F)) where herdisc(F) is the hereditary -discrepancy of F. We complement his work by giving a polynomial time algorithm for finding a coloring χ such disc2(F,χ)=O((lgn)1/2·herdisc2(F)) where herdisc2(F) is the hereditary 2-discrepancy of F. Interestingly, our algorithm avoids solving an SDP and instead relies simply on computing eigendecompositions of matrices. To prove that our algorithm has the claimed guarantees, we also prove new inequalities relating both herdisc and herdisc2 to the eigenvalues of the incidence matrix corresponding to F. Our inequalities improve over previous work by Nikolov and Talwar [SODA'15] and by Chazelle and Lvov [SCG'00]. We believe the inequalities are of independent interest as powerful tools for proving hereditary discrepancy lower bounds.

## Hashing-based-Estimators for Kernel Density in High Dimensions

Moses Charikar - Stanford University

Monday, November 20, 2017 - 1:00pm to 2:30pm

Maxwell-Dworkin 221

Given a set of points in d dimensions, imagine putting a Gaussian distribution around each of them. How quickly can we evaluate the sum of these Gaussian densities at a new point? This computational problem (and its generalization for densities other than the Gaussian) is called kernel density estimation. This problem arises as a basic primitive in statistics (non-parametric density estimation), machine learning (kernel methods) and scientific computing (physical simulations).

The batch version of this question (compute the sum of n kernels at m given query points) is addressed by the celebrated fast multiple method from the late 80s which has linear or near-linear complexity for points in 2 and 3 dimensions. The high dimensional case has been challenging because at a high level, typical space partitioning approaches have an exponential dependence on the dimension.

In this talk, I will show that locality sensitive hashing (introduced in the late 90s for the approximate nearest neighbor problem in high dimensions) can be adapted to devise unbiased estimators for kernel density in high dimensions.

Joint work with Paris Siminelakis

## A proof of CSP Dichotomy conjecture

Dmitriy Zhuk - Moscow State University

Monday, November 6, 2017 - 1:00pm to 2:30pm

Maxwell-Dworkin 221

Many natural combinatorial problems can be expressed as constraint satisfaction problems. This class of problems is known to be NP-complete in general, but certain restrictions on the form of the constraints can ensure tractability. The standard way to parameterize interesting subclasses of the constraint satisfaction problem is via finite constraint languages. The main problem is to classify those subclasses that are solvable in polynomial time and those that are NP-complete. It was conjectured that if a constraint language has a weak near unanimity polymorphism then the corresponding constraint satisfaction problem is tractable, otherwise it is NP-complete. The hardness result is known since 2001. We present an algorithm that solves Constraint Satisfaction Problem in polynomial time for constraint languages having a weak near unanimity polymorphism, which proves the remaining part of the conjecture.

## Learning Sums of Independent Commonly Supported Integer Random Variables (SICSIRVs)

Rocco Servedio - Columbia University

Monday, October 23, 2017 - 1:00pm to 2:30pm

Maxwell-Dworkin 221

This talk is about learning Sums of Independent Commonly Supported Integer Random Variables, or SICSIRVs.

For S a finite set of natural numbers, an "S-supported SICSIRV" is a random variable which is the sum of N independent random variables each of which is supported on S. So, for example, if S = {0,1}, then an S-supported SICSIRV is a sum of N independent but not necessarily identically distributed Bernoulli random variables (a.k.a. a Poisson Binomial random variable). Daskalakis et. al. (STOC 2012, FOCS 2013) have shown that for S={0,1,...,k-1}, there is an algorithm for learning an unknown S-SICSIRV using poly(k,1/\epsilon) draws from the distribution and running in poly(k,1/\epsilon) time, independent of N.

In this work we investigate the problem of learning an unknown S-SICSIRV where S={a_1,...,a_k} may be any finite set. We show that

* when |S|=3, it is possible to learn any S-SICSIRV with poly(1/\epsilon) samples, independent of the values a_1,a_2,a_3;

* when |S| >= 4, it is possible to learn any S-SICSIRV with poly(1/\epsilon, \log \log a_k) samples; and

* already when |S|=4, at least \log \log a_k samples may be required for learning.

We thus establish a sharp transition in the complexity of learning S-SICSIRVs when the size of S goes from three to four.

Joint work with Anindya De (Northwestern) and Phil Long (Google).

## Practical Locally Private Heavy Hitters

Uri Stemmer - Harvard University

Monday, September 25, 2017 - 1:00pm to 2:30pm

Maxwell-Dworkin 221

We present new heavy-hitters algorithms satisfying local-differential-privacy, with optimal or near-optimal worst-case error and running time. In our algorithms, the server running time is $\tilde O(n)$ and user running time is $\tilde O(1)$, hence improving on the prior state-of-the-art result of Bassily and Smith [STOC 2015] requiring $O(n^{5/2})$ server time and $O(n^{3/2})$ user time. With a typically large number of participants in local algorithms ($n$ in the millions), this reduction in time complexity is crucial for making locally-private heavy-hitters algorithms usable in practice.

Joint work with Raef Bassily, Kobbi Nissim, and Abhradeep Thakurta.

## On the Quantitative Hardness of CVP

Noah Stephens-Davidowitz - Princeton University

Monday, September 11, 2017 - 1:00pm to 2:30pm

Maxwell-Dworkin 221

For odd integers p >= 1 (and p = \infty), we show that the Closest Vector Problem in the \ell_p norm (CVP_p) over rank n lattices cannot be solved in 2^{(1-\eps) n} time for any constant \eps > 0 unless the Strong Exponential Time Hypothesis (SETH) fails. We then extend this result to almost all'' values of p \geq 1, not including the even integers. This comes tantalizingly close to settling the quantitative time complexity of the important special case of \CVP_2 (i.e., \CVP in the Euclidean norm), for which a 2^{n +o(n)}-time algorithm is known.

We also show a similar SETH-hardness result for SVP_\infty; hardness of approximating CVP_p to within some constant factor under the so-called Gap-ETH assumption; and other hardness results for CVP_p and CVPP_p for any 1 <= p < \infty under different assumptions.