# Makrand Sinha

Networks & Optimization Group
Centrum Wiskunde & Informatica
Science Park 123
1098 XG Amsterdam
NETHERLANDS

Email: Makrand.Sinha then the at sign followed by cwi nl separated by dots

I am a Simons-Berkeley postdoctoral fellow at the Simons Institute at UC Berkeley. Previously, I was a postdoctoral researcher in the Networks and Optimization group at CWI in Amsterdam.

My research is in the area of theoretical computer science. My primary research interests lie in the foundations of quantum and classical computation and optimization, and specifically in understanding the relative power of quantum vs classical algorithms and communication protocols, understanding limitations of various approaches in optimization such as Linear or Semidefinite Programs, and designing algorithms for various optimization problems.

I received my PhD in August 2018 from the Paul G. Allen School of Computer Science & Engineering at the University of Washington in Seattle under the guidance of Anup Rao.

My Curriculum Vitae: CV [PDF]

## Publications

• $$k$$-Forrelation Optimally Separates Quantum and Classical Query Complexity
Nikhil Bansal and Makrand Sinha
To appear in STOC '21. Contributed Talk at QIP '21.
• [arXiv] [ECCC] [Video]

Aaronson and Ambainis (SICOMP 18) showed that any partial function on $$N$$ bits that can be computed with an advantage $$\delta$$ over a random guess by making $$q$$ quantum queries, can also be computed classically with an advantage $$\delta/2$$ by a randomized decision tree making $${O}_q(N^{1-\frac{1}{2q}}\delta^{-2})$$ queries. Moreover, they conjectured the $$k$$-Forrelation problem --- a partial function that can be computed with $$q = \lceil k/2 \rceil$$ quantum queries --- to be a suitable candidate for exhibiting such an extremal separation.

We prove their conjecture by showing a tight lower bound of $$\widetilde{\Omega}(N^{1-1/k})$$ for the randomized query complexity of $$k$$-Forrelation, where the advantage $$\delta = 2^{-O(k)}$$. By standard amplification arguments, this gives an explicit partial function that exhibits an $$O_\epsilon(1)$$ vs $$\Omega(N^{1-\epsilon})$$ separation between bounded-error quantum and randomized query complexities, where $$\epsilon>0$$ can be made arbitrarily small. Our proof also gives the same bound for the closely related but non-explicit $$k$$-Rorrelation function introduced by Tal (FOCS 20).

Our techniques rely on classical Gaussian tools, in particular, Gaussian interpolation and Gaussian integration by parts, and in fact, give a more general statement. We show that to prove lower bounds for $$k$$-Forrelation against a family of functions, it suffices to bound the $$\ell_1$$-weight of the Fourier coefficients between levels $$k$$ and $$(k-1)k$$. We also prove new interpolation and integration by parts identities that might be of independent interest in the context of rounding high-dimensional Gaussian vectors.
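
For readers who want to experiment with the quantity being bounded, here is a minimal Python sketch that evaluates the $$k$$-fold forrelation of $$k$$ functions $$f_1,\ldots,f_k:\{0,1\}^n \to \{\pm 1\}$$ by brute force. It assumes the standard definition from Aaronson and Ambainis, $$2^{-(k+1)n/2}\sum_{x_1,\ldots,x_k} f_1(x_1)(-1)^{x_1\cdot x_2}f_2(x_2)\cdots(-1)^{x_{k-1}\cdot x_k}f_k(x_k)$$, which is not spelled out in the abstract. The function names `hadamard` and `forrelation` are illustrative choices, and the dense matrix approach is only practical for small $$n$$.

```python
import numpy as np

def hadamard(n):
    """Unnormalized 2^n x 2^n matrix with entries (-1)^(x . y)."""
    H = np.array([[1.0]])
    base = np.array([[1.0, 1.0], [1.0, -1.0]])
    for _ in range(n):
        H = np.kron(H, base)
    return H

def forrelation(fs):
    """k-fold forrelation of truth tables fs[0..k-1] with +/-1 entries.

    Assumed definition (see lead-in): the normalized sum over x_1..x_k of
    f_1(x_1) (-1)^(x_1.x_2) f_2(x_2) ... (-1)^(x_{k-1}.x_k) f_k(x_k).
    """
    k = len(fs)
    N = len(fs[0])
    n = N.bit_length() - 1          # N = 2^n
    H = hadamard(n)
    v = np.asarray(fs[-1], dtype=float)
    # Fold the sum from the innermost variable x_k outwards to x_1.
    for f in reversed(fs[:-1]):
        v = np.asarray(f, dtype=float) * (H @ v)
    return v.sum() / 2 ** ((k + 1) * n / 2)

# f(x1, x2) = (-1)^(x1 x2) equals its own normalized Hadamard transform.
f = [1, 1, 1, -1]
ones = [1, 1, 1, 1]
print(forrelation([f, f]), forrelation([ones, ones]))
```

In the usage lines, the pair `(f, f)` is maximally forrelated while the all-ones pair is not, which is the kind of gap the promise problem asks an algorithm to detect.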

• Majorizing Measures for the Optimizer
Sander Borst, Daniel Dadush, Neil Olver and Makrand Sinha
Appeared in ITCS '21.
• [arXiv] [Video]

The theory of majorizing measures, extensively developed by Fernique, Talagrand and many others, provides one of the most general frameworks for controlling the behavior of stochastic processes. In particular, it can be applied to derive quantitative bounds on the expected suprema and the degree of continuity of sample paths for many processes.

One of the crowning achievements of the theory is Talagrand’s tight alternative characterization of the suprema of Gaussian processes in terms of majorizing measures. The proof of this theorem was difficult, and thus considerable effort was put into the task of developing both shorter and easier to understand proofs. A major reason for this difficulty was considered to be the theory of majorizing measures itself, which had the reputation of being opaque and mysterious. As a consequence, most recent treatments of the theory (including by Talagrand himself) have eschewed the use of majorizing measures in favor of a purely combinatorial approach (the generic chaining) where objects based on sequences of partitions provide roughly matching upper and lower bounds on the desired expected supremum.

In this paper, we return to majorizing measures as a primary object of study, and give a viewpoint that we think is very natural and clarifying from an optimization perspective. As our main contribution, we give an algorithmic proof of the majorizing measures theorem based on two parts:

- We make the simple (but apparently new) observation that finding the best majorizing measure can be cast as a convex program. This also allows for efficiently computing the measure using off-the-shelf methods from convex optimization.

- We obtain tree-based upper and lower bound certificates by rounding, in a series of steps, the primal and dual solutions to this convex program.

While duality has conceptually been part of the theory since its beginnings, as far as we are aware no explicit link to convex optimization has been previously made.

• Online Discrepancy Minimization for Stochastic Arrivals
Nikhil Bansal, Haotian Jiang, Raghu Meka, Sahil Singla and Makrand Sinha
Appeared in SODA '21.
• [arXiv] [Video]

In the stochastic online vector balancing problem, vectors $$v_1,v_2,\ldots,v_T$$ chosen independently from an arbitrary distribution in $$\mathbb{R}^n$$ arrive one-by-one and must be immediately given a $$\pm$$ sign. The goal is to keep the norm of the discrepancy vector, i.e., the signed prefix-sum, as small as possible for a given target norm.

We consider some of the most well-known problems in discrepancy theory in the above online stochastic setting, and give algorithms that match the known offline bounds up to $$\mathrm{polylog}(nT)$$ factors. This substantially generalizes and improves upon the previous results of Bansal, Jiang, Singla, and Sinha (STOC '20). In particular, for the Komlós problem where $$\|v_t\|_2\leq 1$$ for each $$t$$, our algorithm achieves $$\widetilde{O}(1)$$ discrepancy with high probability, improving upon the previous $$\widetilde{O}(n^{3/2})$$ bound. For Tusnády's problem of minimizing the discrepancy of axis-aligned boxes, we obtain an $$O(\log^{d+4} T)$$ bound for an arbitrary distribution over points. Previous techniques only worked for product distributions and gave a weaker $$O(\log^{2d+1} T)$$ bound. We also consider the Banaszczyk setting, where given a symmetric convex body $$K$$ with Gaussian measure at least $$1/2$$, our algorithm achieves $$\widetilde{O}(1)$$ discrepancy with respect to the norm given by $$K$$ for input distributions with sub-exponential tails.

Our results are based on a new potential function approach. Previous techniques consider a potential that penalizes large discrepancy, and greedily chooses the next color to minimize the increase in potential. Our key idea is to introduce a potential that also enforces constraints on how the discrepancy vector evolves, allowing us to maintain certain anti-concentration properties. We believe that our techniques to control the evolution of states could find other applications in stochastic processes and online algorithms. For the Banaszczyk setting, we further enhance this potential by combining it with ideas from generic chaining. Finally, we also extend these results to the setting of online multi-color discrepancy.
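
As a rough illustration of the baseline that the abstract attributes to previous techniques (a potential that penalizes large discrepancy, with each sign chosen greedily), the following Python sketch signs vectors online. The specific potential $$\sum_i \cosh(\lambda d_i)$$, the parameter `lam`, and the name `greedy_signs` are assumptions made for this example. It does not implement the new potential from the paper, which additionally constrains how the discrepancy vector evolves.

```python
import numpy as np

def greedy_signs(vectors, lam=0.5):
    """Sign each arriving vector to minimize an assumed cosh potential.

    vectors: array of shape (T, n), one row per arrival.
    Returns the chosen signs and the largest l_infinity norm seen
    over all signed prefix sums.
    """
    d = np.zeros(vectors.shape[1])      # current discrepancy vector
    signs = []
    max_disc = 0.0
    for v in vectors:
        plus = np.cosh(lam * (d + v)).sum()
        minus = np.cosh(lam * (d - v)).sum()
        s = 1 if plus <= minus else -1  # greedy: smaller potential wins
        d = d + s * v
        signs.append(s)
        max_disc = max(max_disc, float(np.abs(d).max()))
    return np.array(signs), max_disc

# Usage: T i.i.d. vectors drawn uniformly from [-1, 1]^n.
rng = np.random.default_rng(0)
signs, disc = greedy_signs(rng.uniform(-1.0, 1.0, size=(1000, 5)))
print(disc)
```

Replacing the greedy choice with independent random signs gives a simple point of comparison for the same input.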

• Online Vector Balancing and Geometric Discrepancy
Nikhil Bansal, Haotian Jiang, Sahil Singla and Makrand Sinha
Appeared in STOC '20. Invited talk at TCS+.
• [arXiv] [Video]

We consider an online vector balancing question where $$T$$ vectors, chosen from an arbitrary distribution over $$[-1,1]^n$$, arrive one-by-one and must be immediately given a $$\pm$$ sign. The goal is to keep the discrepancy, i.e., the $$\ell_{\infty}$$-norm of any signed prefix-sum, as small as possible. A concrete example of this question is the online interval discrepancy problem where $$T$$ points are sampled one-by-one uniformly in the unit interval $$[0,1]$$, and the goal is to immediately color them $$\pm$$ such that every sub-interval always remains nearly balanced. As random coloring incurs $$\Omega(T^{1/2})$$ discrepancy, while the offline bounds are $$\Theta((n\log T)^{1/2})$$ for vector balancing and $$1$$ for interval balancing, a natural question is whether one can (nearly) match the offline bounds in the online setting for these problems. One must exploit the stochasticity, since in the worst case the discrepancy is known to be $$\Omega(T^{1/2})$$ for any online algorithm.

In a special case of online vector balancing, Bansal and Spencer (arXiv '19) recently showed an $$O(\sqrt{n}\log T)$$ bound when each coordinate is independently chosen. When there are dependencies among the coordinates, as in the interval discrepancy problem, the problem becomes much more challenging, as evidenced by a recent work of Jiang, Kulkarni, and Singla (arXiv '19) that gives a non-trivial $$O(T^{1/\log\log T})$$ bound for online interval discrepancy. Although this beats random coloring, it is still far from the offline bound.

In this work, we introduce a new framework that allows us to handle online vector balancing even when the input distribution has dependencies across coordinates. In particular, this lets us obtain a $$\mathrm{poly}(n, \log T)$$ bound for online vector balancing under arbitrary input distributions, and a $$\mathrm{polylog} (T)$$ bound for online interval discrepancy. Our framework is powerful enough to capture other well-studied geometric discrepancy problems; e.g., we obtain a $$\mathrm{poly}(\log^d (T))$$ bound for the online $$d$$-dimensional Tusnády's problem. All our bounds are tight up to polynomial factors.

A key new technical ingredient in our work is an anti-concentration inequality for sums of pairwise uncorrelated random variables, which might also be of independent interest.

• Exponential Separation between Quantum Communication and Logarithm of Approximate Rank
Makrand Sinha and Ronald de Wolf
Appeared in FOCS '19. Contributed talk at QIP '20 (as part of a joint submission).
• [arXiv] [ECCC] [Video]

Chattopadhyay, Mande and Sherif (ECCC 2018) recently exhibited a total Boolean function, the sink function, that has polynomial approximate rank and polynomial randomized communication complexity. This gives an exponential separation between randomized communication complexity and logarithm of the approximate rank, refuting the log-approximate-rank conjecture. We show that even the quantum communication complexity of the sink function is polynomial, thus also refuting the quantum log-approximate-rank conjecture.

Our lower bound is based on the fooling distribution method introduced by Rao and Sinha (ECCC 2015) for the classical case and extended by Anshu, Touchette, Yao and Yu (STOC 2017) for the quantum case. We also give a new proof of the classical lower bound using the fooling distribution method.

• Simplified Separation of Information and Communication
Anup Rao and Makrand Sinha
Appeared in Theory of Computing, 2018.
• [Journal] [ECCC]

We give an example of a boolean function whose information complexity is exponentially smaller than its communication complexity. This was first proven recently by Ganor, Kol and Raz (J. ACM 2016) and our work gives a simpler proof of the same result. In the course of this simplification, we make several new contributions: we introduce a new communication lower bound technique, the notion of a fooling distribution, which allows us to separate information and communication complexity, and we also give a more direct proof for the information complexity upper bound.

We also prove a generalization of Shearer's Lemma that may be of independent interest. A version of Shearer's original lemma bounds the expected mutual information of a subset of random variables with another random variable, when the subset is chosen independently of all the random variables that are involved. Our generalization allows some dependence between the random subset and the random variables involved, and still gives us similar bounds with an appropriate error term.

• Edge Estimation with Independent Set Oracles
Paul Beame, Sariel Har-Peled, Sivaramakrishnan Natarajan Ramamoorthy, Cyrus Rashtchian and Makrand Sinha
Appeared in ITCS '18. Full version in ACM Transactions on Algorithms, 2020.
• [arXiv]

We study the problem of estimating the number of edges in a graph with access to only an independent set oracle. Independent set queries draw motivation from group testing and have applications to the complexity of decision versus counting problems. We give two algorithms to estimate the number of edges in an $$n$$-vertex graph: one that uses only $$\mathrm{polylog}(n)$$ bipartite independent set queries, and another one that uses $${n}^{2/3} \cdot \mathrm{polylog}(n)$$ independent set queries.
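
To make the query model concrete, here is a small Python sketch of the two oracles, assuming the usual definitions: an independent set query asks whether a vertex set spans no edge, and a bipartite independent set query asks about edges between two disjoint vertex sets. The class `GraphOracle`, its method names, and the convention of returning `True` when an edge exists are illustrative assumptions. The counting routine is a deliberately naive baseline that spends one query per vertex pair. It is not either of the paper's algorithms.

```python
from itertools import combinations

class GraphOracle:
    """Hides a simple undirected graph behind query access only."""

    def __init__(self, edges):
        self.edges = {frozenset(e) for e in edges}
        self.queries = 0                 # number of oracle calls so far

    def is_independent(self, S):
        """IS query: True iff no edge has both endpoints in S."""
        self.queries += 1
        S = set(S)
        return not any(e <= S for e in self.edges)

    def bis_has_edge(self, U, V):
        """BIS query on disjoint U, V: True iff some edge crosses U-V."""
        self.queries += 1
        U, V = set(U), set(V)
        if U & V:
            raise ValueError("U and V must be disjoint")
        return any(len(e & U) == 1 and len(e & V) == 1 for e in self.edges)

def count_edges_naive(oracle, n):
    """Exact edge count using one BIS query per pair of vertices."""
    return sum(oracle.bis_has_edge({u}, {v})
               for u, v in combinations(range(n), 2))

# Usage: a path 0-1-2 plus an isolated vertex 3.
oracle = GraphOracle([(0, 1), (1, 2)])
print(count_edges_naive(oracle, 4), oracle.queries)
```

The naive routine makes $$n(n-1)/2$$ queries, which is the cost that the $$\mathrm{polylog}(n)$$ and $$n^{2/3}\cdot\mathrm{polylog}(n)$$ bounds in the abstract improve on.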

• Lower Bounds for Approximating the Matching Polytope
Makrand Sinha
Appeared in SODA '18. Invited talk at ISMP '18.
• [arXiv] [ECCC]

We prove that any extended formulation that approximates the matching polytope on $$n$$-vertex graphs up to a factor of $$(1+\epsilon)$$ for any $$\frac2n \le \epsilon \le 1$$ must have at least $${n} \choose {{\alpha}/{\epsilon}}$$ defining inequalities where $$0<\alpha<1$$ is an absolute constant. This is tight as exhibited by the $$(1+\epsilon)$$ approximating linear program obtained by dropping the odd set constraints of size larger than $$({1+\epsilon})/{\epsilon}$$ from the description of the matching polytope. Previously, a tight lower bound of $$2^{\Omega(n)}$$ was only known for $$\epsilon = O\left(\frac{1}{n}\right)$$ [Rothvoss, STOC '14; Braun and Pokutta, IEEE Trans. Information Theory '15] whereas for $$\frac2n \le \epsilon \le 1$$, the best lower bound was $$2^{\Omega\left({1}/{\epsilon}\right)}$$ [Rothvoss, STOC '14]. The key new ingredient in our proof is a close connection to the non-negative rank of a lopsided version of the unique disjointness matrix.
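
The tightness claim can be checked with a short calculation. The notation below ($$G=(V,E)$$, $$\delta(v)$$ for the edges incident to $$v$$, $$E[U]$$ for the edges inside $$U$$, and the names $$P_M$$, $$P_\epsilon$$) is introduced here for illustration and does not appear in the abstract. Edmonds' description of the matching polytope is

$$P_M = \left\{ x \in \mathbb{R}^E_{\ge 0} \;:\; x(\delta(v)) \le 1 \ \ \forall v \in V, \quad x(E[U]) \le \tfrac{|U|-1}{2} \ \ \forall U \subseteq V \text{ with } |U| \text{ odd} \right\}$$

and the relaxation $$P_\epsilon$$ keeps the degree constraints but only those odd set constraints with $$|U| \le (1+\epsilon)/\epsilon$$. For a dropped odd set $$U$$, summing the degree constraints over $$v \in U$$ counts every edge of $$E[U]$$ twice, so any $$x \in P_\epsilon$$ satisfies

$$x(E[U]) \;\le\; \frac{|U|}{2} \;=\; \frac{|U|}{|U|-1}\cdot\frac{|U|-1}{2} \;\le\; (1+\epsilon)\,\frac{|U|-1}{2},$$

where the last step uses that $$\frac{|U|}{|U|-1} \le 1+\epsilon$$ is equivalent to $$|U| \ge (1+\epsilon)/\epsilon$$. Hence $$x/(1+\epsilon) \in P_M$$ for every $$x \in P_\epsilon$$, which is the $$(1+\epsilon)$$ approximation guarantee.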

• A Direct-sum Theorem for Read-Once Branching Programs
Anup Rao and Makrand Sinha
Appeared in RANDOM '16.
• [pdf]

We study a direct-sum question for read-once branching programs. If $$M(f)$$ denotes the minimum average memory required to compute a function $$f(x_1,x_2, \dotsc, x_n)$$, how much memory is required to compute $$f$$ on $$k$$ independent inputs that arrive in parallel? We show that when the inputs (updates) are sampled independently from some domain $$\mathcal{X}$$ and $$M(f) = \Omega(n)$$, then computing the value of $$f$$ on $$k$$ streams requires average memory at least $$\Omega\left(k \cdot \frac{M(f)}{n}\right)$$.

Our results are obtained by defining new ways to measure the information complexity of read-once branching programs. We define two such measures: the transitional and cumulative information content. We prove that any read-once branching program with transitional information content $$\mathtt{I}$$ can be simulated using average memory $$\mathcal{O}(n(\mathtt{I}+1))$$. On the other hand, if every read-once branching program with cumulative information content $$\mathtt{I}$$ can be simulated with average memory $$\mathcal{O}(\mathtt{I}+1)$$, then computing $$f$$ on $$k$$ inputs requires average memory at least $$\Omega(k \cdot (M(f)-1))$$.

• Fooling Pairs in Randomized Communication Complexity
Shay Moran, Makrand Sinha and Amir Yehudayoff
Appeared in SIROCCO '16.
• [ECCC]

Fooling pairs are one of the standard methods for proving lower bounds for deterministic two-player communication complexity. We study fooling pairs in the context of randomized communication complexity. We show that every fooling pair induces far away distributions on transcripts of private-coin protocols. We then conclude that the private-coin randomized $$\varepsilon$$-error communication complexity of a function $$f$$ with a fooling set $$\mathcal{S}$$ is at least order $$\log \frac{\log |\mathcal{S}|}{\varepsilon}$$. This is tight, for example, for the equality and greater-than functions.

• On the Communication Complexity of Greater-Than
Sivaramakrishnan Natarajan Ramamoorthy and Makrand Sinha
Appeared in Allerton '15.
• [pdf]

We give a simple information-theoretic proof that the public-coin randomized communication complexity of the greater-than function is $$\Omega(\log n)$$ for bit-strings of length $$n$$.

• Constructing a Pseudorandom Generator Requires an Almost Linear Number of Calls
Thomas Holenstein and Makrand Sinha
In proceedings of FOCS '12.
• [arXiv]

We show that a black-box construction of a pseudorandom generator from a one-way function needs to make $$\Omega(n/\log(n))$$ calls to the underlying one-way function. The bound even holds if the one-way function is guaranteed to be regular. In this case it matches the best known construction due to Goldreich, Krawczyk, and Luby (SIAM J. Comp. 22, 1993), which uses $$\mathcal{O}(n/\log(n))$$ calls.

• Vertices of Degree $$k$$ in Random Unlabeled Trees
Konstantinos Panagiotou and Makrand Sinha
In proceedings of EuroComb '09. Full version appeared in Journal of Graph Theory, 2012.
• [pdf]

Let $$\mathcal{H}_n$$ be the class of unlabeled trees with $$n$$ vertices, and denote by $$\mathbf{H}_n$$ a tree that is drawn uniformly at random from this set. The asymptotic behavior of the random variable $$\deg_k(\mathbf{H}_n)$$ that counts vertices of degree $$k$$ in $$\mathbf{H}_n$$ was studied, among others, by Drmota and Gittenberger, who showed that this quantity satisfies a central limit theorem. This result provides a very precise characterization of the "central region" of the distribution, but does not give any non-trivial information about its tails.

In this work we study further the number of vertices of degree $$k$$ in $$\mathbf{H}_n$$. In particular, for $$k = \mathcal{O}\left(\sqrt{\frac{\log n}{\log\log n}}\right)$$ we show exponential-type bounds for the probability that $$\deg_k(\mathbf{H}_n)$$ deviates from its expectation. On the technical side, our proofs are based on the analysis of a randomized algorithm that generates unlabeled trees in the so-called Boltzmann model. The analysis of such algorithms is quite well-understood for classes of labeled graphs, see e.g. the work by Bernasconi, the first author, and Steger. Comparable algorithms for unlabeled classes are unfortunately much more complex. We demonstrate in this work that they can be analyzed very precisely for classes of unlabeled graphs as well.