Discrepancy: definitions and applications Basic results: upper/lower bounds Partial Coloring method (non-constructive) SDPs: basic method Algorithmic Spencer’s Result Lovett-Meka result Lower bounds via SDP duality (Matousek)
Classic: Geometric Discrepancy by J. Matousek Classic: Geometric Discrepancy by J. Matousek Papers: Bansal. Constructive algorithms for discrepancy minimization, FOCS 2010 Lovett, Meka. Discrepancy minimization by walking on the edges Survey with fewer technical details: Bansal. …
Study of gaps in approximating the continuous by the discrete. Study of gaps in approximating the continuous by the discrete. Original motivation: Numerical Integration/ Sampling Problem: How well can you approximate a region by discrete points
Study of gaps in approximating the continuous by the discrete. Study of gaps in approximating the continuous by the discrete. Problem: How uniformly can you distribute points in a grid. | (# points in R) - (Area of R) | should be low.
Problem: How uniformly can you distribute points in a grid. Problem: How uniformly can you distribute points in a grid. “Uniform” : For every axis-parallel rectangle R | (# points in R) - (Area of R) | should be low.
With N random samples: Error \prop 1/\sqrt{n} With N random samples: Error \prop 1/\sqrt{n} Quasi-Monte Carlo Methods: \prop Disc/n Can discrepancy be O(1) for 2d grid? No. \Omega(log n) [Schmidt …] d-dimensions: O(log^{d-1} n) [Halton-Hammersely ] \Omega(log^{(d-1)/2} n) [Roth ] \Omega(log^{(d-1)/2 + \eta} n [Bilyk,Lacey,Vagharshakyan’08]
Input: n points placed arbitrarily in a grid. Input: n points placed arbitrarily in a grid. Color them red/blue such that each rectangle is colored as evenly as possible Discrepancy: max over rect. R ( | # red in R - # blue in R | )
Universe: U= [1,…,n] Universe: U= [1,…,n] Subsets: S1,S2,…,Sm Color elements red/blue so each set is colored as evenly as possible. Find : [n] ! {-1,+1} to Minimize |(S)|1 = maxS | i 2 S (i) | If A is m \times n incidence matrix. Disc(A) = min_{x \in {-1,1}^n} |Ax|_\infty
Lovasz-Spencer-Vesztermgombi’86 Lovasz-Spencer-Vesztermgombi’86 Given any matrix A, and x \in R^n can round x to \tilde{x} \in Z^n s.t. |Ax – A\tilde{x}|_\infty < Herdisc(A) Proof: Round the bits one by one.
Can we find it efficiently? Can we find it efficiently? Nothing known until recently. Thm [B’10]. Can efficiently round so that Error \leq O(\sqrt{log m log n}) Herdisc(A)
Bin Packing Bin Packing Refined further by Rothvoss (Entropy rounding method)
N points in a 2-d region. N points in a 2-d region. Weights update over time. Query: Given an axis-parallel rectangle R, determine the total weight on points in R. Preprocess: Low update time (upon weight change)
Line: Line: Query = O(n) Update = 1 Query = 1 Update = O(n^2) Query = 2 Update = O(n) Query = O(log n) Update = O(log n) Recursively can get for 2-d.
Query Query Circles arbitrary rectangles aligned triangle Turns out t_q t_u \geq n^{1/2}/log^2 n ? Larsen-Green: t_q t_u \geq disc(S)^n/log^2 n
A good data structure implies D = A P A = row sparse P = Column sparse (low query time) (low update time)
Random: Color each element i independently as x(i) = +1 or -1 with probability ½ each. Thm: Discrepancy = O (n log n)1/2 Pf: For each set, expect O(n1/2) discrepancy Standard tail bounds: Pr[ | i 2 S x(i) | ¸ c n1/2 ] ¼ e-c2 Union bound + Choose c ¼ (log n)1/2 Analysis tight: Random actually incurs (n log n)1/2).
[Spencer 85]: (Six standard deviations suffice) [Spencer 85]: (Six standard deviations suffice) Always exists coloring with discrepancy · 6n1/2 (In general for arbitrary m, discrepancy = O(n1/2 log(m/n)1/2) Tight: For m=n, cannot beat 0.5 n1/2 (Hadamard Matrix, “orthogonal” sets) Inherently non-constructive proof (pigeonhole principle on exponentially large universe) Challenge: Can we find it algorithmically ? Certain algorithms do not work [Spencer] Conjecture [Alon-Spencer]: May not be possible.
U = [1,…,n] Sets: S1,S2,…,Sm U = [1,…,n] Sets: S1,S2,…,Sm Suppose each element lies in at most t sets (t << n). [Beck Fiala’ 81]: Discrepancy 2t -1. Beck Fiala Conjecture: O(t1/2) discrepancy possible Other results: O( t1/2 log t log n ) [Beck] O( t1/2 log n ) [Srinivasan] O( t1/2 log1/2 n ) [Banaszczyk]
Question: If a set system has low discrepancy (say << n1/2) Question: If a set system has low discrepancy (say << n1/2) Can we find a good discrepancy coloring ? [Charikar, Newman, Nikolov 11]: Even 0 vs. O (n1/2) is NP-Hard (Matousek): What if system has low Hereditary discrepancy? herdisc (U,S) = maxU’ ½ U disc (U’, S|U’) Robust measure of discrepancy (often same as discrepancy) Widely used: TU set systems, Geomety, …
Thm 1: Can get Spencer’s bound constructively. Thm 1: Can get Spencer’s bound constructively. That is, O(n1/2) discrepancy for m=n sets. Thm 2: If each element lies in at most t sets, get bound of O(t1/2 log n) constructively (Srinivasan’s bound) Discrepancy · O(log (mn)) Hereditary discrepancy.
Not clear how to use. Not clear how to use. Linear Program is useless. Can color each element ½ red and ½ blue. Discrepancy of each set = 0! SDPs (LP on vi ¢ vj, cannot control dimension of v’s) | i 2 S vi |2 · n 8 S |vi|2 = 1 Intended solution vi = (+1,0,…,0) or (-1,0,…,0). Trivially feasible: vi = ei (all vi’s orthogonal)
SDP very helpful if “tighter” bounds needed for some sets. SDP very helpful if “tighter” bounds needed for some sets. |i 2 S vi |2 · 2 n | i 2 S’ vi|2 · n/log n |vi|2 · 1 Not apriori clear why one can do this. Entropy Method. Algorithm will construct coloring over time and use several SDPs in the process.
Introduction Introduction The Method Low Hereditary discrepancy -> Good coloring Additional Ideas Spencer’s O(n1/2) bound
Can be improved to O(\sqrt{n})/2^n Can be improved to O(\sqrt{n})/2^n If you pick a random {-1,1} coloring s w.p. say >= ½ |a \cdot s| \leq c \sqrt{n} 2^{n-1} colorings s, with |a\cdot s| \leq c \sqrt{n}
Easy: 1/poly(n) (How?) Easy: 1/poly(n) (How?) Answer: Pick any poly(n) colorings. [Karmarkar-Karp’81]: \approx 1/n^log n Remark: {-1,+1} not enough. Really need color 0 also. E.g. a_1 = 1, a_2=…=a_n = 1/(2n)
There is a {-1,0,1} coloring with at least There is a {-1,0,1} coloring with at least n/2 {-1,1}’s s.t. \sum_i a_i s_i \leq n/2^{n/5} Make buckets of size 2n/2^{n/5} At least 2^{4n/5} sums fall in same bucket Claim: Some two s’ and s’’ in same bucket and differ in at least n/2 coordinates Again consider s = (s’-s’’)/2
Claim: Any set of 2^{4n/5} vertices of the Claim: Any set of 2^{4n/5} vertices of the [Kleitman’66] Isoperimetry for cube. Hamming ball B(v,r) has the smallest diameter for a given number of vertices. |B(v,n/4)| < 2^{4n/5}
Hereditary disc. ) the following SDP is feasible Hereditary disc. ) the following SDP is feasible
Lemma: If g 2 Rn is random Gaussian. For any v 2 Rn, Lemma: If g 2 Rn is random Gaussian. For any v 2 Rn, g ¢ v is distributed as N(0, |v|2) Pf: N(0,a2) + N(0,b2) = N(0,a2+b2) g¢ v = i v(i) gi » N(0, i v(i)2)
Construct coloring iteratively. Construct coloring iteratively. Initially: Start with coloring x0 = (0,0,0, …,0) at t = 0. At Time t: Update coloring as xt = xt-1 + (t1,…,tn) ( tiny: 1/n suffices)
Consider time T = O(1/2) Claim 1: With prob. ½, at least n/2 elements reach -1 or +1. Pf: Each element doing random walk with size ¼ Recall: Random walk with step 1, is ¼ O(t1/2) away in t steps. A Trouble: Various element updates are correlated Consider basic walk x(t+1) = x(t) 1 with prob ½ Define Energy (t) = x(t)2 E[(t+1)] = ½ (x(t)+1)2 + ½ (x(t)-1)2 = x(t)2 + 1 = (t)+1 Expected energy = n at t= n. Claim 2: Each set has O() discrepancy in expectation. Pf: For each S, xt(S) doing random walk with step size ¼
Consider time T = O(1/2) Claim 1: With prob. ½, at least n/2 variables reach -1 or +1. ) Everything colored in O(log n) rounds. Claim 2: Each set has O() discrepancy in expectation per round. ) Expected discrepancy of a set at end = O( log n) Thm: Obtain a coloring with discrepancy O( log (mn)) Pf: By Chernoff, Prob. that disc(S) >= 2 Expectation + O( log m) = O( log (mn)) is tiny (poly(1/m)).
At each step of walk, formulate SDP on unfixed variables. Use some (existential) property to argue SDP is feasible Rounding SDP solution -> Step of walk Properties of walk: High Variance -> Quick convergence Low variance for discrepancy on sets -> Low discrepancy
Spencer’s six std deviations result: Spencer’s six std deviations result: Goal: Obtain O(n1/2) discrepancy for any set system on m = O(n) sets. Random coloring has n1/2 (log n)1/2 discrepancy Previous approach seems useless: Expected discrepancy for a set O(n1/2), but some random walks will deviate by up to (log n)1/2 factor
Partial Coloring Lemma: For any system with m sets, there exists a coloring on ¸ n/2 elements with discrepancy O(n1/2 log1/2 (2m/n)) Partial Coloring Lemma: For any system with m sets, there exists a coloring on ¸ n/2 elements with discrepancy O(n1/2 log1/2 (2m/n)) [For m=n, disc = O(n1/2)] Algorithm for total coloring: Repeatedly apply partial coloring lemma O( n1/2 log1/2 2 ) [Phase 1] + O( (n/2)1/2 log1/2 4 ) [Phase 2] + O((n/4)1/2 log1/2 8 ) [Phase 3] + … = O(n1/2)
Beautiful Counting argument (entropy method + pigeonhole) Beautiful Counting argument (entropy method + pigeonhole) Idea: Too many colorings (2n), but few “discrepancy profiles” Key Lemma: There exist k=24n/5 colorings X1,…,Xk such that every two Xi, Xj are “similar” for every set S1,…,Sn. Some X1,X2 differ on ¸ n/2 positions Consider X = (X1 – X2)/2 Pf: X(S) = (X1(S) – X2(S))/2 2 [-10 n1/2 , 10 n1/2]
There exists a partial coloring with non-uniform discrepancy bound S for set S There exists a partial coloring with non-uniform discrepancy bound S for set S Even if S = ( n1/2) in some average sense
Suppose there exists partial coloring X: Suppose there exists partial coloring X: 1. On ¸ n/2 elements 2. Each set S has |X(S)| · S
Initially write SDP with S = c n1/2 Initially write SDP with S = c n1/2 Each set S does random walk and expects to reach discrepancy of O(S) = O(n1/2) Some sets will become problematic. Reduce their S on the fly. Not many problematic sets, and entropy penalty low.
Construct coloring over time by solving sequence of SDPs (guided by existence results) Construct coloring over time by solving sequence of SDPs (guided by existence results) Can be derandomized [Bansal-Spencer] (use entropy method itself for derandomizing + usual tech.) E.g. Deterministic six standard deviations can be viewed as a way to derandomize something stronger than Chernoff bounds.
How to generate i with required properties. How to generate i with required properties. How to update S over time. Show n1/2 (log log log n)1/2 bound.
Often algorithms rely on continuous relaxations. Often algorithms rely on continuous relaxations. - Linear Program is useless. Can color each element ½ red and ½ blue.
Improved results of Spencer, Beck, Srinivasan, … based on clever counting (entropy method). - Pigeonhole Principle on exponentially large systems (seems inherently non-constructive)
Suppose we have discrepancy bound S for set S. Suppose we have discrepancy bound S for set S. Consider 2n possible colorings Signature of a coloring X: (b(S1), b(S2),…, b(Sm)) Want partial coloring with signature (0,0,0,…,0)
Energy increases at each step: Energy increases at each step: E(t) = \sum_i x_i(t)^2 Initially energy =0, can be at most n. Expected value of E(t) = E(t-1) + \sum_i \gamma_i(t)^2 Markov’s inequality.
How to generate the \eta_i How to generate the \eta_i How to update \Delta_S over time
If exist two colorings X1,X2 If exist two colorings X1,X2 1. Same signature (b1,b2,…,bm) 2. Differ in at least n/2 positions. Consider X = (X1 –X2)/2 -1 or 1 on at least n/2 positions, i.e. partial coloring Has signature (0,0,0,…,0) X(S) = (X1(S) – X2(S)) / 2, so |X(S)| · S for all S. Can show that there are 24n/5 colorings with same signature. So, some two will differ on > n/2 positions. (Pigeon Hole)
Partial Coloring Lemma: For any system with m sets, Partial Coloring Lemma: For any system with m sets, there exists a coloring on ¸ n/2 elements with discrepancy O(n1/2 log1/2 (2m/n)) [For m=n, disc = O(n1/2)] Algorithm for total coloring: Repeatedly apply partial coloring lemma Total discrepancy O( n1/2 log1/2 2 ) [Phase 1] + O( (n/2)1/2 log1/2 4 ) [Phase 2] + O((n/4)1/2 log1/2 8 ) [Phase 3] + … = O(n1/2)
Pf: Associate with coloring X, signature = (b1,b2,…,bn) (bi = bucket in which X(Si) lies ) Wish to show: There exist 24n/5 colorings with same signature Choose X randomly: Induces distribution on signatures. Entropy () · n/5 implies some signature has prob. ¸ 2-n/5. Entropy ( ) · i Entropy( bi) [Subadditivity of Entropy] bi = 0 w.p. ¼ 1- 2 e-50, = 1 w.p. ¼ e-50 = 2 w.p. ¼ e-450 ….
Partial coloring with non-uniform discrepancy S for set S Partial coloring with non-uniform discrepancy S for set S
Partial Coloring: S ¼ 10 n1/2 gives low entropy ) 24n/5 colorings exist with same signature. ) some X1,X2 with large hamming distance. (X1 – X2) /2 gives the desired partial coloring. Trouble: 24n/5/2n is an exponentially small fraction.
Do'stlaringiz bilan baham: |