
Discrete Math CS 2800


Presentation Transcript


  1. Discrete Math CS 2800 Prof. Bart Selman selman@cs.cornell.edu Module Probability --- Part e) 1) The Probabilistic Method 2) Randomized Algorithms

  2. The Probabilistic Method

  3. The Probabilistic Method • Method for providing non-constructive existence proofs: • Thm. If the probability that a randomly selected element of the set S does not have a particular property is less than 1, then there exists an element in S with this property. • Alternatively: If the probability that a random element of S has a particular property is larger than 0, then there exists at least one element with that property in S. • Note: We saw an earlier example of the probabilistic method when discussing the 7/8 alg. for 3-CNF.

  4. Example: Lower bound for Ramsey numbers • Recall the definition of the Ramsey number R(k,k): • Let R(k,k) be the minimal n such that if the edges of the complete graph on n nodes are colored Red and Blue, then there is either a complete subgraph of k nodes with all edges Red or a complete subgraph of k nodes with all edges Blue. • R(3,3) = 6. So, any complete 6-node graph has either a Red or a Blue triangle. (Proof: see "party problem".)

  5. Reminder: "The party problem". Dinner party of six: either there is a group of 3 who all know each other, or there is a group of 3 who are all strangers. Proof: by contradiction. Assume we have a party of six where no three people all know each other and no three people are all strangers. Consider one person. She either knows or doesn't know each other person. But there are 5 other people! So, she knows, or doesn't know, at least 3 others. (GPH) Let's say she knows 3 others. If any of those 3 know each other, we have a blue triangle, which means 3 people all know each other. Contradicts assumption. So they all must be strangers. But then we have three strangers. Contradicts assumption. The case where she doesn't know 3 others is similar. Also leads to contradiction. So, such a party does not exist! QED

  6. How do we get a lower bound on R(k,k)? E.g., a lower bound for R(3,3)? We need to find an n such that the complete graph on n nodes with a Red/Blue edge coloring does not contain a Red or a Blue triangle. E.g. [figure: a 2-coloring of K5 with the 5-cycle Red and the diagonals Blue, with no monochromatic triangle]. So, R(3,3) > 5. I.e., we have a lower bound on the Ramsey Number for k=3. Can we do this for R(k,k) in general? Very difficult to construct the colorings, but… we can prove they exist for non-trivial k.
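The bound R(3,3) > 5 is easy to verify by machine. Below is a minimal Python sketch (our addition, not from the slides) that checks the standard witness coloring of K5, with the 5-cycle Red and the diagonals Blue, for monochromatic triangles:

from itertools import combinations

n = 5
cycle = {frozenset((i, (i + 1) % n)) for i in range(n)}  # the Red 5-cycle

def color(edge):
    # Red if the edge lies on the 5-cycle, Blue if it is a diagonal
    return "R" if frozenset(edge) in cycle else "B"

# A triangle is monochromatic iff its 3 edges all get the same color.
mono = [tri for tri in combinations(range(n), 3)
        if len({color(e) for e in combinations(tri, 2)}) == 1]
print("monochromatic triangles:", mono)  # expect [], hence R(3,3) > 5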

  7. Thm. For k ≥ 4, R(k,k) ≥ 2^(k/2). • So, e.g., for k = 20, there exists a Red/Blue coloring of the complete graph with 1023 nodes that does not have any complete monochromatic subgraph of size 20. (But we have no idea of how to find such a coloring!) • Proof: Consider a sample space where each possible coloring of the n-node complete graph is equally likely. A sample coloring can be obtained by randomly coloring each edge. I.e., with probability ½ set edge to Blue, otherwise Red. • Let n < 2^(k/2). • Aside: each particular coloring has probability (1/2)^(n(n-1)/2). Why unbiased? An unbiased coloring is not essential. It only matters that each possible coloring has some non-zero probability. • We want to show that there is a larger than 0 probability of getting a coloring with no monochromatic k-clique.

  8. Consider a subset of k nodes. There will be C(n,k) such subsets, S_1, S_2, …, S_C(n,k). Let E_i be the event that S_i is a monochromatic subgraph (a Red or a Blue clique). So, a sample is a randomly selected graph coloring. What is the probability of E_i? p(E_i) = 2 · (1/2)^(k(k-1)/2) = 2^(1 - k(k-1)/2). Why? Note: the number of edges in a clique of size k is C(k,2) = k(k-1)/2, and the clique can be either all Red or all Blue (hence the factor 2).

  9. So, the probability that the randomly selected coloring will have some monochromatic k-clique is: p(E_1 ∪ E_2 ∪ … ∪ E_C(n,k)). If we can show that this probability is strictly less than 1, then we know that there must exist some coloring that does not have any monochromatic k-clique! We'll do this. It's mainly a matter of rewriting the combinatorial expression for the probability and finding an upper bound. First step: Can we just add up the individual probabilities?

  10. But, computing the exact value of p(E_1 ∪ … ∪ E_C(n,k)) is very tricky. Why?? Events are not disjoint and not independent! A coloring can have multiple monochromatic cliques. Also, having e.g. several Blue cliques in part of the graph makes it more likely to have more Blue cliques on other cliques that share many edges with the Blue cliques. So, all kinds of subtle dependencies! Would need to use the inclusion-exclusion formula (many terms)! Fortunately, we "just" need to upper bound the probability. So, we can truncate the formula to the first set of terms. This gives us Boole's Inequality: p(E1 ∪ E2) ≤ p(E1) + p(E2). Why? Since: p(E1 ∪ E2) = p(E1) + p(E2) - p(E1 ∩ E2). General form, see exercise 15, sect. 6.2.

  11. So, we have "upper bounded" our probability: p(E_1 ∪ … ∪ E_C(n,k)) ≤ Σ_i p(E_i) = C(n,k) · 2^(1 - k(k-1)/2). We want: C(n,k) · 2^(1 - k(k-1)/2) < 1. What's left? We need to show that the left hand side is strictly less than 1. "Just" combinatorics… Note: We have many terms in the sum --- O(n^k) of them --- but the p(E_i)'s can each be very small.

  12. By Ex. 17, sect. 5.4: C(n,k) ≤ n^k / 2^(k-1). By assumption: n < 2^(k/2), so n^k < 2^(k^2/2). So, C(n,k) · 2^(1 - k(k-1)/2) ≤ (n^k / 2^(k-1)) · 2^(1 - k(k-1)/2) = n^k · 2^(2 - k - k(k-1)/2) < 2^(k^2/2) · 2^(2 - k - k(k-1)/2) = 2^(2 - k/2). For k ≥ 4, 2^(2 - k/2) ≤ 1, so the probability is strictly less than 1.

  13. So, when n < 2^(k/2) and k ≥ 4, the probability of having a monochromatic k-clique is strictly less than 1. Therefore, there must exist edge colorings in our sample space that do not have such monochromatic k-cliques! So, R(k,k) >= 2^(k/2). QED. Note: the proof is non-constructive. No effective method is known for finding such colorings!
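The chain of inequalities is easy to sanity-check numerically. A small sketch (our addition) that evaluates the union bound C(n,k) · 2^(1 - k(k-1)/2) for the largest n below 2^(k/2):

from math import comb

def union_bound(n, k):
    # Upper bound on the probability of some monochromatic k-clique
    return comb(n, k) * 2 ** (1 - k * (k - 1) // 2)

for k in [6, 10, 20]:
    n = int(2 ** (k / 2)) - 1        # largest n strictly below 2^(k/2)
    print(k, n, union_bound(n, k))   # each value is < 1, as the proof shows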

  14. Probabilistic Algorithms

  15. Monte Carlo Algorithm • Probabilistic Algorithms --- algorithms that make random choices at one or more steps. • Decision problems --- problems for which the answer is True or False. E.g., is this number composite? • Monte Carlo algorithms --- there is a small probability that they will return an incorrect answer. In general, by running the algorithm longer, we can make the probability of error arbitrarily small.

  16. Example I: Probabilistic Primality Test --- Miller's Test • Let n be a positive integer and let n-1 = 2^s · t, where s is a non-negative integer (takes out factors of 2) and t is an odd positive integer. • E.g. n = 2047. n - 1 = 2046 = 2 x 1023. So, s = 1 and t = 1023. • n passes the Miller test for (likely) primality for the base b (1<b<n) if either b^t ≡ 1 (mod n) or b^(2^j · t) ≡ -1 (mod n) for some j with 0 ≤ j ≤ s-1. • With base b = 2, we have 2^1023 ≡ 1 (mod 2047), so 2047 passes, but 2047 = 23 x 89 is composite! • All primes pass the Miller test. So, if the test fails, then n is composite. Unfortunately, a composite integer n may also pass Miller's test, but for fewer than n/4 bases b, with 1 < b < n.

  17. Example: Determine if n = 221 is prime. We have n − 1 = 220 = 2^2 · 55. So s = 2 and t = 55. Randomly select a b < n, e.g. b = 174. We compute: b^t mod n = 174^55 mod 221 = 47 ≠ 1; b^(2^0 · t) mod n = 174^55 mod 221 = 47 ≠ n − 1; b^(2^1 · t) mod n = 174^110 mod 221 = 220 = n − 1. hmm… Since 220 ≡ -1 mod n, either 221 is prime, or 174 is a strong liar for 221. We try another random b, e.g. 137: b^t mod n = 137^55 mod 221 = 188 ≠ 1; b^(2^0 · t) mod n = 137^55 mod 221 = 188 ≠ n − 1; b^(2^1 · t) mod n = 137^110 mod 221 = 205 ≠ n − 1. So, base 137 is a witness for the compositeness of 221. And 174 was in fact a strong liar. Note that this tells us nothing about the factors of 221, which are 13 and 17.
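The computation above is mechanical; here is a short Python sketch of Miller's test for a single base (our code; the function name is ours), which reproduces the n = 221 example:

def miller_test(n, b):
    # True if n passes Miller's test for base b ("probably prime")
    s, t = 0, n - 1
    while t % 2 == 0:            # write n - 1 = 2^s * t with t odd
        s, t = s + 1, t // 2
    x = pow(b, t, n)             # b^t mod n
    if x == 1 or x == n - 1:
        return True
    for _ in range(s - 1):       # check b^(2^j * t) for j = 1, ..., s-1
        x = pow(x, 2, n)
        if x == n - 1:
            return True
    return False

print(miller_test(221, 174))     # True: 174 is a strong liar for 221
print(miller_test(221, 137))     # False: 137 witnesses that 221 is composite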

  18. Using Miller's Test • Goal is to decide the question – "Is n composite?" • Algorithm: • 1) Randomly pick a number 1 < b < n. • 2) Does n pass the Miller test for b? • If n fails the test, then n is a composite number; the answer is True; b is known as a witness for the compositeness, and the test STOPS. • Otherwise the answer is "unknown". • 3) Repeat steps 1) and 2) k times until the required certainty is achieved. • If after k iterations n is not found to be a composite number, then it can be declared probably prime (each time through with outcome "unknown", we gain some additional evidence for primality).
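A sketch of this repeated randomized procedure, built on the miller_test function from the previous slide (names and defaults are ours):

import random

def probably_prime(n, k=30):
    # Repeat Miller's test for k randomly chosen bases 1 < b < n.
    # False: composite (a witness was found). True: probably prime,
    # with error probability less than (1/4)**k.
    for _ in range(k):
        b = random.randrange(2, n - 1)
        if not miller_test(n, b):
            return False
    return True

print(probably_prime(2047))   # False: 2047 = 23 x 89 is caught w.h.p.
print(probably_prime(1009))   # True: 1009 is prime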

  19. Observations: • The algorithm can only make "mistakes" when it returns "unknown." • The probability that a composite integer n passes Miller's test for a randomly selected base b is less than 1/4. • By repeating the test k times, given that the iterations are independent, the probability that n is composite but the algorithm has not found a witness for compositeness is less than (¼)^k.

  20. Question: But, do we really need to select b at random in the algorithm? Why not just try b's starting from 1 through n? This might be just fine… It should work for 75% of the b's… Perhaps a deterministic algorithm works just fine after all! Hmm… It's a subtle but key issue. The 25% of "bad bases" may all cluster together starting at b = 1. How long would the alg. take to find a correct base? Still poly time? Actually, exponential in the length of the input! (Remember: we measure run time in terms of the number of digits needed to represent n.) Randomization fundamentally gets around this: for any composite n, wherever the strong liars are, after k guesses the probability that we only used "liars" is <= (1/4)^k. E.g. (1/4)^30 ~ 10^-18, independent of the number of digits of n itself! While for the fixed scheme with n = 10,000, we may need to try 2,500 bases! It's hard to fool a randomized algorithm!!

  21. By taking k sufficiently large, we can make the probability of error quite small. • 10 iterations: less than 1 in 10^6 • 30 iterations: less than 1 in 10^18 • Key Point: In general, we can easily boost our confidence in the outcome of a randomized procedure by doing repeated runs. Compare to: taking more samples in statistics. • Note 1: If we use n as one of the two primes in the RSA cryptosystem and n is actually composite, the procedures used to decrypt messages will not produce the original message. The key is then discarded and two new possible primes are used.

  22. Note 2: The simplest probabilistic primality test is the Fermat primality test (discussed earlier). It is only a heuristic test; some composite numbers (Carmichael numbers) will be declared "probably prime" no matter what witness is chosen. Nevertheless, it is sometimes used if a rapid screening of numbers is needed, for instance in the key generation phase of the RSA public key cryptographic algorithm. • Note 3: The Miller-Rabin primality test is a more sophisticated variant which detects all composites (this means: for every composite number n, at least 3/4 (Miller-Rabin) or 1/2 (Solovay-Strassen) of the numbers b are witnesses of the compositeness of n). These tests are often the methods of choice, as they are much faster than other general primality tests.

  23. Example II: Finding The Median --- Randomized Divide and Conquer. Problem: Given a set of N distinct numbers, what is the median value? Example with N = 11: 14, 18, 12, 19, 3, 4, 7, 15, 5, 11, 8. Median? A: 11 (sorted: 3, 4, 5, 7, 8, 11, 12, 14, 15, 18, 19). Runtime?? O(N log(N)) --- sort first. Can we do better? Hmmm… What about finding the minimum element or max element?? O(N).
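For reference, the obvious baselines in Python (our sketch) for the slide's example:

S = [14, 18, 12, 19, 3, 4, 7, 15, 5, 11, 8]
print(sorted(S)[len(S) // 2])   # 11: sort (O(N log N)), take the middle position
print(min(S), max(S))           # min/max need only a single O(N) scan each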

  24. The median is the "middle position." More precisely, given a set S = {a_1, a_2, a_3, …, a_n}, the median is the kth smallest element in S, with k = (n+1)/2 if n is odd, and k = n/2 if n is even. Note: let's assume all numbers are distinct. Let's consider a more clever algorithm for finding the kth smallest element of a set S. Note: the 1st smallest is the minimum, and the nth smallest the max element. Let's try a "divide (of S) – and – conquer" strategy.

  25. Select(S, k): Choose a splitter a_i ∈ S = {a_1, a_2, a_3, …, a_n} For each element a_j of S: Put a_j in S- if a_j < a_i; Put a_j in S+ if a_j > a_i Endfor If |S-| = k-1 then The splitter a_i was in fact the desired answer Else if |S-| >= k then The kth smallest element lies in S-; recursively call Select(S-, k) Else suppose |S-| = l < k-1 The kth smallest element lies in S+; recursively call Select(S+, k-1-l) Endif Note: a_i values not ordered. [on board]
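A direct Python translation of Select with a random splitter (our sketch), using the k-th smallest convention that matches the |S-| = k-1 test above:

import random

def select(S, k):
    # Return the k-th smallest element of S (distinct values, 1 <= k <= |S|)
    a = random.choice(S)                  # the splitter a_i
    S_minus = [x for x in S if x < a]
    S_plus = [x for x in S if x > a]
    if len(S_minus) == k - 1:
        return a                          # the splitter is the answer
    elif len(S_minus) >= k:
        return select(S_minus, k)         # k-th smallest lies in S-
    else:
        l = len(S_minus)                  # l < k - 1
        return select(S_plus, k - 1 - l)  # skip S- and the splitter

S = [14, 18, 12, 19, 3, 4, 7, 15, 5, 11, 8]
print(select(S, (len(S) + 1) // 2))       # median of the earlier example: 11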

  26. Analysis • Correctness • Run time (randomized). We'll get: O(n)! So, no need to sort first after all!

  27. Correctness • Termination: yes, always calls on a strictly smaller set. • Correctness: By induction on n, the size of S. • Base case: |S| = 1. Select(S,1) returns the right value. • Induction step: Assume recursive calls Select(S', x) (on smaller sets) correctly return the x^th smallest value. • From the code and the induction assumption, we see that Select(S,k) correctly returns the k^th smallest value.

  28. Run time • First, some intuition. Consider the "splitter". What would we like a good splitter to do? • Note that after the splitter, we proceed with one of the two recursive calls (on S- or S+). • So, we want the splitter to reduce the size of the set in the recursive call as much as possible. We don't know in advance whether the call is on S- or S+. So, let's make them both as small as possible… • Best splitter in the middle: pick the median. Hmm, but that's exactly the problem we are trying to solve! • Aside: What happens if we consistently picked the "worst" splitter?

  29. Run time cont. • Fortunately, our splitter does not have to be THAT good. • Let's assume each of the sets S- and S+ has at least ε·n fewer elements than S (of size n), with 0 < ε <= 1. • So, those sets are each of size at most (1 − ε)·n. • Assume the work in Select, without the recursive calls, is linear time, i.e., c·n for some constant c. • Now, the run time is bounded by: T(n) <= T((1 − ε)·n) + c·n.

  30. Let's "unwind" this recursion: T(n) <= c·n + (1 − ε)·c·n + (1 − ε)^2·c·n + … <= c·n · Σ_j (1 − ε)^j = c·n / ε, with 0 < ε <= 1. So, the run time is linear in n! (Smaller ε, bigger constant factor.) "Divide-and-conquer" worked! Remaining issue?

  31. But, how do we get a reasonable splitter? In particular, we can't spend much time on finding a splitter… • Randomness and probability to the rescue! • We will show that simply picking a random element from S to split on will work just fine! • Let's say we want to get an ε = 1/4. Then, we want to avoid the bottom 1/4 elements and the top 1/4 elements. • On average (in expectation), how many random guesses do we need to get a good (in middle half of elements) splitter? A: 2. Geom. r.v. with p = 1/2.
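The "2 expected guesses" claim is easy to simulate. A quick sketch (ours, not from the slides): pick a random rank out of n and count the tries until it lands in the middle half:

import random

def tries_until_good(n=1000):
    tries = 0
    while True:
        tries += 1
        pos = random.randrange(n)          # rank of a random splitter
        if n // 4 <= pos < 3 * n // 4:     # middle half: a good splitter
            return tries

samples = [tries_until_good() for _ in range(100_000)]
print(sum(samples) / len(samples))         # ~2.0: geometric r.v. with p = 1/2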

  32. A bit more formally: Let's define "phases" in the run of the algorithm. We say the alg. is in phase j if the set under consideration is of size at most n·(3/4)^j but greater than n·(3/4)^(j+1). Note that a "successful splitter" --- within the middle half of the elements --- cuts the size of the set to at most 3/4 of what it was.

  33. Let r.v. X be the number of steps of the algorithm. So, X = X_0 + X_1 + X_2 + …, where r.v. X_j is the number of steps in phase j. The expected time spent in phase j is E[X_j] <= 2·c·n·(3/4)^j for some constant c. Why? Time in one iteration of phase j: at most c·n·(3/4)^j, and, in expectation, 2 iterations per phase.

  34. Putting it together: We have for the total run time E[X] = Σ_j E[X_j] <= Σ_j 2·c·n·(3/4)^j = 2·c·n · Σ_j (3/4)^j = 8·c·n. So, E[X] = O(n). We have proved: Thm. The expected run time of Select(S,k) is O(n). • Note 1: Clever randomization and divide-and-conquer eliminated the need for sorting! O(n log n) • Note 2: Incidentally, a similar strategy and analysis gives us Quicksort ("calculating the rank of all numbers") with expected run time O(n log n).

  35. Fortune's Formula: a good read about probabilities and the math behind "gambling" and investing. Example story: How to win a Nobel prize, lose your life savings, and go $20M in debt, all in the same year!! • Probability theory is a very rich topic of increasing importance in computer science (algorithms, AI, machine learning, and data mining), but for now, this ends our probabilistic adventures.

  36. This concludes our journey through Discrete Mathematics: from logic and proofs, through sets, functions, sequences, and induction, through algorithms and growth rates, through number theory, and, finally, through probability and randomization!! THE END!!
