A Solution to POJ1811 Prime Test. STYC.
Problem Description Given an integer N which satisfies the relation 2 < N < 2^54, determine whether or not it is a prime number. If N is not a prime number, find its smallest prime factor.
Key Concepts • Prime numbers: A prime number is a positive integer p > 1 that has no positive integer divisors other than 1 and p itself.
Framework of the Solution • Determine whether the given N is prime or not. • If N is prime, print “Prime” and exit. • Factorize N for its smallest prime factor.
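The framework above can be sketched as a small driver. This is only an illustration under stated assumptions: `solve`, `smallest_prime_factor`, and the two stand-in helpers are hypothetical names, and the stand-ins are deliberately naive placeholders for whatever primality test and factoring routine is actually plugged in. The recursion covers the case where the factoring routine returns a factor that is not itself prime.

```python
from math import isqrt

def smallest_prime_factor(n, is_prime, find_factor):
    """Smallest prime factor of n, given a primality test and a routine
    that returns some nontrivial (possibly composite) factor."""
    if is_prime(n):
        return n
    d = find_factor(n)
    # d and n // d may both be composite, so recurse on both halves.
    return min(smallest_prime_factor(d, is_prime, find_factor),
               smallest_prime_factor(n // d, is_prime, find_factor))

def solve(n, is_prime, find_factor):
    """Return "Prime" for a prime n, else its smallest prime factor as text."""
    if is_prime(n):
        return "Prime"
    return str(smallest_prime_factor(n, is_prime, find_factor))

# Naive stand-ins (assumptions for the demo only, far too slow for N near 2^54).
def naive_is_prime(n):
    return n >= 2 and all(n % k for k in range(2, isqrt(n) + 1))

def naive_find_factor(n):
    # Returns the largest proper divisor, which is often composite.
    return next(k for k in range(n // 2, 1, -1) if n % k == 0)
```

For example, `solve(91, naive_is_prime, naive_find_factor)` goes through the factoring branch, while `solve(97, naive_is_prime, naive_find_factor)` stops at the primality test.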
The Brute-force Way: Trial Division • If N is even, then 2 is its smallest prime factor. • Otherwise, try dividing N by every odd number k from 3 up to N^(1/2). The smallest k by which N is divisible is the smallest prime factor of N. If no such k exists, then N is prime. • Complexity: O(N^(1/2)) for time, O(1) for space
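A minimal Python sketch of the trial division just described. The function name is an assumption made for illustration; returning N itself is used here to signal that N is prime.

```python
from math import isqrt

def smallest_prime_factor_trial(n):
    """Smallest prime factor of n >= 2 by trial division (n itself if prime)."""
    if n % 2 == 0:
        return 2
    limit = isqrt(n)          # no need to look beyond floor(sqrt(n))
    k = 3
    while k <= limit:
        if n % k == 0:
            return k          # the first divisor found is necessarily prime
        k += 2                # odd candidates only
    return n                  # no divisor found: n is prime
```

The first divisor found is always prime because any composite divisor would have a smaller prime divisor that the loop reaches earlier.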
Modified Brute-force • Construct a table that stores all prime numbers not greater than N^(1/2). Try dividing N only by prime numbers. • Complexity: O(N^(1/2) log N) for time, O(N^(1/2)) for space using the Sieve of Eratosthenes • Estimation of space consumption: 2^26 bits = 2^23 bytes = 8,192 kilobytes • Much time is used in the process of sieving
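A short sketch of the sieve-then-divide idea, with hypothetical function names. For readability it spends one byte per integer; the 8,192-kilobyte estimate on the slide corresponds to a tighter layout, which for a limit of 2^27 works out to one bit per odd number.

```python
from math import isqrt

def primes_up_to(limit):
    """Sieve of Eratosthenes: list of all primes <= limit."""
    if limit < 2:
        return []
    is_composite = bytearray(limit + 1)
    primes = []
    for i in range(2, limit + 1):
        if not is_composite[i]:
            primes.append(i)
            # Multiples below i*i were already crossed out by smaller primes.
            for j in range(i * i, limit + 1, i):
                is_composite[j] = 1
    return primes

def smallest_prime_factor_sieved(n):
    """Smallest prime factor of n >= 2, dividing only by sieved primes."""
    for p in primes_up_to(isqrt(n)):
        if n % p == 0:
            return p
    return n  # n is prime
```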
Modified Brute-force 2 • Embed a table of prime numbers smaller than N_max^(1/4) into the source. Extend the table to N^(1/2) by runtime calculation. • Complexity: O(N^(3/4) / log N) for time, O(N^(1/2) / log N) for space • Estimation of time consumption: Finding all 7,603,553 primes smaller than 2^27 takes approx. 1.32 × 10^11 divisions or 73 minutes on a Pentium 1.5 GHz.
Brute-force with Trick • Start from N^(1/2) rather than 2. Do factorization recursively once a factor is found. • Efficient in handling N = pq where p and q are relatively close to each other. • POJ accepts this! - westever’s solution
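One possible reading of this trick, sketched in Python. It is not westever's actual code, which is not shown in these slides; the function name is an assumption. The search walks downward from floor(N^(1/2)), and because the divisor found that way may be composite, both cofactors are examined recursively.

```python
from math import isqrt

def smallest_prime_factor_descending(n):
    """Smallest prime factor of n >= 2, searching downward from sqrt(n)."""
    k = isqrt(n)
    while k > 1:
        if n % k == 0:
            # k is the largest divisor <= sqrt(n); it and n // k may be composite.
            return min(smallest_prime_factor_descending(k),
                       smallest_prime_factor_descending(n // k))
        k -= 1
    return n  # no divisor in [2, sqrt(n)]: n is prime
```

When N = pq with p and q close together, the first hit comes after only a few steps; a prime N still costs about N^(1/2) divisions.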
Brute-force with Trick 2: Wheel Factorization • Test whether N is a multiple of 2, 3 or 5. If it is, the problem has been solved. • If not, do trial division using only integers which are not multiples of 2, 3, or 5. • Saves 7/15 of the work compared with trial division by odd numbers only (8 candidates per 30 integers instead of 15).
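A sketch of the wheel described above, assuming a wheel of size 30 = 2 * 3 * 5. The function name and the gap table are illustrative choices rather than part of the original slides.

```python
from math import isqrt

def smallest_prime_factor_wheel(n):
    """Smallest prime factor of n >= 2 using a 2-3-5 wheel (n itself if prime)."""
    for p in (2, 3, 5):
        if n % p == 0:
            return p
    # Gaps between consecutive integers coprime to 30, starting from 7:
    # 7, 11, 13, 17, 19, 23, 29, 31, 37, ...
    gaps = (4, 2, 4, 2, 4, 6, 2, 6)
    limit = isqrt(n)
    k, i = 7, 0
    while k <= limit:
        if n % k == 0:
            return k
        k += gaps[i]
        i = (i + 1) % len(gaps)
    return n  # n is prime
```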
Key Concepts (cont.) • Prime factorization algorithms: Algorithms devised for determining the prime factors of a given number • Primality tests: Tests to determine whether or not a given number is prime, as opposed to actually decomposing the number into its constituent prime factors
Primality Tests • Deterministic: Adleman-Pomerance-Rumely Primality Test, Elliptic Curve Primality Proving… • Probabilistic: Rabin-Miller Strong Pseudoprime Test…
Rabin-Miller Strong Pseudoprime Test • Given an odd integer N, let N = 2^r s + 1 with s odd. Then choose a random integer a between 1 and N - 1. If a^s ≡ 1 (mod N) or a^(2^j s) ≡ -1 (mod N) for some j between 0 and r - 1, then N passes the test. A prime will pass the test for all a.
Rabin-Miller Strong Pseudoprime Test • Requires no more than (1 + o(1)) log N multiplications (mod N). • A number which passes the test is not necessarily prime. But a composite number passes the test for at most 1/4 of the possible bases a. • If n independent tests are performed on a composite number, the probability that it passes every test is (1/4)^n or less.
Rabin-Miller Strong Pseudoprime Test • Smallest composite numbers passing the RMSPT using the first k primes as bases (k = 1, 2, 3, …): 2,047; 1,373,653; 25,326,001; 3,215,031,751; 2,152,302,898,747; 3,474,749,660,383; 341,550,071,728,321 (k = 7); 341,550,071,728,321 (k = 8); at most 41,234,316,135,705,689,041 (k = 9)… • 341,550,071,728,321 = 2^48.279…, 41,234,316,135,705,689,041 = 2^65.160… • Tests show that randomized bases may fail sometimes.
Pseudocode of RMSPT (Sprache)
function powermod(a, s, n) {
    p := 1
    b := a
    while s > 0 {
        if s & 1 == 1 then p := p * b % n
        b := b * b % n
        s := s >> 1
    }
    return p
}
Pseudocode of RMSPT (cont.)
function rabin-miller(n) {
    // assumes n > 2; write n - 1 = 2^r * s with s odd
    s := n - 1
    r := 0
    while s & 1 == 0 {
        s := s >> 1
        r := r + 1
    }
    for each a in (2, 3, 5, 7, 11, 13, 17, 19, 23) {
        if n == a then return TRUE
        if n % a == 0 then return FALSE
        x := powermod(a, s, n)
        j := 0
        while j < r - 1 AND x != 1 AND x != n - 1 {
            x := x * x % n
            j := j + 1
        }
        // pass only if a^s = 1 or some a^(2^j s) = -1 (mod n)
        if x != n - 1 AND (j > 0 OR x != 1) then return FALSE
    }
    return TRUE
}
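The strong pseudoprime test described on the preceding slides, run with the first nine primes as fixed bases, might look as follows in Python. This is a sketch rather than the author's submitted code: `is_prime` and `BASES` are names chosen here, and Python's built-in three-argument `pow` stands in for `powermod`.

```python
BASES = (2, 3, 5, 7, 11, 13, 17, 19, 23)  # the first nine primes

def is_prime(n):
    """Strong pseudoprime test with fixed bases (sketch)."""
    if n < 2:
        return False
    for a in BASES:
        if n % a == 0:
            return n == a          # handles small primes and their multiples
    # Write n - 1 = 2^r * s with s odd.
    s, r = n - 1, 0
    while s % 2 == 0:
        s //= 2
        r += 1
    for a in BASES:
        x = pow(a, s, n)           # a^s mod n
        if x == 1 or x == n - 1:
            continue               # n passes for this base
        for _ in range(r - 1):
            x = x * x % n          # a^(2^j * s) mod n for j = 1 .. r - 1
            if x == n - 1:
                break              # n passes for this base
        else:
            return False           # no -1 found: n is composite
    return True
```

The composites listed on the earlier slide make convenient checks: for example, 3,215,031,751 passes with bases 2, 3, 5 and 7 but is rejected here once base 11 is tried.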
Prime Factorization Algorithms • Continued Fraction Algorithm, Lenstra Elliptic Curve Method, Number Field Sieve, Pollard Rho Method, Quadratic Sieve, Trial Division…
Pollard Rho Factorization Method • Also known as Pollard Monte Carlo factorization method. • Finds a factor in about O(p^(1/2)) steps, where p is the smallest prime factor of the number to be factored. • Two aspects to this method: iteration and cycle detection.
Pollard Rho Factorization Method Iteration: Iterate the formula x_(n+1) = x_n^2 + a (mod N). Almost any polynomial formula (two exceptions being x_n^2 and x_n^2 - 2) for any initial value x_0 will produce a sequence of numbers that eventually falls into a cycle.
Pollard Rho Factorization Method Cycle detection: Keep one running copy of x_i. Whenever i is a power of 2, let y = x_i; at each step, compute GCD(|x_i - y|, N). If the result is neither 1 nor N, then a cycle is detected and GCD(|x_i - y|, N) is a factor (not necessarily prime) of N. If the result is N, the method fails; choose another a and redo the iteration.
Pseudocode of PRFM (Sprache)
function pollard-rho(n) {
    do { a := random() } while a == 0 OR a == -2
    x := random()
    y := x
    k := 2
    i := 1
Pseudocode of PRFM (cont.)
    while TRUE {
        i := i + 1
        x := (x * x + a) % n
        e := abs(x - y)
        d := GCD(e, n)
        // d == n: the method failed, retry with another a
        if d == n then return pollard-rho(n)
        if d != 1 then return d
        if i == k then {
            y := x
            k := k << 1
        }
    }
}
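A Python sketch of the iteration and cycle detection described above, with the restart on GCD = N made explicit. The function name and the choice of random ranges are assumptions. The routine must only be called on composite numbers, since it never terminates on a prime.

```python
import random
from math import gcd

def pollard_rho(n):
    """Return a nontrivial factor (not necessarily prime) of a composite n."""
    if n % 2 == 0:
        return 2
    while True:
        a = random.randrange(1, n)        # excludes a = 0
        if a == n - 2:                    # excludes a = -2 (mod n)
            continue
        x = random.randrange(0, n)
        y, k, i = x, 2, 1                 # y is the saved checkpoint value
        while True:
            i += 1
            x = (x * x + a) % n
            d = gcd(abs(x - y), n)
            if d == n:
                break                     # failure: restart with another a
            if d != 1:
                return d
            if i == k:                    # i is a power of 2: move the checkpoint
                y = x
                k <<= 1
```

For instance, `pollard_rho(8051)` returns either 83 or 97, depending on the random choices.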
Overall Efficiency Analysis • Time complexity: O(log N) for any prime; O(N^(1/4)) for most composites under average conditions, principally decided by the factorization process • Space complexity: O(1), space demand is always independent of N
Notes on Implementation • Multiplication (mod N) in RMSPT: Requires calculation of 64-bit * 64-bit % 64-bit, whose intermediate product can overflow 64 bits. It should be computed on the binary representation using a “divide and conquer” method. Use the floating-point unit for the (mod N) operation. Can be optimized by coding in assembly.
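One common way to realize such a multiplication is double-and-add over the bits of one operand, sketched here in Python. This is an assumption about what the “divide and conquer” remark refers to, and it leaves out the floating-point and assembly variants the slide mentions. Python integers never overflow, so the point is only that every intermediate value stays below 2N, which fits in 64 bits when N < 2^54.

```python
def mulmod(a, b, n):
    """(a * b) % n using only additions and doublings of values below 2n."""
    a %= n
    result = 0
    while b > 0:
        if b & 1:                 # current bit of b is set: add the running multiple
            result += a
            if result >= n:
                result -= n
        a += a                    # double the running multiple of a
        if a >= n:
            a -= n
        b >>= 1
    return result
```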
Notes on Implementation (cont.) • GCD(a, b) (a > b) in PRFM: The following properties of GCD help avoid divisions: (1) If a = b, then GCD(a, b) = a. (2) GCD(a, b) = 2 * GCD(a/2, b/2) with both a and b even. (3) GCD(a, b) = GCD(a/2, b) with a even but b odd (and symmetrically when only b is even). (4) GCD(a, b) = GCD(a - b, b) with both a and b odd. • Time complexity: O(log a) shift-and-subtract steps
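The GCD properties above lead to the binary GCD algorithm. A minimal Python sketch, with a function name chosen here, uses only shifts, comparisons and subtractions:

```python
def binary_gcd(a, b):
    """GCD of two non-negative integers without division."""
    if a == 0:
        return b
    if b == 0:
        return a
    shift = 0
    while (a | b) & 1 == 0:       # both even: factor out a common 2
        a >>= 1
        b >>= 1
        shift += 1
    while a & 1 == 0:             # a even, b odd: drop factors of 2 from a
        a >>= 1
    while b != 0:
        while b & 1 == 0:         # b even, a odd: drop factors of 2 from b
            b >>= 1
        if a > b:
            a, b = b, a           # keep a <= b
        b -= a                    # both odd: GCD(a, b) = GCD(a, b - a)
    return a << shift             # restore the common powers of 2
```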
Notes on Implementation (cont.) • Combination with brute-force algorithms: Embed a prime table. Do brute-force trial division for small divisors. • Minor optimizations: Use 32-bit integer division instead of the 64-bit version when possible…
Actual Timing Performance • Platform: Windows XP SP1 on a Pentium M 1.5 GHz • Algorithms tested: Adapted versions of westever’s, TN’s, lh’s and zsuyrl’s solutions and my later fixed version • Timing method: Process user time returned by the GetProcessTimes Windows API
Actual Timing Performance • Test data: Original data set on POJ, two pseudorandom data sets generated by Mathematica and one hand-made data set (all given in ascending order), new data set on POJ • Verification: Done with Mathematica
Conclusions • Brute-force methods are too slow for integers that have large factors. But they are good complements to complex methods like Pollard Rho. • The original test data on POJ are too weak to make slow algorithms fail or to expose incorrect solutions. The new data are much better but not yet perfect.