530 likes | 822 Views
Mehrdad Nourani. Data & Network Security. Hash Algorithms. Session 14. Well-known Hash Functions. Hash Algorithms. see similarities in the evolution of hash functions & block ciphers increasing power of brute-force attacks leading to evolution in algorithms
E N D
Mehrdad Nourani Data & Network Security
Hash Algorithms Session 14
Hash Algorithms • see similarities in the evolution of hash functions & block ciphers • increasing power of brute-force attacks • leading to evolution in algorithms • from DES to AES in block ciphers • from MD4 & MD5 to SHA-1 & RIPEMD-160 in hash algorithms • likewise tend to use common iterative structure as do block ciphers
MD5 • designed by Ronald Rivest (the R in RSA – Rivest-Shamir-Adleman) • latest in a series of MD2, MD4 • produces a 128-bit hash value • until recently was the most widely used hash algorithm • in recent times have both brute-force & cryptanalytic concerns • specified as Internet standard RFC1321
MD5 Overview • Step 1: pad message so that we have: length mod 512 = 448 or equivalently length ≡ 448 (mod 512) • The above makes the length of padded message to be 64 bits less than an integer multiple of 512 bits. • Padding is always added even if the message is already of the desired length. e.g. if the message is 448 bits long, it is padded by 512 bits to a length of 960 bits. • Number of padding bits is in range of 1 to 512 bits. • Padding is a single “1” followed by the necessary number of “0”s • Step 2: append a 64-bit length value to message • This is K mod 264 where k is the length of message
MD5 Overview (cont.) • Step 3: initialize 4-word (128-bit) MD buffer (A,B,C,D) to given values: • A=67452301, B=EFCDAB89, C=98BADCFE, D=10325476 Save the values in little-endian format (the least significant byte of a word in the low-address position) • Word A= 01 23 45 67, Word B= 89 AB CD EF, • Word C= FE DC BA 98 , Word D= 76 54 32 10 • Step 4: process message in 16-word (512-bit) blocks: • using 4 rounds of 16 bit operations on message block & buffer • add output to buffer input to form new buffer value • Step 5: After all L 512-bit blocks have been processed the output from the Lth stage is the 128-bit message digest (hash code).
Summary of MD5 Behavior • The MD5 behaviour can be summarized as: • CV0 = IV • CVq+1= SUM32[CVq,RFI(Yq,RFH(Yq,RFG(Yq,RFF(Yq,CVq))))] • MD = CVL-1 • Where: • IV: Initial value (stored in ABCD buffers) • Yq: the qth 512-bit block of the message • L: number of blocks in the message • CVq: chaining variable processed with the qth block • RFx: round function using primitive logical function x • SUM32: addition mod 232 performed separately on each word of the pair of inputs • MD: final message digest value
MD5 Compression Function • each round has 16 steps of the form: a = b + ((a + g(b,c,d) + X[k] + T[i]) <<< s) • a,b,c,d refer to the 4 words of the buffer, but used in varying permutations • note this updates 1 word only of the buffer • after 16 steps each word is updated 4 times • where g(b,c,d) is a different nonlinear function in each round (F,G,H,I) (see book for details) • X[k]=M[q*16+k]=the kth 32-bit word in the qth 512-bit block of the message • T[i] is a constant value derived from sin, that is T[i] = 232 * abs[sin(i)] and can be found in a lookup table (matrix T) • <<< s is circular shift of the 32-bit argument by s bits • All additions are modulo 232
MD5’s Logical Functions • In terms of logical operations: • F(b,c,d) = bc + b’c • G(b,c,d) = bd + cd’ • H(b,c,d) = b c d • I(b,c,d) = c (b + d’)
MD5 Compression Function - Single Step Part of Message Constants Circular Left Shift (rotation) by s bits
MD4 • precursor to MD5 • also produces a 128-bit hash of message • has 3 rounds of 16 steps versus 4 in MD5 • design goals: • collision resistant (hard to find collisions) • direct security (no dependence on "hard" problems) • fast, simple, compact • favours little-endian (the least significant bytes in the low-address byte position) systems (e.g. Intel’s 80xxx and Pentium)
Strength and Weakness of MD5 • MD5 hash is dependent on all message bits • Rivest claims security is good as can be • known attacks are: • Berson 92 attacked any 1 round using differential cryptanalysis (but can’t extend) • Boer & Bosselaers 93 found a pseudo collision (again unable to extend) • Dobbertin 96 created collisions on MD compression function (but initial constants prevent exploit) • conclusion is that MD5 looks vulnerable soon • Two new alternatives: SHA-1 and RIPEMD-160
Secure Hash Algorithm (SHA-1) • SHA was designed by National Institute of Standards and Technology (NIST) & NSA in 1993, revised 1995 as SHA-1 • US standard for use with DSA signature scheme • standard is FIPS 180-1 1995, also Internet RFC3174 • the algorithm is SHA, the standard is SHS • produces 160-bit hash values • now the generally preferred hash algorithm • based on design of MD4 with a few key differences
SHA Overview • pad message so that we have: length mod 512 = 448 or equivalently length ≡ 448 (mod 512) • append a 64-bit length value to message • initialize 5-word (160-bit) buffer (A,B,C,D,E) to the following using big-endian format: (67452301, efcdab89, 98badcfe, 10325476, c3d2e1f0) • process message in 16-word (512-bit) chunks: • expand 16 words into 80 words by mixing & shifting • use 4 rounds of 20 bit operations on message block & buffer • add output to input to form new buffer value • output hash value is the final buffer value
Summary of SHA-1 Behavior • The SHA-1 behaviour can be summarized as: • CV0 = IV • CVq+1= SUM32 [CVq, ABCDEq] • MD = CVL • Where: • IV: Initial value (stored in ABCDE buffers) • ABCDEq: the output of the last round of processing in the qth 512-bit block of the message • L: number of blocks in the message (including padding and the length fields) • CVq: chaining variable processed with the qth block • SUM32: addition mod 232 performed separately on each word of the pair of inputs • MD: final message digest value
SHA-1 Compression Function • each round has 20 steps which replaces the 5 buffer words thus: [A,B,C,D,E][(E+f(t,B,C,D)+S5(A)+Wt+Kt),A,S30(B),C,D] • a,b,c,d refer to the 4 words of the buffer • t is the step number (0≤t≤79) • Sk: circular left-shift (rotation) of the 32-bit argument by k bits (same as “<<< k”) • f(t,B,C,D) is a nonlinear function for round • Wt is derived from the message block • Kt is a additive constant value derived from integer part of 232 x i0.5 for i=2,3,5,10. • All +’s are modulo 232 additions
SHA-1 Compression Function Circular Left Shift (rotation) by k bits
Logical Functions f • In terms of logical operations: • 0≤t≤19 f1= f(t,B,C,D)= BC + B’D • 20≤t≤39 f2= f(t,B,C,D)= B C D • 40≤t≤59 f3= f(t,B,C,D)= BC + BD + CD • 60≤t≤79 f4= f(t,B,C,D)= B C D
Additive Constant Kt • Only 4 distinct constants are used:
32-Bit Word Values Wt • The first 16 values are taken directly from the 16 words of the current blocks. • The remaining values are computed as: Wt = S1 (Wt-16 Wt-14 Wt-8 Wt-3)
SHA-1 versus MD5 • brute force attack is harder (160 vs 128 bits for MD5) • not vulnerable to any known attacks (compared to MD4/5) • a little slower than MD5 (80 vs 64 steps) • both designed as simple and compact • optimized for big-endian CPU's (vs MD5 which is optimised for little-endian CPU’s)
Revised Secure Hash Standard • NIST have issued a revision FIPS 180-2 • adds 3 additional hash algorithms • SHA-256, SHA-384, SHA-512 • designed for compatibility with increased security provided by the AES cipher • structure & detail is similar to SHA-1 • hence analysis should be similar
RIPEMD-160 • RIPEMD-160 was developed in Europe as part of RIPE (RACE Integrity Primitive Evaluation) project in 1996 • by researchers involved in attacks on MD4/5 • initial proposal strengthen following analysis to become RIPEMD-160 • somewhat similar to MD5/SHA • uses 2 parallel lines of 5 rounds of 16 steps • creates a 160-bit hash value • Slower than MD5, but probably more secure than SHA and MD5
RIPEMD-160 Overview • pad message so that: length mod 512 = 448 • append a 64-bit length value to message • initialize 5-word (160-bit) buffer (A,B,C,D,E) to the following in little-endian format: (67452301, efcdab89, 98badcfe, 10325476, c3d2e1f0) • process message in 16-word (512-bit) chunks: • use 10 rounds of 16 bit operations on message block & buffer – in 2 parallel lines of 5 • add output to input to form new buffer value • output hash value is the final buffer value
RIPEMD-160 Round • Each round take as inputs the current 512-bit block (Yq) and the 160-bit buffer ABCDE (left line) or A’B’C’D’E’ (right line) and updates the content of the buffer • Overall: • CVq+1(0)=CVq(1)+C+D’ • CVq+1(1)=CVq(2)+D+E’ • CVq+1(2)=CVq(3)+E+A’ • CVq+1(3)=CVq(4)+A+B’ • CVq+1(4)=CVq(0)+B+C’
RIPEMD-160 Compression Function A 32-bit from current 512-bit block; chosen by a permutation function r(j) Circular Left Shift (rotation) by k determined by s(j)
Functions f • In terms of logical operations: • 0≤t≤15 f1= f(t,B,C,D)= B C D • 16≤t≤31 f2= f(t,B,C,D)= BC + B’D • 32≤t≤47 f3= f(t,B,C,D)= (B + C’) D • 48≤t≤63 f4= f(t,B,C,D)= BD + CD’ • 64≤t≤79 f5= f(t,B,C,D)= B (C + D’)
RIPEMD-160 Design Criteria • use 2 parallel lines of 5 rounds for increased complexity • for simplicity the 2 lines are very similar • step operation very close to MD5 • permutation varies parts of message used • circular shifts designed for best results
RIPEMD-160 versus MD5 & SHA-1 • brute force attack harder (160 like SHA-1 vs 128 bits for MD5) • not vulnerable to known attacks, like SHA-1 though stronger (compared to MD4/5) • slower than MD5 (more steps) • all designed as simple and compact • SHA-1 optimized for big-endian CPU's vs RIPEMD-160 & MD5 optimized for little-endian CPU’s
Keyed Hash Functions as MACs • have desire to create a MAC using a hash function rather than a block cipher • because hash functions (e.g. MD5 and SHA-1) are generally faster than symmetric block cipher like DES • library code for cryptographic hash functions is widely available • not limited by export controls unlike block ciphers • hash includes a key along with the message • original proposal: KeyedHash = Hash(Key||Message) • some weaknesses were found with this • eventually led to development of HMAC (now mandatory for IP Security protocols, SSL, etc.)
HMAC Algorithm • specified as Internet standard RFC2104 • uses hash function on the message: HMACK(M)= H[(K+ opad)|| H[(K+ ipad)|| M)]] • where K is the secret key and K+ is the key padded out with 0’s to size b (b is the number of bits in a block) • and opad (5C hex), ipad (36 hex) are specified padding constants repeated b/8 times • overhead is just 3 more hash calculations than the message needs alone • any of MD5, SHA-1, RIPEMD-160 can be used
HMAC Overview • Append zeros to the left end of K to create a b-bit string K+ • XOR K+ with ipad to produce b-bit block Si • Append M to Si • Apply H to the stream generated in step 3 • XOR K+ with opad to produce b-bit block So • Append the hash result from step 4 to So • Apply H to the stream generated in step 6 and output the final result.
Efficient Implementation of HMAC f(cv,block) is the compression function for the hash function (the precomputed values substitute IV).
HMAC Security • know that the security of HMAC relates to that of the underlying hash algorithm • attacking HMAC requires either: • brute force attack on key used. This is in order of 2n where n is the chaining variable bit-width. • birthday attack (but since keyed would need to observe a very large number of messages). Like MD5 this is in order of 2n/2 for a hash length of n. • choose hash function used based on speed versus security constraints
HMAC Security (cont.) • Note that HMAC is more secure than MD5 for birthday attack. • In MD5 the attacker can choose any set of messages to find a collision (i.e. H(M)=H(M’)). • In HMAC since the attacker does not know K, he cannot generate messages offline. For a hash code of 128 bits, this requires 264 observed blocks (i.e. 264 * 29=273 bits) generated using the same key. On a 1 Gbps line, this requires monitoring stream of messages with no change of the key for 250,000 years (quite infeasible!!)