50 likes | 57 Views
Additional notes on Hashing. And notes on HW4. Selected Answers to the Last Assignment. The records will hash to the following buckets: K h(K) (bucket number) 2369 1 3760 0 4692 4 4871 7 5659 3 1821 5 1074 2 7115 3 1620 4 2428 4 overflow 3943 7 4750 6 6975 7 overflow 4981 5
E N D
Additional notes on Hashing And notes on HW4
Selected Answers to the Last Assignment • The records will hash to the following buckets: • K h(K) (bucket number) • 2369 1 • 3760 0 • 4692 4 • 4871 7 • 5659 3 • 1821 5 • 1074 2 • 7115 3 • 1620 4 • 2428 4 overflow • 3943 7 • 4750 6 • 6975 7 overflow • 4981 5 • 9208 0 • Two records out of 15 are in overflow, which will require an additional block access. The • other records require only one block access. Hence, the average time to retrieve a • random record is: • (1 * (13/15)) + (2 * (2/15)) = 0.867 + 0.266 = 1.133 block accesses
Now if you were asked to do this with extendible hashing • record1 2369 1 00001 • record2 3760 16 10000 • record3 4692 20 10100 • record4 4871 7 00111 • record5 5659 27 11011 • record6 1821 29 11101 • record7 1074 18 10010 • record8 7115 11 01011 • record9 1620 20 10100 • record10 2428 28 11100 • record11 3943 7 00111 • record12 4750 14 01110 • record13 6975 31 11111 • record14 4981 21 10101 • record15 9208 24 11000
Lets go through the first few steps • K mod 2 (2369 and 4871) in bucket 1, (3760 and 4692) in bucket 0. • Now the next record is going to cause a split to K mod 4 • The key in extendible hashing is only the bucket that overflow needs to be split – for the other buckets you just use pointer de-referencing • record 5659 – will leave bucket 0 unchanged but the pointer derferences will need to be fixed • Hash element 00 will both point to bucket 0 • Hash element 11 will point to the bucket containing records 4871 and 5659 (both ending in 11) and Hash element 01 will point to record containing 2369. • Then 1821 will get added to the bucket containing 2369. • Now adding 1074 – the hash entry 10 will point to the new block containing this element • Now adding record 8 (01011) will cause a split and we will need to go to K Mod 8. • And so on.
Now if you were to do extendible hashing by remapping (linear hash) • This one simply remaps all the data from scratch while using the next hash function • K mod 2, K mod 4, K mod 8 • The tradeoff is access cost is fixed constant – no dereferencing • The disadvantage is remapping frequently.