Storage by Hashing • Travis Roe • Topics of Computer Science, Chapter 43 • 2-5-2006
Outline • A Problem • A Solution: Hashing • Questions • Q & A
A Problem • A company organizes its data by social security number or a similar identifier. • It needs to add identifiers to a collection and search that collection to find the matching objects.
Potential Solution: 1-1 • One array slot for every possible key • Adding: O(1) • Searching: O(1) • Pros: Very fast, very easy to implement • Cons: Far too much memory, most of it unused
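To make the memory cost of the 1-1 approach concrete, here is a rough estimate. The figures are illustrative assumptions, not numbers from the slides: keys are taken to be 9-digit identifiers and each slot is assumed to cost 8 bytes.

```python
# Rough memory estimate for a direct-address ("1-1") table.
# Assumptions (not from the slides): 9-digit keys, 8 bytes per slot.
POSSIBLE_KEYS = 10 ** 9   # every 9-digit identifier gets its own slot
BYTES_PER_SLOT = 8        # assumed size of one stored reference


def direct_table_bytes(possible_keys: int = POSSIBLE_KEYS,
                       bytes_per_slot: int = BYTES_PER_SLOT) -> int:
    """Bytes needed for the slot array, no matter how many slots are used."""
    return possible_keys * bytes_per_slot


def fraction_used(stored_records: int,
                  possible_keys: int = POSSIBLE_KEYS) -> float:
    """Share of the slots that actually hold a record."""
    return stored_records / possible_keys


if __name__ == "__main__":
    print(direct_table_bytes())    # 8000000000 (about 8 GB)
    print(fraction_used(10_000))   # 1e-05
```

Under these assumptions, a company with 10,000 records would reserve roughly 8 GB and use one slot in every hundred thousand.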
Potential Solution: Unsorted Array • Adding: O(1) • Searching: O(n) • Pros: Easy to implement, fast adding • Cons: Searching is O(n), which is far too slow for large collections
Potential Solution: Sorted Array • Adding: O(n) • Searching: O(lg n) • Pros: Fast searching • Cons: Slow adding
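The sorted-array trade-off can be sketched with Python's standard `bisect` module. The function names `add_sorted` and `search_sorted` are made up for this illustration, and identifiers are stored as plain integers for simplicity.

```python
import bisect


def add_sorted(ids: list[int], new_id: int) -> None:
    """Insert new_id while keeping ids sorted.

    Finding the position takes O(lg n) comparisons, but the list must
    shift every later element, so the insert as a whole is O(n).
    """
    bisect.insort(ids, new_id)


def search_sorted(ids: list[int], target: int) -> bool:
    """Binary search: O(lg n) comparisons."""
    i = bisect.bisect_left(ids, target)
    return i < len(ids) and ids[i] == target


if __name__ == "__main__":
    ids: list[int] = []
    for value in (6789, 1287, 4321):
        add_sorted(ids, value)
    print(ids)                       # [1287, 4321, 6789]
    print(search_sorted(ids, 4321))  # True
    print(search_sorted(ids, 9999))  # False
```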
Potential Solution: Balanced BST • Adding: O(lg n) • Searching: O(lg n) • Pros: Fast adding and searching • Cons: Hard to program; still not O(1)
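A New Solution: Hashing • Adding: use the key to choose an index, then place the object at that index. O(1) on average, as long as keys rarely share an index. • Searching: use the key to compute the same index, then get the object from that index. O(1) on average, as long as keys rarely share an index.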
Hashing: An Example • Social security numbers are the keys • The hash key is based on the last 4 digits of the number • 154-38-1287 → 1287 • 987-65-4321 → 4321 • 123-45-6789 → 6789 • 192-83-7465 → 7465
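The example can be written out as a small sketch. The names `hash_ssn`, `add`, and `search` are invented for this illustration, and the table size of 10,000 (one slot per possible 4-digit value) is an assumption. Collisions are deliberately not handled here.

```python
TABLE_SIZE = 10_000  # assumed: one slot per possible 4-digit value


def hash_ssn(ssn: str) -> int:
    """Return the last four digits of an SSN such as '154-38-1287'."""
    digits = ssn.replace("-", "")
    return int(digits[-4:])


table = [None] * TABLE_SIZE


def add(ssn: str, record: object) -> None:
    """Store (ssn, record) at the hashed index. Overwrites on collision."""
    table[hash_ssn(ssn)] = (ssn, record)


def search(ssn: str):
    """Return the record stored for ssn, or None if it is not present."""
    entry = table[hash_ssn(ssn)]
    if entry is not None and entry[0] == ssn:
        return entry[1]
    return None


if __name__ == "__main__":
    add("154-38-1287", "record A")
    print(hash_ssn("154-38-1287"))  # 1287
    print(search("154-38-1287"))    # record A
    print(search("987-65-4321"))    # None
```

Both operations compute one index and touch one slot, which is where the O(1) behavior comes from.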
Collisions • Expected problems: • Two objects with the same key • Two different keys that have the same value after hashing • Ways to solve the problems: • Chaining • Probing
Collision Handling: Chaining • Every slot in the table holds a list of some sort. • Whenever there is a collision, add the new item to the list at that slot.
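A minimal sketch of chaining, assuming the last-four-digits hash from the earlier example and Python lists as the per-slot lists. The class name `ChainedTable` and its methods are hypothetical.

```python
class ChainedTable:
    """Hash table in which every slot holds a list of (key, record) pairs."""

    def __init__(self, size: int = 10_000):
        self.buckets = [[] for _ in range(size)]

    def _index(self, ssn: str) -> int:
        # Assumed hash: last four digits, reduced to the table size.
        return int(ssn.replace("-", "")[-4:]) % len(self.buckets)

    def add(self, ssn: str, record: object) -> None:
        bucket = self.buckets[self._index(ssn)]
        for i, (key, _) in enumerate(bucket):
            if key == ssn:            # same key again: replace the record
                bucket[i] = (ssn, record)
                return
        bucket.append((ssn, record))  # new key (possibly a collision)

    def search(self, ssn: str):
        for key, record in self.buckets[self._index(ssn)]:
            if key == ssn:
                return record
        return None


if __name__ == "__main__":
    t = ChainedTable()
    t.add("123-45-6789", "first")
    t.add("543-21-6789", "second")   # collides: same last four digits
    print(t.search("123-45-6789"))   # first
    print(t.search("543-21-6789"))   # second
    print(len(t.buckets[6789]))      # 2
```

Searching stays fast only while the lists stay short; a slot with many colliding keys degrades toward the O(n) scan of an unsorted array.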
Collision Handling: Probing • Whenever there is a collision, go to another location some distance away and attempt to fill that location. • Can cause grouping (clustering) of occupied locations. • Probe location: h(k) + a * x, with a = 2 • Example collision: 123-45-6789 and 543-21-6789 both hash to 6789
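A minimal sketch of probing with the formula h(k) + a * x and a = 2. The slides do not define x; this sketch assumes it is the attempt number (0, 1, 2, ...). The class name `ProbingTable`, the wrap-around with `%`, and the table size of 10,007 are also assumptions. The size is odd so that a step of 2 can eventually reach every slot.

```python
class ProbingTable:
    """Open-addressing table that probes h(k) + a * x for x = 0, 1, 2, ..."""

    STEP = 2  # the "a" in h(k) + a * x

    def __init__(self, size: int = 10_007):
        # Odd size (assumption) so a step of 2 visits every slot.
        self.slots = [None] * size

    def _home(self, ssn: str) -> int:
        # Assumed hash: last four digits, reduced to the table size.
        return int(ssn.replace("-", "")[-4:]) % len(self.slots)

    def _probe(self, ssn: str):
        home = self._home(ssn)
        for x in range(len(self.slots)):
            yield (home + self.STEP * x) % len(self.slots)

    def add(self, ssn: str, record: object) -> int:
        """Store the record and return the index where it ended up."""
        for i in self._probe(ssn):
            if self.slots[i] is None or self.slots[i][0] == ssn:
                self.slots[i] = (ssn, record)
                return i
        raise RuntimeError("table is full")

    def search(self, ssn: str):
        for i in self._probe(ssn):
            if self.slots[i] is None:   # empty slot ends the probe sequence
                return None
            if self.slots[i][0] == ssn:
                return self.slots[i][1]
        return None


if __name__ == "__main__":
    t = ProbingTable()
    print(t.add("123-45-6789", "first"))   # 6789
    print(t.add("543-21-6789", "second"))  # 6791
    print(t.search("543-21-6789"))         # second
```

This sketch does not support deletion; removing an entry would leave a gap that cuts later probe sequences short, so a real table needs some way to mark deleted slots.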
Reducing Collisions • Use prime numbers for array sizes • Take more space than you'll need • Choose a better hash function
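One way to see why a prime array size can help is to hash patterned keys into two tables of nearly the same size. The key pattern (multiples of 10) and the sizes 100 and 101 are chosen purely for illustration, and `slots_used` is a hypothetical helper.

```python
def slots_used(keys, table_size: int) -> int:
    """Count the distinct slots hit when each key is hashed as key % table_size."""
    return len({key % table_size for key in keys})


if __name__ == "__main__":
    keys = range(0, 1000, 10)     # 100 keys that share a factor of 10
    print(slots_used(keys, 100))  # 10  (non-prime size: heavy collisions)
    print(slots_used(keys, 101))  # 100 (prime size: every key gets its own slot)
```

With size 100, the shared factor of 10 squeezes all keys into 10 slots. With the prime size 101, no factor is shared with the key pattern, so the same keys spread over 100 different slots.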
References • Dewdney, A.K. “Storage by Hashing”. The New Turing Omnibus. Computer Science Press, 1993. • “Hash Tables”. Recording My Programming Path. http://qiang-ma.blogspot.com/2007/10/hash-tables.html (last accessed 2-5-2008) • Standish, Thomas. Data Structures, Algorithms & Software Principles in C. Addison-Wesley Publishing Company, Inc., 1995. pp. 450-475 (approximate)
Questions • What are the two methods for handling collisions that were discussed in this lecture? • What is one situation in which hash tables are not useful?