Class Notes (811,169)
Canada (494,539)
CMPT 225 (60)
John Edgar (28)

CMPT 225 Week 10 Lecture 3

3 Pages
Unlock Document

Simon Fraser University
Computing Science
CMPT 225
John Edgar

Hash Table Collisions Occurs when 2 different keys are mapped to the same index 2 main ways of dealing with them: Open addressing Separate chaining Open Addressing: When insertion results in collision, look for somewhere else to put the value Start at index to which hash function mapped the inserted item Look for free space in array following a search patten (probing) 3 schemes: Linear probing (bad) Quadratic probing (fine) Double hashing (good) Linear: table searched sequentially Each time probed for free location, add one to index.At end, wrap around. Just look for next empty space in array. Leads to primary clustering. Table contains groups of consecutively occupied locations, clusters get larger as time goes on, reduces efficiency. Searching with linear probing: Searching similar to insertion. Search for 59, 59 mod 23 = 13, index 13 does not contain 59, but it contains something else. Cannot yet conclude that table does not contain 59. Cannot conclude until you get to a free space. Probe, check, add one to index, continue until hit 59 or free space. Quadratic Probing: Refinement of linear, prevents primary clustering Don't just add one to index, add p^2. p is number of times you've had to do this probing.Always go from original index. 1st probe: h(x) + 1^2 Results in secondary clustering. Same sequence of probes is used when two different values hash to same location. Delays collision resolution. Problem theoretically - analysis shows that it isn't really a problem in practice. After some time, a sequence of probes repeats itself. If you loop like this, cannot insert that new value. Generally does not cause problems if data not significantly skewed, hash table large enough (nearest prime to 2*items to insert), function scatters data evenly across table Not likely to happen unless you write a really really bad hash function. Also think about table size 50000~ compared to table size 7 Also if your table is full of stuff. But don't want it to get over half full. Performance of hash tables using open addressing degrades if over half full If more than 2/3 full, need to do something. Double Hashing: In linear and quadratic, probing sequence is independant of key. Double hashing is key dependant probe sequence Asecond hash function determine's the probes sequence. Will get rid of secondary clusters, hopefully primary too. Must foll
More Less

Related notes for CMPT 225

Log In


Don't have an account?

Join OneClass

Access over 10 million pages of study
documents for 1.3 million courses.

Sign up

Join to view


By registering, I agree to the Terms and Privacy Policies
Already have an account?
Just a few more details

So we can recommend you notes for your school.

Reset Password

Please enter below the email address you registered with and we will send you a link to reset your password.

Add your courses

Get notes from the top students in your class.