Data needs to be accessed really really quickly, but does not have to be maintained in order. Order is
not important, don't care about "adjacent" values or anything. Don't want to print out a bunch of stuff in
Balanced BST - but maintain data in order, which we don't care about or need
Arrays - if we know index of data, lookup can be performed in constant time
Want unique int for each possible string.
Sum the letters not very good for that
Assign each letter val between 1-26
Multiple letter's value by 26^i, which is position of letter in word
Then sum letters
dog = 4*26^2 + 15*26^1 + 7*26^0 = 3101
Most strings are not meaningful. Vast majority are random letters showed together and are not actually
people's names. Not practical to create an array large enough to store all possible strings.
But we want an array to take advantage of constant time lookup!
Fix array size based on amount of data actually want to store.
Map key value to int index for array. This is done with a hash function.
Problem: mapping not always guaranteed to give different indices. Can have collisions.
Hash table = array to store data + hash function (maps key to array index)
Can get int key, then modulo size of the array, will give mapping to index in array.
Good hash function can significantly reduce number of collisions, but cannot eliminate all of them.
Still need to deal with collisions that happen.
convert key value to int x
h(x) = x mod tableSize
Want keys to be evenly distributed over underlying array. Can