Thursday, April 3, 2014

Compare and contrast a hash table vs. an STL map

Compare and contrast a hash table vs. an STL map. How is a hash table implemented? If the number of inputs is small, what data structure options can be used instead of a hash table?
Hash Table
In a hash table, a value is stored by applying a hash function to its key, which yields the index of the slot that holds the value. As a result, values are not stored in sorted order. Because the index is computed directly from the key, an insert or lookup takes O(1) time on average, assuming the hash function produces only a few collisions; with many collisions the cost can degrade toward O(N). Collisions also have to be handled explicitly. More details on hashtable here.
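As a rough illustration of the point above, here is a minimal sketch that uses std::unordered_map, the standard library's hash table. The helper name count_words is hypothetical and only serves the example.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: counts how often each word occurs.
std::unordered_map<std::string, int> count_words(const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> counts;
    for (const auto& w : words) {
        // operator[] hashes w, finds its bucket, and inserts a zero count
        // if the key is new. This is O(1) on average.
        ++counts[w];
    }
    return counts;
}
```

Iterating over the returned container visits the words in an unspecified order, which matches the remark that a hash table does not keep its contents sorted.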

STL Map
In an STL map, key/value pairs are kept in sorted order of key. The map is implemented as a balanced binary search tree (typically a red-black tree), which is why insert and lookup take O(log N) time. There are no collisions to handle. An STL map works well for things like:
  • find min element
  • find max element
  • print elements in sorted order
  • find the exact element or, if it is not found, the next smallest element
Another advantage of a map: it also stores key/value pairs, but it does not need to preallocate a large table with many empty slots.
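The operations listed above could be sketched with std::map roughly as follows. The function names are hypothetical, the key and value types are chosen arbitrarily, and "next smallest" is read here as "the nearest smaller key", which is an assumption about the intended meaning.

```cpp
#include <iterator>
#include <map>
#include <string>
#include <vector>

using Map = std::map<int, std::string>;

// Smallest and largest key. Precondition: m is not empty.
int min_key(const Map& m) { return m.begin()->first; }
int max_key(const Map& m) { return m.rbegin()->first; }

// Keys in sorted order: plain iteration already visits them that way.
std::vector<int> sorted_keys(const Map& m) {
    std::vector<int> keys;
    for (const auto& kv : m) keys.push_back(kv.first);
    return keys;
}

// Returns an iterator to the exact key if present, otherwise to the nearest
// smaller key, or m.end() if no key is less than or equal to `key`.
Map::const_iterator find_or_prev(const Map& m, int key) {
    auto it = m.upper_bound(key);          // first element with a key > `key`
    if (it == m.begin()) return m.end();   // nothing at or below `key`
    return std::prev(it);                  // last element with a key <= `key`
}
```

Each of these lookups is O(log N) or better, apart from sorted_keys, which is O(N) because it visits every element.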

How is a hash table implemented?
  1. A good hash function is required (e.g., the key modulo a prime number) so that the hash values are uniformly distributed.
  2. A collision resolution method is also needed, such as chaining (good for densely filled tables) or probing (good for sparsely filled tables).
  3. Implement a way to grow or shrink the table dynamically based on a given criterion. For example, when the ratio [number of elements] / [table size] exceeds a fixed threshold, create a larger table and transfer the entries from the old table to it, recomputing each index with the new hash function.
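The three steps above can be sketched as a small chained hash table. This is an illustration under stated assumptions, not a production design: the class name ChainedHashTable, the 0.75 load-factor threshold, and the 2n + 1 growth rule are all choices made for the example. It uses std::hash modulo the bucket count, and the bucket counts it grows to are not guaranteed to be prime.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

constexpr double kMaxLoadFactor = 0.75;  // assumed resize threshold

// Hypothetical class for illustration. Precondition: bucket_count >= 1.
class ChainedHashTable {
public:
    explicit ChainedHashTable(std::size_t bucket_count = 7)
        : buckets_(bucket_count), size_(0) {}

    // Inserts the pair, or overwrites the value if the key already exists.
    void put(const std::string& key, int value) {
        for (auto& kv : buckets_[index_of(key, buckets_.size())]) {
            if (kv.first == key) { kv.second = value; return; }
        }
        // Step 3: grow when elements / buckets would exceed the threshold.
        if (size_ + 1 > kMaxLoadFactor * buckets_.size()) {
            rehash(buckets_.size() * 2 + 1);
        }
        // Step 2: colliding keys simply share a bucket's list (chaining).
        buckets_[index_of(key, buckets_.size())].emplace_back(key, value);
        ++size_;
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* get(const std::string& key) const {
        for (const auto& kv : buckets_[index_of(key, buckets_.size())]) {
            if (kv.first == key) return &kv.second;
        }
        return nullptr;
    }

    std::size_t size() const { return size_; }
    std::size_t bucket_count() const { return buckets_.size(); }

private:
    using Bucket = std::list<std::pair<std::string, int>>;

    // Step 1: hash the key, then reduce it to a bucket index.
    static std::size_t index_of(const std::string& key, std::size_t bucket_count) {
        return std::hash<std::string>{}(key) % bucket_count;
    }

    // Moves every entry into a new, larger bucket array, recomputing indices.
    void rehash(std::size_t new_bucket_count) {
        std::vector<Bucket> fresh(new_bucket_count);
        for (auto& bucket : buckets_) {
            for (auto& kv : bucket) {
                fresh[index_of(kv.first, new_bucket_count)].push_back(std::move(kv));
            }
        }
        buckets_.swap(fresh);
    }

    std::vector<Bucket> buckets_;
    std::size_t size_;
};
```

Because every entry is rehashed whenever the table grows, a single insert can occasionally cost O(N), but the cost averaged over many inserts stays O(1). Shrinking is left out to keep the sketch short.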
What can be used instead of a hash table, if the number of inputs is small?
You can use an STL map. Although this takes O(log N) time, the number of inputs is small, so the cost is negligible.

To sum up: both are key/value data structures.
The differences are:
Hash table | STL map
O(1) search time on average | O(log N) search time
Stores key/value pairs, but not in sorted order | Stores key/value pairs in sorted order, so min and max are easy to get
Needs a hash function and collision handling | Needs neither
Takes extra space to maintain empty array slots | Has no empty slots to maintain


