A HASH TABLE is a non-sequential data structure that uses a HASHER to evenly distribute entries inside into buckets for amortized O(1) insertion/search/deletion performance. To understand rationale behind, let's start with its logical coordinates:
template<typename T, int (*comparator)(const T&, const T&), std::size_t (*hasher)(const T&)>
class HashTable {
...
private:
LinkedList<T>* bucketList; // preallocated at CAPACITY size
std::size_t capacity; // size of bucketList
std::size_t size; // number of entries in data structure
};Its most common usages are implementations of abstract data types:
How are ENTRY items inserted into a hash table:
If insertion is attempted but SIZE (number of items) will exceed CAPACITY, BUCKET LIST needs to grow by doubling its CAPACITY, same as a dynamic array. However, unlike latter, growth will be much more costly as it requires hash table to:
A HASH MAP implements Associative Array abstract data type using HASH TABLE above. In order to do so, it needs to derive hash table coordinates specification:
template<typename K, typename V>
struct MapEntry {
K key;
V value;
};
template<typename K, typename V, int (*comparator)(const K&, const K&), std::size_t (*hasher)(const K&)>
class HashMap {
...
private:
LinkedList<MapEntry<K, V>>>* bucketList; // preallocated at CAPACITY size
std::size_t capacity; // size of bucketList
std::size_t size; // number of entries in data structure
};Insertion steps specification derives into:
| Operation | Description |
| clear | Deallocates BUCKETS LIST, reallocates new BUCKETS LIST and resets SIZE to zero. |
| containsKey | Applies HASHER to ENTRY KEY to detect BUCKET then loops through all ENTRIES inside and applies COMPARATOR to see if any KEY matches |
| containsValue | For each BUCKET in BUCKET LIST, loops through all ENTRIES inside and applies VALUE COMPARATOR to see if any VALUE matches |
| get | Applies HASHER to ENTRY KEY to detect BUCKET then loops through all ENTRIES inside and applies COMPARATOR to see if any KEY matches. If it does, returns VALUE of ENTRY. If not, ERROR thrown |
| set | Applies HASHER to ENTRY KEY to detect BUCKET then loops through all ENTRIES inside and applies COMPARATOR to see if any KEY matches. If it does, updates VALUE of existing ENTRY. If not, writes ENTRY and increments SIZE |
| removeKey | Applies HASHER to ENTRY KEY to detect BUCKET then loops through all ENTRIES inside and applies COMPARATOR to see if any KEY matches. If it does, removes ENTRY from BUCKET and decrements SIZE. If not, ERROR thrown |
| removeValue | For each BUCKET in BUCKET LIST, loops through all ENTRIES inside and applies VALUE COMPARATOR to see if any VALUE matches. If it does, removes ENTRY from BUCKET and decrements SIZE. If none match, ERROR thrown |
| isEmpty | Checks if SIZE is zero |
| size | Gets value of SIZE |
As you can see above, two more coordinates are put forward:
For a complete implementation example, check HashMap class documentation or its GitHub code.
A HASH SET implements Set abstract data type using HASH TABLE above. Since hash table by itself fits SET requirements, no derivation is needed!
| Operation | Description |
| clear | Deallocates each BUCKET, then BUCKETS LIST, reallocates new BUCKETS LIST and resets SIZE to zero. |
| contains | Applies HASHER to ENTRY to detect BUCKET then loops through all ENTRIES inside and applies COMPARATOR to see if any ENTRY matches |
| add | Applies HASHER to ENTRY to detect BUCKET then loops through all ENTRIES inside and applies COMPARATOR to see if any ENTRY matches. If none does, writes ENTRY and increments SIZE |
| remove | Applies HASHER to ENTRY to detect BUCKET then loops through all ENTRIES inside and applies COMPARATOR to see if any ENTRY matches. If it does, removes ENTRY from BUCKET and decrements SIZE. If not, ERROR thrown |
| isEmpty | Checks if SIZE is zero |
| size | Gets value of SIZE |
As you can see above, one more coordinates was put forward:
For a complete implementation example, check HashSet class documentation or its GitHub code.
Overall, hash table presents itself with a very favorable speed and memory efficiency profile, which is why it should be solution of choice for Associative Array and Set implementation. It can also be used for many other implementations (storing nodes/vertexes/edges in a tree/graph, etc). That being said, it also comes with disadvantages along with advantages: