A hash table is a data structure used to implement an associative array, a structure that can map keys to values.In a hash table, data is stored in an array format where each data has a unique index value.Accessing the data becomes very fast if we knows the index.The efficiency of the mapping depends on the efficiency of the hash function used.
It is the data structure that satisfies the demand for a data structure that can perform atleast the search/retrieval operation in constant time complexity O(1)
-
- Database Indexing
-
- Caching: Hashing is used in cache memory to quickly look up data based on its key.
-
- Password Storage: Hashing is used to securely store passwords by turning a password into a fixed-length hash value that can be compared for authentication purposes, without having to store the original password.
etc...
Index Mapping (also known as Trivial Hashing) is a simple form of hashing where the data is directly mapped to an index in a hash table. The hash function used in this method is typically the identity function, which maps the input data to itself. In this case, the key of the data is used as the index in the hash table, and the value is stored at that index.
H(k) = k % 8 (i.e, range from 0 to 7)
A collision in a hash table occurs when two different keys produce the same index (hash value) in the array. This means that the same bucket in the array is being used to store different key-value pairs, which can lead to ambiguity and incorrect behavior.
- Direct Addressing (open addressing)
- Seperate Chaining (chaining)
This method involves searching for the next available bucket in the array if the original index is already occupied. There are several variations of open addressing, including linear probing, quadratic probing, and double hashing.
Linear probing
:- In this method, the next available bucket is found by incrementing the index until an empty bucket is found. This can result in clusters of occupied buckets, which can reduce the overall performance of the hash table.Quadratic probing
:- In this method, the next available bucket is found by incrementing the index using a quadratic function, such as i^2, where i is the number of probes. This method helps to avoid the clustering issue seen in linear probing.Double hashing
:- In this method, a secondary hash function is used to determine the next index to probe. This method helps to ensure that keys are distributed uniformly across the indices of the array, reducing the likelihood of collisions.
This method involves creating a linked list at each index in the array, so that multiple key-value pairs can be stored in the same bucket. When a collision occurs, the new key-value pair is simply added to the end of the linked list at that index.