Linear Hashing

Hash table is a data structure that associates keys with values. To know more about liner hashing refer Wikipedia. Here are main points that summarizes linear hashing.

  • Full buckets are not necessarily split
  • Buckets split are not necessarily full
  • Every bucket will be split sooner or later and so all Overflows will be reclaimed and rehashed.
  • Split pointer s decides which bucket to split
    • s is independent to overflowing bucket
    • At level i, s is between 0 and 2i
    • s is incremented and if at end, is reset to 0.
  • hi (k)= h(k) mod(2i n)
  • hi+1 doubles the range of hi

Insertion and Overflow condition

Algorithm for inserting ‘k’:

1. b = h0(k)

2. if b < split-pointer then

3. b = h1(k)

Searching in the hash table for ‘k’:

1. b = h0(k)

2. if b < split-pointer then

3. b = h1(k)

4. read bucket b and search there


  • In the following M=3 (initial # of buckets)
  • Each bucket has 2 keys. One extra key for overflow.
  • s is a pointer, pointing to the split location. This is the place where next split should take place.
  • Insert Order: 1,7,3,8,12,4,11,2,10,13

After insertion till 12:

When 4 inserted overflow occurred. So we split the bucket (no matter it is full or partially empty). And increment pointer.

So we split bucket 0 and rehashed all keys in it. Placed 3 to new bucket as (3 mod 6 = 3 ) and (12 mod 6 = 0 ). Then 11 and 2 are inserted. And now overflow. s is pointing to bucket 1, hence split bucket 1 by re- hashing it.

After split:

Insertion of 10 and 13: as (10 mod 3 = 1) and bucket 1 < s, we need to hash 10 again using h1(10) = 10 mod 6 = 4th bucket.

When 13 is inserted same process is done, but it end up to the same bucket. But here is an overflow, we need to split 2nd bucket.

Here is the final hash table.

s is moved to the top again as one cycle is completed. Now s will travel from 0 to 5th bucket, then 0 to 12, etc;

3 thoughts on “Linear Hashing