Collision Resolution Strategies

Open Address Method: Every key is stored in the table itself. When the home slot H1(key) is in use, the method follows a 'probe sequence' H1(key) + Hm(key) (mod tablesize) for m = 1, 2, 3, 4, ...

Hm tells you which slot to try next if the previous slots were all in use.
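
A minimal sketch of the probe sequence, written in Python for illustration. The names probe_sequence, h1 and linear are hypothetical, keys are assumed to be integers, and the sketch assumes the home slot H1(key) is tried before any Hm offset is added.

```python
def probe_sequence(key, table_size, h1, hm):
    """Yield the slots to try for key, in order."""
    home = h1(key) % table_size
    yield home                                   # assumption: the home slot is tried first
    for m in range(1, table_size):
        yield (home + hm(key, m)) % table_size   # H1(key) + Hm(key) (mod tablesize)


def h1(key):
    # Hypothetical home-address function: the integer key, reduced mod tablesize above.
    return key


def linear(key, m):
    # One possible choice of Hm: the offset is simply m.
    return m


slots = list(probe_sequence(27, 7, h1, linear))
print(slots)  # [6, 0, 1, 2, 3, 4, 5]
```

Passing a different hm function changes the order in which the slots are visited without changing the rest of the code.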

Linear Rehashing: This is the open address method that uses Hm(key) = m. The probe sequence is H1(key) + m (mod tablesize). The main advantage is simplicity. A disadvantage is that a deleted cell has to be marked "deleted" rather than empty, so that searches do not stop too early, and a proliferation of "deleted" cells can clutter the table and slow searches. One partial solution is to move elements up after deletions, or to do such compaction from time to time.
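
The sketch below shows one way the "deleted" marks might work with linear rehashing. It is an illustration under assumptions, not a reference implementation: the class name LinearProbingTable and the sentinels EMPTY and DELETED are hypothetical, keys are integers, and H1(key) is taken to be key mod tablesize.

```python
EMPTY = object()     # slot has never held a key
DELETED = object()   # slot held a key that was later removed


class LinearProbingTable:
    def __init__(self, size):
        self.slots = [EMPTY] * size

    def _probe(self, key):
        home = key % len(self.slots)            # assumed H1(key)
        for m in range(len(self.slots)):
            yield (home + m) % len(self.slots)  # H1(key) + m (mod tablesize)

    def find(self, key):
        for i in self._probe(key):
            cell = self.slots[i]
            if cell is EMPTY:                   # a truly empty slot ends the search
                return None
            if cell is not DELETED and cell == key:
                return i
        return None

    def insert(self, key):
        first_free = None
        for i in self._probe(key):
            cell = self.slots[i]
            if cell is EMPTY:
                if first_free is None:
                    first_free = i
                break
            if cell is DELETED:
                if first_free is None:
                    first_free = i              # remember it, but keep looking for key
            elif cell == key:
                return i                        # already present
        if first_free is None:
            raise OverflowError("table is full")
        self.slots[first_free] = key
        return first_free

    def delete(self, key):
        i = self.find(key)
        if i is not None:
            self.slots[i] = DELETED             # mark the cell, do not empty it
        return i


table = LinearProbingTable(7)
for key in (10, 17, 24):       # all three keys hash to slot 3
    table.insert(key)
table.delete(17)               # slot 4 becomes a "deleted" cell
found_at = table.find(24)      # the search steps over slot 4 and reaches slot 5
```

If delete reset the slot to EMPTY instead, the search for 24 would stop at slot 4 and report the key as missing. The price is that "deleted" cells still lengthen every search that passes over them.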

External Chaining: Each table slot contains a linked list which can grow dynamically.
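
A short sketch of external chaining, assuming integer keys and a home address of key mod tablesize. The names Node and ChainedTable are made up for this illustration.

```python
class Node:
    def __init__(self, key, next_node=None):
        self.key = key
        self.next = next_node


class ChainedTable:
    def __init__(self, size):
        self.heads = [None] * size       # one (initially empty) linked list per slot

    def contains(self, key):
        node = self.heads[key % len(self.heads)]
        while node is not None:
            if node.key == key:
                return True
            node = node.next
        return False

    def insert(self, key):
        i = key % len(self.heads)
        if not self.contains(key):
            self.heads[i] = Node(key, self.heads[i])   # push on the front of the list

    def delete(self, key):
        i = key % len(self.heads)
        prev, node = None, self.heads[i]
        while node is not None:
            if node.key == key:
                if prev is None:
                    self.heads[i] = node.next          # unlink the first node
                else:
                    prev.next = node.next              # unlink an interior node
                return True
            prev, node = node, node.next
        return False

    def chain(self, i):
        """Return the keys on the list at slot i, front to back."""
        keys, node = [], self.heads[i]
        while node is not None:
            keys.append(node.key)
            node = node.next
        return keys


table = ChainedTable(5)
for key in (27, 32, 37):       # all three keys hash to slot 2
    table.insert(key)
before = table.chain(2)        # [37, 32, 27]
table.delete(32)
after = table.chain(2)         # [37, 27]
```

Deletion here is an ordinary linked-list unlink, so no "deleted" marks are needed.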

Coalesced Chaining: The table contains link fields and a cellar for overflow.



0
1
2       address region
3
4
----------------------------
5
6       cellar

H maps into the address region only (using division in this example, so H(key) = key mod 5). The cellar is for keys that need to be rehashed from the address region.

The rule is to place a colliding key into the empty place with the largest address (epla) and to link that place to the end of the chain that passes through the key's home cell.

The example below illustrates what happens when the key sequence is 27, 29, 32, 34, 37, 47, and 53.

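Note that the probe sequence for overflows from cell 3 'coalesces' in the end with the probe sequence for overflows from cell 2: key 53 hashes to cell 3, which already holds 37 from cell 2's chain, so 53 is appended to that same chain.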


table        table contents   table contents
address      (data)           (link)

0            53               nil
1            47               0    <-- coalesced link
2            27               6
3            37               1
4            29               5
--------------------------------------------
5            34               nil
6            32               3
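
The example can be replayed with a small sketch. It assumes distinct integer keys, H(key) = key mod 5, five address cells and a two-cell cellar. The function names build_coalesced_table and search are hypothetical, and None stands in for nil.

```python
def build_coalesced_table(keys, address_size=5, cellar_size=2):
    size = address_size + cellar_size
    data = [None] * size
    link = [None] * size                 # None plays the role of nil

    for key in keys:
        home = key % address_size        # H maps into the address region only
        if data[home] is None:
            data[home] = key
            continue
        # Collision: walk to the end of the chain that passes through the home cell.
        tail = home
        while link[tail] is not None:
            tail = link[tail]
        # Empty place with the largest address (epla); max() raises ValueError if the table is full.
        epla = max(i for i in range(size) if data[i] is None)
        data[epla] = key
        link[tail] = epla
    return data, link


def search(data, link, key, address_size=5):
    """Return (address, number of cells examined); address is None if key is absent."""
    i = key % address_size
    probes = 0
    while i is not None:
        probes += 1
        if data[i] == key:
            return i, probes
        i = link[i]
    return None, probes


data, link = build_coalesced_table([27, 29, 32, 34, 37, 47, 53])
print(data)   # [53, 47, 27, 37, 29, 34, 32]
print(link)   # [None, 0, 6, 1, 5, None, 3]
```

The two printed lists match the data and link columns of the table. A search for 53 starts at cell 3 and examines cells 3, 1 and 0, the first two of which hold keys that hashed to cell 2.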

Empirical studies show that a cellar about 15% of the size of the main table works well. The search effort is not much more than with external chaining.

Deletion can be done without resorting to marking records "deleted", but it is more complicated than in the case of external chaining because lists can coalesce. In particular, there is a problem finding the predecessor of a search key. (Vitter wrote extensively about coalesced chaining.)

Coalesced chaining tends to conserve memory better than open addressing or external chaining in cases where the hashed elements are small and load factors are high.