Hashing Algorithm
A hash function takes any input and produces a fixed-size number. Same input, same output. Always.
21 Mar 2024

A hash function takes any input and produces a fixed-size number. Same input, same output. Always.
That's it. That's hashing. But from this simple idea, you get hash maps, password storage, data integrity checks, and distributed caching.
What Makes a Good Hash Function
Three properties matter:
- Deterministic — same key always gives the same hash.
- Uniform distribution — hashes spread evenly across the output space. Clusters cause collisions.
- Fast — hashing should be cheaper than the operations it enables.
A Simple Hash Function
function simpleHash(key, tableSize) {
let hash = 0;
for (let i = 0; i < key.length; i++) {
hash += key.charCodeAt(i);
}
return hash % tableSize;
}
console.log(simpleHash("hello", 10)); // 2
console.log(simpleHash("world", 10)); // 2 — collision!
This sums character codes and takes the modulo. It works, but it's a bad hash function. "hello" and "olleh" produce the same hash. Anagrams always collide.
A Better Hash Function
Weight each character by its position.
function betterHash(key, tableSize) {
let hash = 0;
const prime = 31;
for (let i = 0; i < key.length; i++) {
hash = (hash * prime + key.charCodeAt(i)) % tableSize;
}
return hash;
}
console.log(betterHash("hello", 10)); // 2
console.log(betterHash("olleh", 10)); // 8 — different!
The prime multiplier makes the hash sensitive to character order. This dramatically reduces collisions.
Handling Collisions
No hash function eliminates collisions entirely. Two strategies:
- Chaining — each bucket holds a list. Collisions just add to the list. Simple. Works well with a good hash function.
- Open addressing — if a slot is taken, probe the next slot (linear probing, quadratic probing, or double hashing). Better cache performance but more complex.
Complexity
- Hash computation: O(k) where k is the key length.
- Lookup with good hash: O(1) average.
- Lookup with bad hash: O(n) worst case (everything in one bucket).
The Trade-off
Hashing gives you O(1) lookups — but only on average. The worst case is O(n). A good hash function and proper table sizing keep you in O(1) territory. A bad hash function turns your hash map into a linked list.
Hashing also doesn't preserve order. If you need sorted data or range queries, a tree structure is a better choice.