Hash Sets and Hash Maps: O(1) Average-Case Lookups
Master hash table fundamentals, collision handling, load factor, and practical applications of hash sets and maps.
Hash Sets and Hash Maps: O(1) Average-Case Lookups
Hash tables are the workhorses of computer science — O(1) average-case insertion, deletion, and lookup by converting keys into array indices via a hash function. This eliminates the linear scanning that makes arrays and linked lists painfully slow once your dataset grows.
This article covers how hash tables work, why they sometimes degrade, and how collisions are actually handled. You’ll see how they’re built into the tools you use every day: Python’s dict, Java’s HashMap, JavaScript’s object, and most caching layers in production systems. You’ll also understand the edge cases — what happens when collisions cluster, and why attackers can exploit predictable hash functions to trigger worst-case behavior.
The core of a hash table is the hash function, which maps arbitrary keys to array indices. A good one spreads keys evenly across the table, keeping collisions rare. When collisions do happen — because the key space is larger than the table, or two different keys just happen to hash to the same spot — there are two main ways to handle it: separate chaining (linked lists per bucket) and open addressing (finding another empty slot via probing). Each has real trade-offs in performance and memory that show up under heavy load or adversarial input.
Hash Function Fundamentals
def hash_function(key, table_size):
"""
Simple hash functions for different key types.
A good hash function:
- Distributes keys uniformly across table
- Is deterministic (same input → same output)
- Is fast to compute
"""
# For integers
if isinstance(key, int):
return key % table_size
# For strings (polynomial rolling hash)
if isinstance(key, str):
hash_val = 0
for char in key:
hash_val = (hash_val * 31 + ord(char)) % table_size
return hash_val
# For tuples
if isinstance(key, tuple):
hash_val = 0
for item in key:
hash_val = (hash_val * 31 + hash(item)) % table_size
return hash_val
# For custom objects - use their __hash__ method
return hash(key) % table_size
# Better hash for strings - polynomial with different bases
def better_string_hash(s, table_size, base=37, mod=10**9 + 9):
"""Reduces collisions for similar strings."""
hash_val = 0
power = 1
for char in s:
hash_val = (hash_val + (ord(char) - ord('a') + 1) * power) % mod
power = (power * base) % mod
return hash_val % table_size
Hash Map Implementation
class HashMap:
"""
Hash map with separate chaining for collision resolution.
O(1) average case for insert, lookup, delete.
"""
def __init__(self, initial_capacity=16, load_factor_threshold=0.75):
self.capacity = initial_capacity
self.size = 0
self.load_factor_threshold = load_factor_threshold
self.buckets = [[] for _ in range(self.capacity)] # Separate chaining
def _get_bucket(self, key):
"""Find bucket index for key."""
return hash(key) % self.capacity
def _resize(self):
"""Double capacity and rehash when load factor exceeded."""
old_buckets = self.buckets
self.capacity *= 2
self.buckets = [[] for _ in range(self.capacity)]
self.size = 0
# Rehash all existing key-value pairs
for bucket in old_buckets:
for key, value in bucket:
self.insert(key, value)
def insert(self, key, value):
"""O(1) average case insert or update."""
# Check load factor
if self.size / self.capacity >= self.load_factor_threshold:
self._resize()
bucket = self.buckets[self._get_bucket(key)]
# Update existing key
for i, (k, v) in enumerate(bucket):
if k == key:
bucket[i] = (key, value)
return
# Insert new key-value pair
bucket.append((key, value))
self.size += 1
def get(self, key, default=None):
"""O(1) average case lookup."""
bucket = self.buckets[self._get_bucket(key)]
for k, v in bucket:
if k == key:
return v
return default
def delete(self, key):
"""O(1) average case deletion."""
bucket = self.buckets[self._get_bucket(key)]
for i, (k, v) in enumerate(bucket):
if k == key:
del bucket[i]
self.size -= 1
return True
return False
def contains(self, key):
"""O(1) average case membership check."""
return self.get(key, None) is not None
def __len__(self):
return self.size
def __getitem__(self, key):
val = self.get(key)
if val is None:
raise KeyError(key)
return val
def __setitem__(self, key, value):
self.insert(key, value)
Hash Set Implementation
class HashSet:
"""
Hash set with separate chaining.
Stores unique elements with O(1) average insert, delete, lookup.
"""
def __init__(self, initial_capacity=16, load_factor_threshold=0.75):
self.capacity = initial_capacity
self.size = 0
self.load_factor_threshold = load_factor_threshold
self.buckets = [[] for _ in range(self.capacity)]
def _get_bucket(self, item):
return hash(item) % self.capacity
def add(self, item):
"""O(1) average case add."""
if self.contains(item):
return
if self.size / self.capacity >= self.load_factor_threshold:
self._resize()
bucket = self.buckets[self._get_bucket(item)]
bucket.append(item)
self.size += 1
def remove(self, item):
"""O(1) average case removal."""
bucket = self.buckets[self._get_bucket(item)]
for i, elem in enumerate(bucket):
if elem == item:
del bucket[i]
self.size -= 1
return True
return False
def contains(self, item):
"""O(1) average case membership check."""
bucket = self.buckets[self._get_bucket(item)]
return any(elem == item for elem in bucket)
def union(self, other):
"""Combine two sets."""
result = HashSet(max(self.capacity, other.capacity))
for bucket in self.buckets:
for elem in bucket:
result.add(elem)
for bucket in other.buckets:
for elem in bucket:
result.add(elem)
return result
def intersection(self, other):
"""Common elements."""
smaller = self if self.size < other.size else other
larger = other if smaller == self else self
result = HashSet(smaller.capacity)
for bucket in smaller.buckets:
for elem in bucket:
if larger.contains(elem):
result.add(elem)
return result
Collision Resolution
| Method | Description | Pros | Cons |
|---|---|---|---|
| Separate Chaining | Linked list per bucket | Simple, handles many collisions | Extra memory for pointers |
| Open Addressing | Find next empty slot | No pointers, cache-friendly | Clustering issues |
| Linear Probing | Check next sequential slot | Simple | Primary clustering |
| Quadratic Probing | Check slots by quadratic function | Less clustering | Secondary clustering |
| Double Hashing | Second hash for step size | Best distribution | More computation |
# Open addressing with linear probing
class LinearProbingHashMap:
def __init__(self, size=16):
self.size = size
self.keys = [None] * size
self.values = [None] * size
def _probe(self, key, i):
"""Linear probing: (hash + i) % size."""
return (hash(key) + i) % self.size
def insert(self, key, value):
for i in range(self.size):
index = self._probe(key, i)
if self.keys[index] is None or self.keys[index] == key:
self.keys[index] = key
self.values[index] = value
return True
raise Exception("Hash table full")
def get(self, key):
for i in range(self.size):
index = self._probe(key, i)
if self.keys[index] is None:
return None
if self.keys[index] == key:
return self.values[index]
return None
When to Use Hash Tables
Use hash tables when:
- You need O(1) lookup by key
- You’re doing frequency counting or caching
- You need to check membership in large sets
- You’re implementing associative arrays or dictionaries
Don’t use hash tables when:
- You need ordered data (use trees instead)
- Key space is very small (array is simpler)
- You need range queries ( BST is better)
Hash Table Trade-Offs
| Aspect | Hash Table | Alternatives |
|---|---|---|
| Lookup | O(1) average | BST: O(log n) |
| Insert | O(1) average | BST: O(log n) |
| Ordered iteration | No | BST: yes, O(n) |
| Memory | O(n) with overhead | BST: O(n) |
| Worst case | O(n) collision | BST: O(log n) always |
| Predictable timing | No (varies with hash) | BST: yes |
When hash tables win: Sorted order not needed, average-case speed matters, keys are hashable with good distribution, memory is available.
When to prefer BST: Range queries needed, worst-case guarantee required, keys not well-suited for hashing, memory is tight.
Hash Table Performance
| Operation | Average Case | Worst Case |
|---|---|---|
| Insert | O(1) | O(n) |
| Lookup | O(1) | O(n) |
| Delete | O(1) | O(n) |
| Space | O(n) | O(n) |
Worst case occurs when all keys hash to the same bucket (poor hash function or malicious input).
Production Failure Scenarios and Mitigations
Hash tables fail in predictable ways that exploit their core design trade-offs.
Hash collision attack causing DoS
Scenario: An attacker discovers that your hash function produces predictable collisions. By submitting keys that all hash to the same bucket, they force O(n) operation per lookup instead of O(1). With sufficient input volume, this degrades service response times for all users.
Mitigation: Use a hash function with secret salt (SipHash) or one designed forDoS resistance (FNV-1a, MurmurHash3). Set maximum key length limits. Monitor per-bucket chain length and alert when distributions exceed statistical norms.
Resize storm during memory pressure
Scenario: When a hash table reaches its load factor threshold, it doubles its bucket count and rehashes all entries. For large tables, this triggers massive allocation and copying. If the system is already near memory limits, the resize can push it over, causing an OOM kill.
Mitigation: Pre-allocate based on expected size when possible. Use incremental resizing strategies (bucket-cuckoo rehashing) for large tables. Set alarms on memory usage approaching thresholds before resize begins.
Integer overflow in bucket index calculation
Scenario: A hash function returns a very large unsigned integer. When computing hash % num_buckets, if num_buckets is also large and the multiplication overflows, the bucket index becomes unpredictable, causing out-of-bounds access or data corruption.
Mitigation: Use saturated arithmetic or check for overflow before modulus operation. Validate bucket count stays within safe bounds. In languages without overflow protection, use language-level checked arithmetic or pre-compute bucket index with safe operations.
Resizing-induced latency spikes blocking GC threads
Scenario: In managed runtimes (JVM, .NET), a large hash table resize triggers allocation of a new bucket array and copying of millions of entries. This can block GC threads or trigger a full GC, causing multi-second latency spikes for all requests.
Mitigation: Size hash tables based on known capacity needs rather than letting them auto-resize. Use concurrent hash tables (ConcurrentHashMap) for multi-threaded access. Monitor GC pause times correlated with hash table operations.
Quick Recap Checklist
- Hash tables provide O(1) average-case for insert, lookup, delete
- Hash function maps keys to array indices
- Collisions resolved via chaining or open addressing
- Load factor = items / capacity; triggers resize when exceeded
- Open addressing has better cache locality but clustering problems
- Python dict and set use hash tables with open addressing
Observability Checklist
Keep hash table operations healthy by tracking these key signals.
Core Metrics
- Hash table size (number of entries)
- Bucket count and load factor (entries / bucket)
- Average chain length per bucket (size / bucket_count)
- Read/write operation latency (p50, p95, p99)
- Resize event frequency and duration
- Collision rate (chained vs single-element buckets)
Health Signals
- Load factor exceeding 0.75 (resize threshold approach)
- Average chain length exceeding 5 (hash function issues or attack)
- Resize duration > 100ms (large table resize blocking)
- Memory usage approaching configured limits
- Operation latency p99 > 10ms for single operations
Alerting Thresholds
- Load factor > 0.75: warning, prepare for resize
- Load factor > 0.9: critical, immediate resize needed
- Average chain length > 10: investigate hash function or attack
- Resize duration > 500ms: alert, operation blocking
- Memory usage > 85% of limit: alert
Distributed Tracing
- Trace hash table operations end-to-end
- Include table size, bucket count, and hit/miss in span attributes
- Correlate slow operations with resize events or GC pauses
- Track operation counts and error rates per table instance
Security Notes
Hash tables present specific security risks in adversary-exposed or multi-tenant environments.
HashDoS via algorithmic complexity attack
Risk: Hash tables with poor hash functions allow attackers to craft inputs that all collide, degrading lookups from O(1) to O(n). For web frameworks that use hash tables for parameter parsing, this can cause request processing to hang.
Mitigation: Use hash functions with non-deterministic output (SipHash with secret key) or randomized seeds per process restart. Set maximum input size for any key that will be hashed. Monitor request processing time distributions for anomalies.
Cache timing attacks on hash table lookups
Risk: If hash table lookup timing varies with which bucket is accessed (due to CPU cache effects), an attacker could infer the hash table’s internal structure by measuring access times, potentially revealing information about what keys are stored.
Mitigation: Use constant-time hash comparison for sensitive operations. In high-security contexts, consider using lookups that access all buckets or use blinding operations to equalize timing.
Integer overflow enabling buffer overflow
Risk: When computing hash(key) % num_buckets, integer overflow in the hash computation or modulus can produce out-of-bounds bucket indices. If the language allows unsafe array access, this could lead to memory corruption.
Mitigation: Use languages with overflow-safe arithmetic, or explicitly check for overflow before modulus. Validate bucket count and ensure bucket index stays within allocated bounds before any array access.
Key iteration exposing internal state
Risk: Iterating over a hash table in multi-tenant systems can expose the internal bucket structure, revealing how many entries exist and potentially which keys. If the iteration order depends on key values, it may leak information about key patterns.
Mitigation: Avoid exposing hash table iteration in multi-tenant contexts without proper authorization. Use immutable snapshot copies for iteration rather than live table access.
Interview Questions
Since Python 3.7, dicts are guaranteed to maintain insertion order. This was always true for CPython implementation, but only officially guaranteed since 3.7. The hash table uses open addressing with a hybrid probing scheme— it preserves insertion order because items are never relocated after insertion (except during resize). This made OrderedDict deprecated for most use cases. Note that this is an implementation detail, not a theoretical guarantee of hash tables in general.
When load factor exceeds the threshold (typically 0.7-0.75), performance degrades because collisions increase. The table is resized: capacity doubles, and all items are rehashed into the new table. Each item's new bucket is recomputed since the modulo changes. This is why hash table operations are amortized O(1)—occasional O(n) resize is spread across many operations. If you know the expected size upfront, pre-size the table to avoid resizing.
A hash collision occurs when two different keys hash to the same index. This is inevitable by the pigeonhole principle when keys > table slots. Resolution methods: Separate chaining stores multiple items per bucket in a linked list. Open addressing finds another empty slot via probing (linear, quadratic, or double hashing). Good hash functions minimize collisions; proper load factor management limits their impact.
Define __hash__ and __eq__ methods consistently.
If two objects are equal (__eq__ returns True), they
must have the same hash value. Choose hashable fields as
your hash basis. For example, a Person class with first and last name should
hash on (first_name, last_name) and compare equality the same way. If you want
case-insensitive matching, normalize case in both methods. Never define hash
based on mutable fields that might change while the object is in a set.
In the worst case, all operations degrade to O(n). This happens when:
- All keys hash to the same bucket due to a poor hash function (e.g., a constant hash function)
- Malicious input exploits predictable hashing (HashDoS attack)
- Load factor becomes very high, causing long probe sequences or bucket chains
- The hash function does not distribute keys uniformly across buckets
The O(n) worst case is why production systems use randomized hash functions (like SipHash) and maintain load factor thresholds (typically 0.7-0.75).
Python uses open addressing with a hybrid probing scheme based on the Perturbed PyPy-like table design:
- Uses a single contiguous array of entries (key/ value for dict, key-only for set)
- When a collision occurs, it probes using:
index = (hash | perturb) & maskwhere perturb is updated by right-shifting each iteration - This combines elements of linear probing and pseudo-random probing
- Deleted entries are marked as "dummy" (tombstone) to preserve probe chains
- Resizing reconstructs the table from scratch, eliminating all tombstones
Separate Chaining:
- Each bucket holds a linked list (or tree) of entries
- Handles many collisions gracefully — table never fills up
- Requires more memory for pointer overhead
- Cache performance is poor due to non-contiguous memory access
- Deletion is simple — just remove from the chain
Open Addressing:
- Store entries directly in table slots; find next free slot on collision
- More memory-efficient (no pointers), better cache locality
- Table can fill up completely (requires resize)
- Suffers from clustering problems (primary, secondary)
- Deletion requires tombstones to preserve probe sequences
When load factor exceeds the threshold (typically 0.75):
- A new bucket array of double the current capacity is allocated
- Every existing key-value pair is rehashed and inserted into the new table
- Rehashing is necessary because
index = hash(key) % old_capacitychanges when capacity changes
Resize is amortized O(1) because the occasional O(n) resize cost is spread across the n operations that caused it. Doubling capacity ensures that between resizes, at least n/2 insertions occur, each paying a small constant "share" of the resize cost.
A good hash function has these key properties:
- Deterministic: Same input always produces the same hash value
- Uniform distribution: Keys are spread evenly across the entire hash space
- Fast to compute: Hash computation overhead should not dominate operation cost
- Avalanche effect: Small changes in input (one bit flip) produce significantly different hash outputs
- Non-invertible: Given a hash value, it should be hard to determine the original key
For security-sensitive applications, the function should also be non-deterministic to attackers (using a random seed or secret salt).
A custom object must satisfy two requirements:
- Implement
__hash__()method returning an integer - Implement
__eq__()method for equality comparison
Critical invariants:
- If
a == bis True, thenhash(a) == hash(b)must also be True - The hash value should be based on immutable fields only — if an object's hash changes while it is stored in a set, it becomes impossible to retrieve
- Use immutable fields for hashing: strings, numbers, tuples of immutable types
A hash set stores only keys (unique elements) with no associated values — it is used for membership testing and set operations (union, intersection, difference).
A hash map (dict) stores key-value pairs — each key maps to an associated value, enabling lookup by key, frequency counting, caching, and associative array semantics.
Implementation-wise, they share the same underlying hash table structure. A hash set can be seen as a hash map where values are ignored or set to a sentinel value (e.g., True or None).
Key differences:
- Collision resolution: Java uses separate chaining (linked list → red-black tree when chain length > 8); Python uses open addressing
- Initial capacity: Java defaults to 16 (power of 2); Python chooses a prime-based size automatically
- Load factor: Both default to 0.75
- Null keys: Java allows one null key (hash = 0); Python
dictallows None as a key (it is hashable) - Insertion order: Python dicts maintain insertion order (since 3.7); Java HashMap does not guarantee order
- Treeification: Java converts long chains to balanced trees (O(log n) worst-case); Python does not
HashDoS is an algorithmic complexity attack where an attacker submits many keys that all hash to the same bucket, degrading hash table operations from O(1) to O(n). For web frameworks parsing query parameters into hash tables, this can bring down the server.
Mitigation strategies:
- Randomized hash functions: SipHash uses a secret key per-process, making collision prediction infeasible
- Key length limits: Restrict the maximum size of keys that can be submitted
- Rate limiting: Limit request sizes and parameter counts
- Treeification: Convert long chains to balanced trees (used in Java 8+)
- Monitoring: Alert on unusually deep buckets or degraded lookup latencies
Double hashing uses a second hash function to determine the probe step size: index = (hash1(key) + i * hash2(key)) % table_size. This ensures each key follows a unique probe sequence.
Advantages over linear probing:
- Eliminates primary clustering (long runs of occupied slots)
- Probe sequences differ for different keys that share the same initial bucket
Advantages over quadratic probing:
- Eliminates secondary clustering (same initial bucket → same probe sequence)
- Does not suffer from the "probe cycles" that can leave empty slots unreachable
Disadvantage: Computing a second hash function adds computational overhead, and hash2 must never produce 0 (which would cause infinite probing).
The average-case O(1) relies on the hash function distributing keys uniformly across buckets (simple uniform hashing assumption). Under this assumption, each bucket gets roughly n/m entries (the load factor α).
With separate chaining, finding an entry in a bucket requires traversing α elements on average — which is O(1) because α is kept constant (e.g., ≤ 0.75) via resizing.
The O(n) worst case occurs when all keys hash to the same bucket, turning the table into a single linked list. This can happen with:
- A degenerate (constant) hash function
- Malicious input crafted to exploit a known hash function
- All keys being identical or having identical hash codes
There are several design approaches with different trade-offs:
- Coarse-grained locking: One lock for the entire table. Simple but poor concurrency (sequentializes all access).
- Striped / segment locking: Divide the table into independently locked segments (Java's ConcurrentHashMap before Java 8). Multiple threads can access different segments concurrently.
- Lock-free / CAS-based: Use atomic compare-and-swap operations (Java's ConcurrentHashMap in Java 8+, using synchronized on individual buckets and red-black trees for long chains).
- Read-copy-update (RCU): Readers access without locks; writers create a copy, modify, then atomically swap. Excellent for read-heavy workloads.
Cuckoo hashing uses two (or more) hash functions, each mapping to a separate table. On insertion:
- Check both possible positions using hash1(key) and hash2(key)
- If one is empty, place the key there
- If both are occupied, evict (displace) one occupant to its alternate position
- The evicted key then attempts its other position, potentially displacing another key
- If a cycle is detected (too many evictions), the tables are rehashed with new hash functions
The main advantage is O(1) worst-case lookup — you only need to check at most k positions. This is critical for real-time systems and latency-sensitive applications where the O(n) worst case of traditional hash tables is unacceptable.
Python's dict and set resize by:
- Allocating a new array of entries (2x current capacity, rounded to a power of 2 above a minimum)
- Recomputing the index for every entry using the new capacity (
index = hash(key) & (new_size - 1)) - The old array is kept alive until all references are released; this can temporarily double memory usage
- Python uses lazy resizing on PyPy-like implementations — the old table is kept as a fallback and entries are moved incrementally on access
- In CPython, resizing is an all-at-once operation — this is why pre-sizing dicts with known capacity avoids resize latency spikes
The contract between hashCode() and equals() in Java:
- If two objects are equal by
equals(), they must have the samehashCode() - If two objects have the same
hashCode(), they may or may not be equal (hash collision) hashCode()should be consistent — return the same value across multiple invocations within the same program execution
When an object is used as a HashMap key, Java first checks the hash code to find the bucket, then uses equals() to find the exact key within that bucket. Violating this contract (e.g., implementing equals() without hashCode()) means lookups will silently fail to find keys.
Robin Hood hashing is an open addressing variant that reduces probe length variance by enforcing a fairness policy:
- Each entry tracks its "probe distance" (how far from its ideal bucket it ended up)
- When inserting a new entry, if its probe distance exceeds that of an existing entry it encounters during probing, it swaps places with the richer entry
- The displaced entry continues probing from that position
This has two key benefits:
- Probe lengths become more uniform — the maximum probe length is bounded by O(log n) instead of potentially O(n)
- Average and worst-case performance is more predictable, which matters for real-time and latency-sensitive applications
Further Reading
Hash tables are a deep topic with rich theoretical foundations and many specialized variants. Below are curated resources for deeper exploration.
Core Theory
- Introduction to Algorithms (CLRS), Chapters 11-13 — The definitive reference on hash table theory, covering hash functions, collision resolution, universal hashing, and perfect hashing.
- The Art of Computer Programming, Vol. 3 (Knuth), Section 6.4 — Knuth’s deep treatment of hashing, including analysis of probing sequences and expected performance.
- “Universal Classes of Hash Functions” (Carter & Wegman, 1979) — The foundational paper on universal hashing, proving that randomized hash families reduce collision probability.
Advanced Variants
- Cuckoo Hashing (Pagh & Rodler, 2004) — A scheme using two hash functions with O(1) worst-case lookup. Items are displaced (“cuckooed”) to alternate positions on collision.
- Robin Hood Hashing — Open addressing variant that reduces variance in probe lengths by “stealing” slots from richer neighbors, giving more predictable performance.
- Hopscotch Hashing (Herlihy, Shavit & Tzafrir, 2008) — Cache-friendly concurrent hash table design with bounded probe lengths.
- Hash Array Mapped Tries (HAMT) — Persistent, immutable hash tables used in functional programming languages (Clojure, Scala, Elixir) that provide efficient structural sharing.
Production & Language-Specific
- Python’s dict implementation (PyDictObject) — PEP 456 describes the SipHash-based randomized hashing adopted in Python 3.4+. The CPython source is a well-commented reference.
- Java’s HashMap (java.util.HashMap) — Uses separate chaining with treeification (converts linked list to red-black tree when chain length exceeds 8) to guarantee O(log n) worst-case for hash collisions.
- Go’s map implementation — Uses an unusual bucket-based approach with key-value pairs stored in a contiguous array per bucket, optimized for cache performance.
- Redis hash tables — Incremental rehashing with two tables simultaneously during resize to avoid latency spikes.
Performance & Security
- SipHash: A Fast Short-Input PRF (Aumasson & Bernstein, 2012) — The hash function designed to prevent HashDoS attacks, now used in Python, Rust, Ruby, and many other languages.
- “Understanding Hash Table Performance” (Malte Skarupke) — A practical blog post with benchmarks comparing various hash table implementations across real-world workloads.
- Google’s SwissTable / Abseil flat_hash_map — A high-performance open-addressing design that uses metadata bytes (SSE-optimized) for fast probing. Used internally at Google and widely adopted.
Conclusion
Hash tables provide O(1) average-case insertion, lookup, and deletion by mapping keys to array indices via a hash function, with collisions resolved through separate chaining or open addressing. Load factor (items/capacity) triggers resize when exceeded—typically at 0.7-0.75—causing occasional O(n) rehashing but keeping amortized operations O(1). Separate chaining handles collisions with linked lists per bucket; open addressing finds alternative slots via probing but suffers clustering. Python dicts maintain insertion order as an implementation detail and use open addressing with a hybrid probing scheme.
Category
Related Posts
Arrays vs Linked Lists: Understanding the Trade-offs
Compare arrays and linked lists in terms of access time, insertion/deletion efficiency, memory usage, and cache performance.
Arrays: 1D, 2D, and Multi-dimensional Mastery
Master array operations, traversal, search, common patterns like two-pointer and sliding window, and when to use multi-dimensional arrays.
AVL Trees: Self-Balancing Binary Search Trees
Master AVL tree rotations, balance factors, and rebalancing logic. Learn when to use AVL vs Red-Black trees for your use case.