CRDTs: Conflict-Free Replicated Data Types
Learn how CRDTs enable strongly consistent eventual consistency in distributed databases. Explore G-Counters, PN-Counters, LWW-Registers, OR-Sets, and practical applications.
Introduction
Traditional conflict resolution requires either:
- Coordination: A central node sequences all writes. This is a bottleneck and a single point of failure.
- Application-level resolution: The application developer writes complex merge logic. This is error-prone.
- Last-write-wins: The timestamp decides. This loses data and produces unexpected results.
CRDTs offer a fourth option: the data structure itself guarantees conflict-free merging.
graph TD
subgraph Without CRDT
N1[Node A] --> W1[Write: x=1]
N2[Node B] --> W2[Write: x=2]
M1[Merge] -->|conflict| X[???]
end
subgraph With CRDT
N3[Node A] --> W3[Write: x=1]
N4[Node B] --> W4[Write: x=2]
M2[Merge] --> R[Both values preserved]
end
Core Concepts
A CRDT is a data type where any two valid states can be merged and the result is also a valid state. This property is called commutativity: the order of merging does not matter.
Mathematically, CRDTs come in two flavors:
- CmRDT (Commutative Replicated Data Types): Operations commute
- CvRDT (Convergent Replicated Data Types): States converge when merged
In practice, most implementations use CvRDTs because they are easier to reason about. The state itself is designed to merge cleanly.
G-Counter: Grow-Only Counter
The simplest CRDT. It only increments. No decrements. This works for things like page view counts where you never need to subtract.
class GCounter:
"""
Grow-only counter. Each node maintains its own count.
Merging takes the maximum for each node.
"""
def __init__(self):
self.counts = {} # node_id -> count
def increment(self, node_id):
self.counts[node_id] = self.counts.get(node_id, 0) + 1
def merge(self, other: 'GCounter'):
for node_id, count in other.counts.items():
self.counts[node_id] = max(
self.counts.get(node_id, 0),
count
)
def value(self) -> int:
return sum(self.counts.values())
graph TD
N1[Node A: 5] --> M[Merge]
N2[Node B: 3] --> M
M --> R[Total: 8]
PN-Counter: Positive-Negative Counter
A PN-Counter extends G-Counter to support both increments and decrements. It maintains two G-Counters: one for increments, one for decrements.
class PNCounter:
"""
Positive-Negative counter. Supports both increments and decrements.
Uses two G-Counters internally.
"""
def __init__(self):
self.positive = GCounter() # increments
self.negative = GCounter() # decrements
def increment(self, node_id, amount=1):
for _ in range(amount):
self.positive.increment(node_id)
def decrement(self, node_id, amount=1):
for _ in range(amount):
self.negative.increment(node_id)
def merge(self, other: 'PNCounter'):
self.positive.merge(other.positive)
self.negative.merge(other.negative)
def value(self) -> int:
return self.positive.value() - self.negative.value()
A use case: a shopping cart where you add and remove items. The PN-Counter tracks the cart size even when concurrent updates happen.
LWW-Register: Last-Write-Wins Register
A register holds a single value. When merging, the write with the latest timestamp wins.
import time
from dataclasses import dataclass
@dataclass
class LWWRegister:
value: any
timestamp: float
node_id: str
class LWWMap:
"""
Last-Write-Wins map. Each key has a (value, timestamp, node_id) tuple.
On merge, the entry with highest timestamp wins.
Ties broken by node_id.
"""
def __init__(self):
self.entries = {} # key -> LWWRegister
def set(self, key, value, timestamp=None, node_id=None):
self.entries[key] = LWWRegister(
value=value,
timestamp=timestamp or time.time(),
node_id=node_id
)
def get(self, key):
return self.entries.get(key)
def merge(self, other: 'LWWMap'):
for key, reg in other.entries.items():
if key not in self.entries:
self.entries[key] = reg
else:
# Compare timestamps, then node_id for ties
ours = self.entries[key]
if (reg.timestamp > ours.timestamp or
(reg.timestamp == ours.timestamp and reg.node_id > ours.node_id)):
self.entries[key] = reg
sequenceDiagram
participant N1 as Node A
participant N2 as Node B
participant M as Merge
N1->>N1: set(x=1, ts=100)
N2->>N2: set(x=2, ts=101)
N1->>M: state: x=(1, 100, A)
N2->>M: state: x=(2, 101, B)
M->>M: ts=101 > ts=100, keep B
Note over M: Result: x=2
OR-Set: Observed-Remove Set
An OR-Set lets you add and remove elements. Concurrent adds and removes are handled correctly.
class ORObject:
"""
Observed-Remove Set entry.
Each (tag, value) pair can be added and removed.
"""
def __init__(self, value, tag):
self.value = value
self.tag = tag
class ORSet:
"""
Observed-Remove Set. Elements have unique tags.
- add(tag, value): Adds element with unique tag
- remove(tag): Marks element as removed
- merge(): Union of adds, subtracts removes
"""
def __init__(self):
self.adds = {} # tag -> ORObject
self.removes = set() # tags that were removed
def add(self, value, tag=None):
tag = tag or str(uuid.uuid4())
self.adds[tag] = ORObject(value, tag)
return tag
def remove(self, tag):
self.removes.add(tag)
def contains(self, tag):
return tag in self.adds and tag not in self.removes
def merge(self, other: 'ORSet'):
# Union of adds
for tag, obj in other.adds.items():
if tag not in self.removes:
self.adds[tag] = obj
# Union of removes
self.removes.update(other.removes)
def elements(self):
return [obj.value for tag, obj in self.adds.items()
if tag not in self.removes]
OR-Sets are good for collaborative lists, shopping carts, and anywhere items are added and removed concurrently.
Tombstone accumulation is a practical issue with basic OR-Set implementations. When you remove an element, the tag lands in the removes set but never gets cleaned up. Run enough add/remove cycles and the removes set grows without bound. Picture this: add and remove a user 100 times across 5 nodes, and you now have 500 tombstone entries even though the set looks empty to anyone reading it. This is especially painful in memory-constrained environments. RGA and other advanced variants address this differently, using garbage collection or mergeable delete flags instead of permanent tombstones.
RGA: Replicated Growable Array
The RGA is a CRDT for ordered sequences. It handles concurrent inserts at the same position correctly.
class RGANode:
def __init__(self, value, timestamp, id=None):
self.value = value
self.timestamp = timestamp
self.id = id or str(uuid.uuid4())
class RGA:
"""
Replicated Growable Array. Uses timestamps and IDs
for ordering. Concurrent inserts at the same position
are ordered by ID to ensure determinism.
"""
def __init__(self):
self.nodes = [] # Ordered list of RGANodes
def insert_after(self, after_id, value):
node = RGANode(value, time.time())
# Insert at position after after_id
pos = self._find_position(after_id)
self.nodes.insert(pos + 1, node)
def _find_position(self, after_id):
for i, n in enumerate(self.nodes):
if n.id == after_id:
return i
return -1 # Insert at beginning if not found
def merge(self, other: 'RGA'):
# Merge by timestamp+ID ordering
all_nodes = {n.id: n for n in self.nodes}
all_nodes.update({n.id: n for n in other.nodes})
# Re-sort by timestamp, then by ID
self.nodes = sorted(
all_nodes.values(),
key=lambda n: (n.timestamp, n.id)
)
2P-Set: Add-Only-Then-Remove-Once
A Two-Phase Set (2P-Set) is stricter than an OR-Set: you can add freely, but once something is removed it cannot be re-added. Think feature flags, configuration keys, or any domain where deprecation is permanent.
The two phases are straightforward. During the add phase, elements land in AddSet. During the remove phase, they move to RemoveSet (a tombstone). Merge unions both sets. An element is present only when it is in AddSet and not in RemoveSet.
class TwoPhaseSet:
"""2P-Set: adds go to AddSet, removes go to RemoveSet. Once removed, never returns."""
def __init__(self):
self.add_set = set() # Elements that have been added
self.remove_set = set() # Elements that have been removed (tombstones)
def add(self, value):
if value not in self.remove_set:
self.add_set.add(value)
def remove(self, value):
self.remove_set.add(value)
def contains(self, value):
return value in self.add_set and value not in self.remove_set
def merge(self, other: 'TwoPhaseSet'):
self.add_set = self.add_set.union(other.add_set)
self.remove_set = self.remove_set.union(other.remove_set)
Good fit: versioned config where you deprecate keys and never re-enable them. Roll out a feature flag, then kill it permanently. A 2P-Set handles this correctly even if someone tries to add the flag again concurrently.
Cart-Document CRDT: Shopping Cart Without Conflicts
Shopping carts have to deal with concurrent adds, removes, and quantity updates from multiple devices hitting the same cart. A Cart-Document CRDT models this as a document of items, each with a product ID, quantity, and timestamp.
from dataclasses import dataclass
import time
@dataclass
class CartItem:
product_id: str
quantity: int
timestamp: float
class CartDocument:
"""Shopping cart as a CRDT document. Concurrent adds/updates/remove all merge cleanly."""
def __init__(self):
self.items = {} # product_id -> CartItem
def add_item(self, product_id: str, quantity: int, timestamp: float = None):
timestamp = timestamp or time.time()
if product_id not in self.items:
self.items[product_id] = CartItem(product_id, quantity, timestamp)
elif timestamp > self.items[product_id].timestamp:
self.items[product_id] = CartItem(product_id, quantity, timestamp)
def remove_item(self, product_id: str, timestamp: float = None):
timestamp = timestamp or time.time()
if product_id in self.items:
if timestamp > self.items[product_id].timestamp:
self.items[product_id] = CartItem(product_id, 0, timestamp)
def update_quantity(self, product_id: str, quantity: int, timestamp: float = None):
timestamp = timestamp or time.time()
if product_id in self.items and timestamp > self.items[product_id].timestamp:
self.items[product_id] = CartItem(product_id, quantity, timestamp)
elif product_id not in self.items:
self.items[product_id] = CartItem(product_id, quantity, timestamp)
def merge(self, other: 'CartDocument'):
for product_id, other_item in other.items.items():
if product_id not in self.items:
self.items[product_id] = other_item
else:
our_item = self.items[product_id]
# Take max quantity per product so neither addition gets lost
merged_quantity = max(our_item.quantity, other_item.quantity)
# Latest timestamp wins for everything else
if other_item.timestamp > our_item.timestamp:
self.items[product_id] = CartItem(
product_id,
merged_quantity,
other_item.timestamp
)
else:
self.items[product_id] = CartItem(
product_id,
merged_quantity,
our_item.timestamp
)
def get_items(self):
return [item for item in self.items.values() if item.quantity > 0]
The merge takes the max quantity per product_id so you never lose an item that was added on another device. Everything else uses timestamp-based last-write-wins.
JSON CRDT: Nested Structures Without Conflicts
Apps need to sync nested JSON-like structures: maps with string keys, arrays with ordered elements, scalars. Automerge and Yjs handle this by treating every operation (set, insert, delete) as a timestamped, actor-tagged unit that merges deterministically.
Automerge stores documents as operation trees. Each op carries a timestamp and actor ID. When merging, it computes the transitive closure of all ops and applies them in deterministic order (actor ID breaks ties).
Yjs takes a similar path, giving every array element and map key a unique ID. Concurrent changes resolve like this:
- Scalar values (strings, numbers, booleans): latest timestamp wins
- Map keys: latest timestamp wins per key
- Array elements: inserts get unique IDs; concurrent inserts at the same spot ordered by ID
- Nested objects: recursive merge, where each sub-object is itself a CRDT
class JSONScalar:
"""A scalar value with LWW semantics."""
def __init__(self, value, timestamp, actor):
self.value = value
self.timestamp = timestamp
self.actor = actor
def merge(self, other):
if other.timestamp > self.timestamp:
return other
elif other.timestamp == self.timestamp and other.actor > self.actor:
return other
return self
class JSONMap:
"""A JSON object/map with LWW semantics per key."""
def __init__(self):
self.fields = {} # key -> JSONValue
def set(self, key, value, timestamp, actor):
self.fields[key] = JSONScalar(value, timestamp, actor)
def get(self, key):
return self.fields.get(key)
def merge(self, other: 'JSONMap'):
for key, other_val in other.fields.items():
if key not in self.fields:
self.fields[key] = other_val
else:
self.fields[key] = self.fields[key].merge(other_val)
class JSONArray:
"""A JSON array with CRDT semantics for concurrent inserts."""
def __init__(self):
self.elements = [] # List of (id, JSONValue)
def insert(self, id, value, timestamp, actor):
self.elements.append((id, JSONScalar(value, timestamp, actor)))
def merge(self, other: 'JSONArray'):
# Union of elements, sorted by (timestamp, actor) for determinism
other_map = {e[0]: e[1] for e in other.elements}
self_map = {e[0]: e[1] for e in self.elements}
self_map.update(other_map)
self.elements = sorted(
[(id, val) for id, val in self_map.items()],
key=lambda x: (x[1].timestamp, x[1].actor)
)
Both Automerge and Yjs use structural sharing to avoid copying entire documents on every change. When you modify a deeply nested field, only the path to that field is copied; sibling branches are shared with the previous version. This makes CRDT documents efficient even when histories grow long.
Where CRDTs Are Used
Collaborative Editing
Figma uses CRDTs for real-time collaboration. Each edit is an operation that can be merged without conflicts. Multiple users can edit the same document simultaneously.
Distributed Databases
Riak uses CRDTs for certain data types. Cassandra’s lightweight transactions use a form of CRDT-like conflict resolution.
Messaging Systems
WhatsApp uses CRDTs for message synchronization. When you send a message offline, it is stored locally. When connectivity returns, it is merged with the server state without conflicts.
Shopping Carts
Amazon’s shopping cart reportedly uses CRDTs. Your cart is replicated across nodes. Concurrent modifications (add from phone, remove from desktop) merge correctly.
Implementing CRDTs in Practice
Most languages have CRDT libraries. For Python:
# Using the crdt library
from crdt import GCounter, PN_Counter, LWW_Register, OR_Map
# Create counters on different nodes
node_a_counter = PN_Counter()
node_b_counter = PN_Counter()
# Each node increments independently
node_a_counter.increment('node_a')
node_b_counter.increment('node_b')
# Merge when nodes communicate
node_a_counter.merge(node_b_counter)
# Result is the sum of all increments
print(node_a_counter.value)
Production Failure Scenarios
Data Integrity Issues
Tombstone Explosion in Production
A team deployed an OR-Set-based feature flag system at scale. Each A/B test toggle created new tags on add and accumulated tombstones on remove. After two years of continuous experimentation with thousands of flags being added and removed daily, the tombstone set grew to 8GB per replica. Memory-constrained edge nodes began crashing. The fix required implementing a generation-based garbage collection scheme with sliding-window compaction. Lesson: OR-Set tombstones must be bounded before production deployment at scale.
Clock Skew Causing LWW Data Loss
A distributed cache used LWW-Register for user session tokens. When a datacenter’s NTP server misbehaved for 30 minutes, clocks skewed forward by 5 minutes. Sessions updated during that window had inflated timestamps. After the skew was corrected, merges preferred the pre-skew timestamps, causing tokens to revert to old states. Users were logged out unexpectedly. The team switched to Hybrid Logical Clocks (HLC) and added timestamp sanity checks. Lesson: never trust wall-clock timestamps alone in distributed systems.
Merge-order Sensitivity with Malformed Input
An implementation of RGA allowed inserting after a non-existent ID (inserting at head without explicit marker). Two nodes independently inserted at “head” position concurrently, creating ordering ambiguity. When merged, the elements appeared in different orders on different replicas, causing UI divergence. The bug was in the positioning logic, not the CRDT itself. Lesson: CRDT correctness depends on correct implementation of the positioning/ordering logic.
Semantic and Application-Level Violations
Concurrent Decrement on PN-Counter Causing Negative Values
A banking application used PN-Counter for balance tracking. Due to network latency, two concurrent decrements both read the same positive balance before either wrote. Both decrements applied successfully, resulting in a negative balance that violated business rules. The CRDT merge was correct, but the application logic assumed non-negative intermediate states. Lesson: CRDTs do not protect against semantic violations; application-level invariants must be validated post-merge.
JSON CRDT Document Bloat in Long-running Sessions
A collaborative editing app using Automerge accumulated 50MB of operation history per session after 6 months of heavy editing. The document would not sync within reasonable time on mobile devices. Users had to export/import to reset state. The team had to implement history truncation policies with server-side snapshots. Lesson: CRDT documents need lifecycle management (snapshots, GC) for long-running collaborative sessions.
Partition Causing Quantity Disputes
During a flash sale, a shopping cart using Cart-Document CRDT experienced a 2-minute network partition. Three devices for the same user each added the same item concurrently. The PN-Counter portion tracked total quantity correctly, but the LWW timestamp field was ambiguous because all three writes happened at effectively the same time from different devices. Resolution required application-level “highest quantity wins” logic on top of the CRDT merge. Lesson: complex business rules on top of CRDTs may still need conflict resolution logic.
Common Pitfalls / Anti-Patterns
CRDTs are powerful but not a silver bullet.
Memory overhead: CRDTs maintain metadata (timestamps, tags, per-node state). A G-Counter with 1000 nodes needs 1000 counters even if most are zero.
Not all data types have CRDT forms: You cannot build a CRDT for a set with strong consistency guarantees on contains(). Some data just does not have a conflict-free representation.
Semantic limitations: LWW-Register loses data by design. If two concurrent writes happen, one value disappears. CRDTs do not solve the problem of data loss; they just make the loss predictable.
# CRDT does not preserve both values
node_a.set('x', 'hello')
node_b.set('x', 'world')
# After merge: only one survives
# Which one depends on timestamps, not intent
Timestamps are unreliable: Clocks skew across machines. Hybrid Logical Clocks (HLC) help, but timestamps are never perfect.
CRDTs vs Operational Transformation
Both solve collaborative editing, but differently.
| Aspect | CRDT | Operational Transformation |
|---|---|---|
| Coordination | None | Required for some transforms |
| Complexity | Simple data structures | Complex transformation functions |
| Server dependency | Not needed | Often needed for correctness |
| Memory | Grows with history | Can be compact |
For new systems, CRDTs are usually the better choice. They are simpler and do not require a central server.
CRDT Type Comparison
| CRDT Type | Operations | Merge Strategy | Memory Overhead | Use Cases |
|---|---|---|---|---|
| G-Counter | Increment only | Max per node | O(N) counters | Page views, vote counts |
| PN-Counter | Increment, Decrement | Max for pos/neg | O(2N) counters | Shopping cart qty, bank balance |
| LWW-Register | Set | Latest timestamp wins | O(1) | User preferences, last-write-wins |
| OR-Set | Add, Remove | Union of adds, subtracts removes | O(A+R) entries | Shopping carts, collaborative lists |
| RGA | Insert, Delete | Timestamp+ID ordering | O(N) nodes | Text editing, ordered sequences |
Production Libraries
Most languages have CRDT library support:
# Python: pycrdt library
from pycrdt import Counter, LwwMap, ORMap
# Create a shared document
doc = pycrdt.Document()
# Add CRDTs to the document
counter = doc.get_or_create('counter', pycrdt.Int())
counter.increment()
// JavaScript/TypeScript: Yjs library
import * as Y from "yjs";
const doc = new Y.Doc();
const text = doc.getText("editor");
const map = doc.getMap("preferences");
// Automatic conflict resolution
text.insert(0, "Hello");
| Library | Language | Best For |
|---|---|---|
| Yjs | JavaScript/TypeScript | Collaborative text editing |
| Automerge | JavaScript, Rust, Python | General-purpose CRDT documents |
| pycrdt | Python | Python-based systems |
| riak_dt | Erlang | Distributed databases |
| crdt | Ruby | Ruby-based systems |
Quick Recap Checklist
- CRDTs merge concurrent updates without conflicts
- G-Counter: increment-only counting
- PN-Counter: bidirectional counting
- LWW-Register: timestamp-based value selection
- OR-Set: add/remove collections
- CRDTs trade memory and semantics for coordination-free merging
- Not all data types have CRDT representations
- Use for: collaborative editing, distributed caches, offline-first apps
For more on eventual consistency, see Event-Driven Architecture. For conflict resolution patterns, see Consistency Models. For distributed data patterns, see Database Replication.
CRDTs are a powerful tool for building eventually consistent systems without coordination overhead. They are not right for every problem, but when you need multiple nodes to accept writes independently and merge cleanly, CRDTs provide mathematical guarantees that hand-rolled merge logic cannot match.
Interview Questions
Expected answer points:
- CRDT stands for Conflict-Free Replicated Data Type
- Enables strongly consistent eventual consistency without coordination
- Solves the conflict problem when multiple nodes accept writes concurrently and then synchronize
- Mathematically designed to merge concurrent updates without conflicts
- No central node required to sequence writes or resolve conflicts
Expected answer points:
- CmRDT (Commutative Replicated Data Types): Operations themselves commute, meaning the order of operations does not matter
- CvRDT (Convergent Replicated Data Types): States converge when merged, regardless of merge order
- CvRDTs are more commonly used in practice because they are easier to reason about
- CvRDTs work by having state that merges cleanly; CmRDTs require operation logs to replay
Expected answer points:
- G-Counter is a grow-only counter that only increments, never decrements
- Each node maintains its own count mapped by node_id
- Merging takes the maximum value for each node across all replicas
- The final value is the sum of all per-node counters
- Limitation: Cannot decrement, only works for monotonic counters like page views
- Memory overhead is O(N) where N is the number of nodes
Expected answer points:
- PN-Counter extends G-Counter to support both increments and decrements
- Internally maintains two G-Counters: one for positive (increments) and one for negative (decrements)
- Merge operation merges both counters separately using max strategy
- Final value is positive counter minus negative counter
- Use cases: shopping cart quantities, bank balances, any counter that needs to go up and down
Expected answer points:
- LWW-Register holds a single value with an associated timestamp and node_id
- When merging, the write with the latest timestamp wins
- Ties are broken by node_id (higher ID wins)
- Simple to implement but loses data by design when concurrent writes occur
- Does not preserve both values in conflict - one value disappears permanently
Expected answer points:
- OR-Set allows adding and removing elements with unique tags
- Each (tag, value) pair can be added and removed independently
- Maintains two sets: adds (tag -> object mapping) and removes (set of removed tags)
- Merge is union of adds minus any tags in removes
- Concurrent adds of same value with different tags are all preserved
- Concurrent add and remove of same tag results in removal winning
Expected answer points:
- Tombstones are tags stored in the removes set when elements are deleted
- These tombstone entries never get cleaned up in basic OR-Set implementations
- After many add/remove cycles, the removes set grows without bound
- Example: add/remove a user 100 times across 5 nodes = 500 tombstone entries
- Especially painful in memory-constrained environments
- Advanced variants like RGA use garbage collection or mergeable delete flags instead
Expected answer points:
- RGA is a CRDT for ordered sequences like text or lists
- Each element has a timestamp and unique ID
- Insert operations specify a position using the ID of the preceding element
- Concurrent inserts at the same position are ordered by ID to ensure determinism
- Merge combines all nodes from both replicas and re-sorts by timestamp then ID
- Solves the concurrent edit problem in collaborative text editing
Expected answer points:
- CRDTs require no coordination; OT often requires a central server for correctness
- CRDTs use simple data structures; OT uses complex transformation functions
- CRDTs do not need a server; OT systems often depend on one
- CRDT memory grows with history; OT can be more compact
- CRDTs are usually the better choice for new systems due to simplicity
- OT was historically used by Google Docs; CRDTs used by Figma and others
Expected answer points:
- Memory overhead: CRDTs maintain metadata (timestamps, tags, per-node state) - a G-Counter with 1000 nodes needs 1000 counters
- Not all data types have CRDT forms: cannot build a CRDT for a set with strong consistency contains() guarantees
- LWW-Register loses data by design: concurrent writes mean one value disappears
- Timestamps are unreliable: clocks skew across machines; Hybrid Logical Clocks help but are not perfect
- CRDTs do not solve data loss - they make it predictable but still present
Expected answer points:
- 2P-Set is stricter than OR-Set: you can add freely, but once removed, an element cannot be re-added
- Two phases: add phase (elements go to AddSet) and remove phase (elements go to RemoveSet/tombstones)
- Element is present only when in AddSet and not in RemoveSet
- Good fit: versioned configuration where deprecation is permanent
- Use cases: feature flags, configuration keys, any domain where items are deprecated forever
Expected answer points:
- Physical clocks (NTP-synced) can skew or jump, making timestamps unreliable
- HLC combines physical time with logical counters to create causally consistent timestamps
- HLC guarantees that if event A causally precedes event B, then HLC(A) < HLC(B)
- Even when physical clocks diverge, the logical component maintains ordering
- Useful for LWW-Register and other timestamp-based CRDTs to ensure deterministic merges
- Does not eliminate clock issues entirely but provides bounded guarantee
Expected answer points:
- Automerge stores documents as operation trees with timestamp and actor ID per op
- When merging, computes transitive closure of all ops and applies in deterministic order
- Actor ID breaks ties when timestamps are equal
- Uses structural sharing to avoid copying entire documents on every change
- Only the path to a modified field is copied; sibling branches are shared
- Makes CRDT documents efficient even when histories grow long
Expected answer points:
- Models shopping cart as a document with items, each having product_id, quantity, and timestamp
- Supports concurrent adds, removes, and quantity updates from multiple devices
- Merge takes max quantity per product_id so additions from different devices are preserved
- Other fields use timestamp-based last-write-wins
- Handles the case where you add from phone and remove from desktop correctly
Expected answer points:
- Figma: real-time collaboration with CRDTs for each edit that merges without conflicts
- Riak: distributed database using CRDTs for certain data types
- Cassandra: lightweight transactions use CRDT-like conflict resolution
- WhatsApp: message synchronization for offline messages that merge without conflicts
- Amazon shopping cart: reportedly uses CRDTs for replication across nodes
- Yjs: JavaScript CRDT library used by many collaborative apps
Expected answer points:
- Vector clocks track causality by maintaining a vector of counters (one per node)
- Vector clocks detect conflicts but do not resolve them - they just identify concurrent events
- CRDTs are data structures with built-in merge semantics that resolve conflicts automatically
- Use vector clocks when you need to detect causality and make application-specific merge decisions
- Use CRDTs when you want automatic conflict-free merging without application logic
- Some systems combine both: vector clocks for causality tracking, CRDTs for data
Expected answer points:
- State-based CRDTs send the entire state on merge, which grows with all history
- Delta-based CRDTs only send the delta (change) since last synchronization
- Delta mutation (m) produces delta state d such that state + d = new state
- Reduces network bandwidth significantly for large CRDTs with long histories
- Requires maintaining delta logs to compute deltas from full state
- Commonly used in mobile/offline-first applications where bandwidth is constrained
Expected answer points:
- Eventual consistency guarantees that if no new updates are made, all replicas will eventually converge
- CRDTs provide strong eventual consistency (SEC): convergence plus the guarantee that any two valid merges produce the same state
- Traditional eventual consistency requires conflict resolution logic; CRDTs make conflicts mathematically impossible
- CRDTs enable coordination-free updates: any node can accept writes without waiting for others
- The "strong" in SEC means the convergence property holds regardless of merge order
Expected answer points:
- CRDTs handle partitions gracefully: any node can continue accepting writes
- When partitions heal, replicas merge using CRDT semantics - no data is lost (except LWW which may lose)
- Trade-off: divergent histories accumulate during partition, creating more merge work
- Memory grows faster during partitions because more concurrent operations accumulate
- No need for CAP theorem trade-off decisions - CRDTs are always available and eventually consistent
- However, CRDTs do not guarantee strong consistency (CP) - reads may return stale data
Expected answer points:
- Tombstones (removal markers) accumulate indefinitely in basic OR-Set and 2P-Set
- Hybrids approach: periodically compact tombstones using a generation/epoch number
- G-Counting: maintain per-element add and remove counts, GC when adds = removes for an element
- Message-stability based: GC tombstones once all known replicas have acknowledged the operation
- Sliding-window: only keep tombstones within a bounded window of recent operations
- Absence-based GC: if an element is present in all replicas and no pending operations exist, remove tombstones
Further Reading
Academic Papers
- Shapiro et al. (2011) - “A Comprehensive Study of Convergent and Commutative Replicated Data Types” - The foundational paper defining CvRDTs and CmRDTs
- Bieniusa et al. (2012) - “Brief Announcement: The Case for Fast Update Propagation in CRDTs” - Analyzes merge performance characteristics
- Bailis et al. (2014) - “Coordination Avoidance in Database Systems” - Relates CRDT concepts to database consistency
Books
- “Understanding Distributed Systems” by Roberto Vitillo - Chapter on consistency models covers CRDTs
- “Designing Data-Intensive Applications” by Martin Kleppmann - Chapter on distributed data includes CRDT comparison
Libraries and Implementations
- Yjs (JavaScript/TypeScript) - https://github.com/yjs/yjs
- Automerge (Rust, JavaScript, Python) - https://github.com/automerge/automerge
- pycrdt (Python) - https://github.com/python-pycrdt/pycrdt
- riak_dt (Erlang) - https://github.com/basho/riak_dt
Specification
- CRDT Standards Track - IETF draft on CRDT specifications for interoperability
Related Topics
- Hybrid Logical Clocks (HLC) - for more reliable timestamp ordering
- Vector Clocks - for causality tracking without automatic merge
- Operational Transformation (OT) - alternative approach used by Google Docs
- Event Sourcing - patterns for CRDT state management
Conclusion
CRDTs represent a powerful mathematical framework for building distributed systems where eventual consistency is acceptable. They eliminate the need for coordination by encoding merge semantics directly into the data structure itself, making them ideal for collaborative applications, offline-first systems, and distributed databases that must remain available during network partitions.
The choice of CRDT type depends on your operation semantics: use G-Counters for simple increment-only counters, PN-Counters when you need decrement capability, LWW-Registers for last-write-wins semantics, and OR-Sets or RGA for collections. For complex nested structures, Cart-Document CRDTs or JSON CRDTs provide the most flexibility at the cost of higher storage overhead.
In production, CRDTs power applications like conflict-free collaborative editing (Figma, Notion), distributed databases (Riak, AntidoteDB), messaging systems (WhatsApp, Discord), and distributed ledgers. Understanding the mathematical properties of your chosen CRDT—and its limitations—is essential for building systems that behave correctly under all network conditions.
Category
Related Posts
Asynchronous Replication: Speed and Availability at Scale
Learn how asynchronous replication works in distributed databases, including eventual consistency implications, lag monitoring, and practical use cases where speed outweighs strict consistency.
Multi-Paxos
Multi-Paxos extends basic Paxos to achieve consensus on sequences of values, enabling practical replicated state machines for distributed systems and databases.
Paxos Consensus Algorithm
Paxos is a family of consensus algorithms for achieving agreement in distributed systems despite failures, pioneered by Leslie Lamport.