CRDTs: Conflict-Free Replicated Data Types

Learn how CRDTs enable strongly consistent eventual consistency in distributed databases. Explore G-Counters, PN-Counters, LWW-Registers, OR-Sets, and practical applications.

published: reading time: 26 min read author: GeekWorkBench

Introduction

Traditional conflict resolution requires either:

  1. Coordination: A central node sequences all writes. This is a bottleneck and a single point of failure.
  2. Application-level resolution: The application developer writes complex merge logic. This is error-prone.
  3. Last-write-wins: The timestamp decides. This loses data and produces unexpected results.

CRDTs offer a fourth option: the data structure itself guarantees conflict-free merging.

graph TD
    subgraph Without CRDT
        N1[Node A] --> W1[Write: x=1]
        N2[Node B] --> W2[Write: x=2]
        M1[Merge] -->|conflict| X[???]
    end

    subgraph With CRDT
        N3[Node A] --> W3[Write: x=1]
        N4[Node B] --> W4[Write: x=2]
        M2[Merge] --> R[Both values preserved]
    end

Core Concepts

A CRDT is a data type where any two valid states can be merged and the result is also a valid state. This property is called commutativity: the order of merging does not matter.

Mathematically, CRDTs come in two flavors:

  • CmRDT (Commutative Replicated Data Types): Operations commute
  • CvRDT (Convergent Replicated Data Types): States converge when merged

In practice, most implementations use CvRDTs because they are easier to reason about. The state itself is designed to merge cleanly.

G-Counter: Grow-Only Counter

The simplest CRDT. It only increments. No decrements. This works for things like page view counts where you never need to subtract.

class GCounter:
    """
    Grow-only counter. Each node maintains its own count.
    Merging takes the maximum for each node.
    """
    def __init__(self):
        self.counts = {}  # node_id -> count

    def increment(self, node_id):
        self.counts[node_id] = self.counts.get(node_id, 0) + 1

    def merge(self, other: 'GCounter'):
        for node_id, count in other.counts.items():
            self.counts[node_id] = max(
                self.counts.get(node_id, 0),
                count
            )

    def value(self) -> int:
        return sum(self.counts.values())
graph TD
    N1[Node A: 5] --> M[Merge]
    N2[Node B: 3] --> M
    M --> R[Total: 8]

PN-Counter: Positive-Negative Counter

A PN-Counter extends G-Counter to support both increments and decrements. It maintains two G-Counters: one for increments, one for decrements.

class PNCounter:
    """
    Positive-Negative counter. Supports both increments and decrements.
    Uses two G-Counters internally.
    """
    def __init__(self):
        self.positive = GCounter()  # increments
        self.negative = GCounter()  # decrements

    def increment(self, node_id, amount=1):
        for _ in range(amount):
            self.positive.increment(node_id)

    def decrement(self, node_id, amount=1):
        for _ in range(amount):
            self.negative.increment(node_id)

    def merge(self, other: 'PNCounter'):
        self.positive.merge(other.positive)
        self.negative.merge(other.negative)

    def value(self) -> int:
        return self.positive.value() - self.negative.value()

A use case: a shopping cart where you add and remove items. The PN-Counter tracks the cart size even when concurrent updates happen.

LWW-Register: Last-Write-Wins Register

A register holds a single value. When merging, the write with the latest timestamp wins.

import time
from dataclasses import dataclass

@dataclass
class LWWRegister:
    value: any
    timestamp: float
    node_id: str

class LWWMap:
    """
    Last-Write-Wins map. Each key has a (value, timestamp, node_id) tuple.
    On merge, the entry with highest timestamp wins.
    Ties broken by node_id.
    """
    def __init__(self):
        self.entries = {}  # key -> LWWRegister

    def set(self, key, value, timestamp=None, node_id=None):
        self.entries[key] = LWWRegister(
            value=value,
            timestamp=timestamp or time.time(),
            node_id=node_id
        )

    def get(self, key):
        return self.entries.get(key)

    def merge(self, other: 'LWWMap'):
        for key, reg in other.entries.items():
            if key not in self.entries:
                self.entries[key] = reg
            else:
                # Compare timestamps, then node_id for ties
                ours = self.entries[key]
                if (reg.timestamp > ours.timestamp or
                    (reg.timestamp == ours.timestamp and reg.node_id > ours.node_id)):
                    self.entries[key] = reg
sequenceDiagram
    participant N1 as Node A
    participant N2 as Node B
    participant M as Merge

    N1->>N1: set(x=1, ts=100)
    N2->>N2: set(x=2, ts=101)
    N1->>M: state: x=(1, 100, A)
    N2->>M: state: x=(2, 101, B)
    M->>M: ts=101 > ts=100, keep B
    Note over M: Result: x=2

OR-Set: Observed-Remove Set

An OR-Set lets you add and remove elements. Concurrent adds and removes are handled correctly.

class ORObject:
    """
    Observed-Remove Set entry.
    Each (tag, value) pair can be added and removed.
    """
    def __init__(self, value, tag):
        self.value = value
        self.tag = tag

class ORSet:
    """
    Observed-Remove Set. Elements have unique tags.
    - add(tag, value): Adds element with unique tag
    - remove(tag): Marks element as removed
    - merge(): Union of adds, subtracts removes
    """
    def __init__(self):
        self.adds = {}  # tag -> ORObject
        self.removes = set()  # tags that were removed

    def add(self, value, tag=None):
        tag = tag or str(uuid.uuid4())
        self.adds[tag] = ORObject(value, tag)
        return tag

    def remove(self, tag):
        self.removes.add(tag)

    def contains(self, tag):
        return tag in self.adds and tag not in self.removes

    def merge(self, other: 'ORSet'):
        # Union of adds
        for tag, obj in other.adds.items():
            if tag not in self.removes:
                self.adds[tag] = obj

        # Union of removes
        self.removes.update(other.removes)

    def elements(self):
        return [obj.value for tag, obj in self.adds.items()
                if tag not in self.removes]

OR-Sets are good for collaborative lists, shopping carts, and anywhere items are added and removed concurrently.

Tombstone accumulation is a practical issue with basic OR-Set implementations. When you remove an element, the tag lands in the removes set but never gets cleaned up. Run enough add/remove cycles and the removes set grows without bound. Picture this: add and remove a user 100 times across 5 nodes, and you now have 500 tombstone entries even though the set looks empty to anyone reading it. This is especially painful in memory-constrained environments. RGA and other advanced variants address this differently, using garbage collection or mergeable delete flags instead of permanent tombstones.

RGA: Replicated Growable Array

The RGA is a CRDT for ordered sequences. It handles concurrent inserts at the same position correctly.

class RGANode:
    def __init__(self, value, timestamp, id=None):
        self.value = value
        self.timestamp = timestamp
        self.id = id or str(uuid.uuid4())

class RGA:
    """
    Replicated Growable Array. Uses timestamps and IDs
    for ordering. Concurrent inserts at the same position
    are ordered by ID to ensure determinism.
    """
    def __init__(self):
        self.nodes = []  # Ordered list of RGANodes

    def insert_after(self, after_id, value):
        node = RGANode(value, time.time())
        # Insert at position after after_id
        pos = self._find_position(after_id)
        self.nodes.insert(pos + 1, node)

    def _find_position(self, after_id):
        for i, n in enumerate(self.nodes):
            if n.id == after_id:
                return i
        return -1  # Insert at beginning if not found

    def merge(self, other: 'RGA'):
        # Merge by timestamp+ID ordering
        all_nodes = {n.id: n for n in self.nodes}
        all_nodes.update({n.id: n for n in other.nodes})

        # Re-sort by timestamp, then by ID
        self.nodes = sorted(
            all_nodes.values(),
            key=lambda n: (n.timestamp, n.id)
        )

2P-Set: Add-Only-Then-Remove-Once

A Two-Phase Set (2P-Set) is stricter than an OR-Set: you can add freely, but once something is removed it cannot be re-added. Think feature flags, configuration keys, or any domain where deprecation is permanent.

The two phases are straightforward. During the add phase, elements land in AddSet. During the remove phase, they move to RemoveSet (a tombstone). Merge unions both sets. An element is present only when it is in AddSet and not in RemoveSet.

class TwoPhaseSet:
    """2P-Set: adds go to AddSet, removes go to RemoveSet. Once removed, never returns."""
    def __init__(self):
        self.add_set = set()      # Elements that have been added
        self.remove_set = set()  # Elements that have been removed (tombstones)

    def add(self, value):
        if value not in self.remove_set:
            self.add_set.add(value)

    def remove(self, value):
        self.remove_set.add(value)

    def contains(self, value):
        return value in self.add_set and value not in self.remove_set

    def merge(self, other: 'TwoPhaseSet'):
        self.add_set = self.add_set.union(other.add_set)
        self.remove_set = self.remove_set.union(other.remove_set)

Good fit: versioned config where you deprecate keys and never re-enable them. Roll out a feature flag, then kill it permanently. A 2P-Set handles this correctly even if someone tries to add the flag again concurrently.

Cart-Document CRDT: Shopping Cart Without Conflicts

Shopping carts have to deal with concurrent adds, removes, and quantity updates from multiple devices hitting the same cart. A Cart-Document CRDT models this as a document of items, each with a product ID, quantity, and timestamp.

from dataclasses import dataclass
import time

@dataclass
class CartItem:
    product_id: str
    quantity: int
    timestamp: float

class CartDocument:
    """Shopping cart as a CRDT document. Concurrent adds/updates/remove all merge cleanly."""
    def __init__(self):
        self.items = {}  # product_id -> CartItem

    def add_item(self, product_id: str, quantity: int, timestamp: float = None):
        timestamp = timestamp or time.time()
        if product_id not in self.items:
            self.items[product_id] = CartItem(product_id, quantity, timestamp)
        elif timestamp > self.items[product_id].timestamp:
            self.items[product_id] = CartItem(product_id, quantity, timestamp)

    def remove_item(self, product_id: str, timestamp: float = None):
        timestamp = timestamp or time.time()
        if product_id in self.items:
            if timestamp > self.items[product_id].timestamp:
                self.items[product_id] = CartItem(product_id, 0, timestamp)

    def update_quantity(self, product_id: str, quantity: int, timestamp: float = None):
        timestamp = timestamp or time.time()
        if product_id in self.items and timestamp > self.items[product_id].timestamp:
            self.items[product_id] = CartItem(product_id, quantity, timestamp)
        elif product_id not in self.items:
            self.items[product_id] = CartItem(product_id, quantity, timestamp)

    def merge(self, other: 'CartDocument'):
        for product_id, other_item in other.items.items():
            if product_id not in self.items:
                self.items[product_id] = other_item
            else:
                our_item = self.items[product_id]
                # Take max quantity per product so neither addition gets lost
                merged_quantity = max(our_item.quantity, other_item.quantity)
                # Latest timestamp wins for everything else
                if other_item.timestamp > our_item.timestamp:
                    self.items[product_id] = CartItem(
                        product_id,
                        merged_quantity,
                        other_item.timestamp
                    )
                else:
                    self.items[product_id] = CartItem(
                        product_id,
                        merged_quantity,
                        our_item.timestamp
                    )

    def get_items(self):
        return [item for item in self.items.values() if item.quantity > 0]

The merge takes the max quantity per product_id so you never lose an item that was added on another device. Everything else uses timestamp-based last-write-wins.

JSON CRDT: Nested Structures Without Conflicts

Apps need to sync nested JSON-like structures: maps with string keys, arrays with ordered elements, scalars. Automerge and Yjs handle this by treating every operation (set, insert, delete) as a timestamped, actor-tagged unit that merges deterministically.

Automerge stores documents as operation trees. Each op carries a timestamp and actor ID. When merging, it computes the transitive closure of all ops and applies them in deterministic order (actor ID breaks ties).

Yjs takes a similar path, giving every array element and map key a unique ID. Concurrent changes resolve like this:

  • Scalar values (strings, numbers, booleans): latest timestamp wins
  • Map keys: latest timestamp wins per key
  • Array elements: inserts get unique IDs; concurrent inserts at the same spot ordered by ID
  • Nested objects: recursive merge, where each sub-object is itself a CRDT
class JSONScalar:
    """A scalar value with LWW semantics."""
    def __init__(self, value, timestamp, actor):
        self.value = value
        self.timestamp = timestamp
        self.actor = actor

    def merge(self, other):
        if other.timestamp > self.timestamp:
            return other
        elif other.timestamp == self.timestamp and other.actor > self.actor:
            return other
        return self

class JSONMap:
    """A JSON object/map with LWW semantics per key."""
    def __init__(self):
        self.fields = {}  # key -> JSONValue

    def set(self, key, value, timestamp, actor):
        self.fields[key] = JSONScalar(value, timestamp, actor)

    def get(self, key):
        return self.fields.get(key)

    def merge(self, other: 'JSONMap'):
        for key, other_val in other.fields.items():
            if key not in self.fields:
                self.fields[key] = other_val
            else:
                self.fields[key] = self.fields[key].merge(other_val)

class JSONArray:
    """A JSON array with CRDT semantics for concurrent inserts."""
    def __init__(self):
        self.elements = []  # List of (id, JSONValue)

    def insert(self, id, value, timestamp, actor):
        self.elements.append((id, JSONScalar(value, timestamp, actor)))

    def merge(self, other: 'JSONArray'):
        # Union of elements, sorted by (timestamp, actor) for determinism
        other_map = {e[0]: e[1] for e in other.elements}
        self_map = {e[0]: e[1] for e in self.elements}
        self_map.update(other_map)
        self.elements = sorted(
            [(id, val) for id, val in self_map.items()],
            key=lambda x: (x[1].timestamp, x[1].actor)
        )

Both Automerge and Yjs use structural sharing to avoid copying entire documents on every change. When you modify a deeply nested field, only the path to that field is copied; sibling branches are shared with the previous version. This makes CRDT documents efficient even when histories grow long.

Where CRDTs Are Used

Collaborative Editing

Figma uses CRDTs for real-time collaboration. Each edit is an operation that can be merged without conflicts. Multiple users can edit the same document simultaneously.

Distributed Databases

Riak uses CRDTs for certain data types. Cassandra’s lightweight transactions use a form of CRDT-like conflict resolution.

Messaging Systems

WhatsApp uses CRDTs for message synchronization. When you send a message offline, it is stored locally. When connectivity returns, it is merged with the server state without conflicts.

Shopping Carts

Amazon’s shopping cart reportedly uses CRDTs. Your cart is replicated across nodes. Concurrent modifications (add from phone, remove from desktop) merge correctly.

Implementing CRDTs in Practice

Most languages have CRDT libraries. For Python:

# Using the crdt library
from crdt import GCounter, PN_Counter, LWW_Register, OR_Map

# Create counters on different nodes
node_a_counter = PN_Counter()
node_b_counter = PN_Counter()

# Each node increments independently
node_a_counter.increment('node_a')
node_b_counter.increment('node_b')

# Merge when nodes communicate
node_a_counter.merge(node_b_counter)

# Result is the sum of all increments
print(node_a_counter.value)

Production Failure Scenarios

Data Integrity Issues

Tombstone Explosion in Production

A team deployed an OR-Set-based feature flag system at scale. Each A/B test toggle created new tags on add and accumulated tombstones on remove. After two years of continuous experimentation with thousands of flags being added and removed daily, the tombstone set grew to 8GB per replica. Memory-constrained edge nodes began crashing. The fix required implementing a generation-based garbage collection scheme with sliding-window compaction. Lesson: OR-Set tombstones must be bounded before production deployment at scale.

Clock Skew Causing LWW Data Loss

A distributed cache used LWW-Register for user session tokens. When a datacenter’s NTP server misbehaved for 30 minutes, clocks skewed forward by 5 minutes. Sessions updated during that window had inflated timestamps. After the skew was corrected, merges preferred the pre-skew timestamps, causing tokens to revert to old states. Users were logged out unexpectedly. The team switched to Hybrid Logical Clocks (HLC) and added timestamp sanity checks. Lesson: never trust wall-clock timestamps alone in distributed systems.

Merge-order Sensitivity with Malformed Input

An implementation of RGA allowed inserting after a non-existent ID (inserting at head without explicit marker). Two nodes independently inserted at “head” position concurrently, creating ordering ambiguity. When merged, the elements appeared in different orders on different replicas, causing UI divergence. The bug was in the positioning logic, not the CRDT itself. Lesson: CRDT correctness depends on correct implementation of the positioning/ordering logic.

Semantic and Application-Level Violations

Concurrent Decrement on PN-Counter Causing Negative Values

A banking application used PN-Counter for balance tracking. Due to network latency, two concurrent decrements both read the same positive balance before either wrote. Both decrements applied successfully, resulting in a negative balance that violated business rules. The CRDT merge was correct, but the application logic assumed non-negative intermediate states. Lesson: CRDTs do not protect against semantic violations; application-level invariants must be validated post-merge.

JSON CRDT Document Bloat in Long-running Sessions

A collaborative editing app using Automerge accumulated 50MB of operation history per session after 6 months of heavy editing. The document would not sync within reasonable time on mobile devices. Users had to export/import to reset state. The team had to implement history truncation policies with server-side snapshots. Lesson: CRDT documents need lifecycle management (snapshots, GC) for long-running collaborative sessions.

Partition Causing Quantity Disputes

During a flash sale, a shopping cart using Cart-Document CRDT experienced a 2-minute network partition. Three devices for the same user each added the same item concurrently. The PN-Counter portion tracked total quantity correctly, but the LWW timestamp field was ambiguous because all three writes happened at effectively the same time from different devices. Resolution required application-level “highest quantity wins” logic on top of the CRDT merge. Lesson: complex business rules on top of CRDTs may still need conflict resolution logic.

Common Pitfalls / Anti-Patterns

CRDTs are powerful but not a silver bullet.

Memory overhead: CRDTs maintain metadata (timestamps, tags, per-node state). A G-Counter with 1000 nodes needs 1000 counters even if most are zero.

Not all data types have CRDT forms: You cannot build a CRDT for a set with strong consistency guarantees on contains(). Some data just does not have a conflict-free representation.

Semantic limitations: LWW-Register loses data by design. If two concurrent writes happen, one value disappears. CRDTs do not solve the problem of data loss; they just make the loss predictable.

# CRDT does not preserve both values
node_a.set('x', 'hello')
node_b.set('x', 'world')
# After merge: only one survives
# Which one depends on timestamps, not intent

Timestamps are unreliable: Clocks skew across machines. Hybrid Logical Clocks (HLC) help, but timestamps are never perfect.

CRDTs vs Operational Transformation

Both solve collaborative editing, but differently.

AspectCRDTOperational Transformation
CoordinationNoneRequired for some transforms
ComplexitySimple data structuresComplex transformation functions
Server dependencyNot neededOften needed for correctness
MemoryGrows with historyCan be compact

For new systems, CRDTs are usually the better choice. They are simpler and do not require a central server.

CRDT Type Comparison

CRDT TypeOperationsMerge StrategyMemory OverheadUse Cases
G-CounterIncrement onlyMax per nodeO(N) countersPage views, vote counts
PN-CounterIncrement, DecrementMax for pos/negO(2N) countersShopping cart qty, bank balance
LWW-RegisterSetLatest timestamp winsO(1)User preferences, last-write-wins
OR-SetAdd, RemoveUnion of adds, subtracts removesO(A+R) entriesShopping carts, collaborative lists
RGAInsert, DeleteTimestamp+ID orderingO(N) nodesText editing, ordered sequences

Production Libraries

Most languages have CRDT library support:

# Python: pycrdt library
from pycrdt import Counter, LwwMap, ORMap

# Create a shared document
doc = pycrdt.Document()

# Add CRDTs to the document
counter = doc.get_or_create('counter', pycrdt.Int())
counter.increment()
// JavaScript/TypeScript: Yjs library
import * as Y from "yjs";

const doc = new Y.Doc();
const text = doc.getText("editor");
const map = doc.getMap("preferences");

// Automatic conflict resolution
text.insert(0, "Hello");
LibraryLanguageBest For
YjsJavaScript/TypeScriptCollaborative text editing
AutomergeJavaScript, Rust, PythonGeneral-purpose CRDT documents
pycrdtPythonPython-based systems
riak_dtErlangDistributed databases
crdtRubyRuby-based systems

Quick Recap Checklist

  • CRDTs merge concurrent updates without conflicts
  • G-Counter: increment-only counting
  • PN-Counter: bidirectional counting
  • LWW-Register: timestamp-based value selection
  • OR-Set: add/remove collections
  • CRDTs trade memory and semantics for coordination-free merging
  • Not all data types have CRDT representations
  • Use for: collaborative editing, distributed caches, offline-first apps

For more on eventual consistency, see Event-Driven Architecture. For conflict resolution patterns, see Consistency Models. For distributed data patterns, see Database Replication.

CRDTs are a powerful tool for building eventually consistent systems without coordination overhead. They are not right for every problem, but when you need multiple nodes to accept writes independently and merge cleanly, CRDTs provide mathematical guarantees that hand-rolled merge logic cannot match.

Interview Questions

1. What is a CRDT and what problem does it solve in distributed systems?

Expected answer points:

  • CRDT stands for Conflict-Free Replicated Data Type
  • Enables strongly consistent eventual consistency without coordination
  • Solves the conflict problem when multiple nodes accept writes concurrently and then synchronize
  • Mathematically designed to merge concurrent updates without conflicts
  • No central node required to sequence writes or resolve conflicts
2. Explain the difference between CmRDT and CvRDT.

Expected answer points:

  • CmRDT (Commutative Replicated Data Types): Operations themselves commute, meaning the order of operations does not matter
  • CvRDT (Convergent Replicated Data Types): States converge when merged, regardless of merge order
  • CvRDTs are more commonly used in practice because they are easier to reason about
  • CvRDTs work by having state that merges cleanly; CmRDTs require operation logs to replay
3. How does a G-Counter work and what are its limitations?

Expected answer points:

  • G-Counter is a grow-only counter that only increments, never decrements
  • Each node maintains its own count mapped by node_id
  • Merging takes the maximum value for each node across all replicas
  • The final value is the sum of all per-node counters
  • Limitation: Cannot decrement, only works for monotonic counters like page views
  • Memory overhead is O(N) where N is the number of nodes
4. What is a PN-Counter and how does it achieve bidirectional counting?

Expected answer points:

  • PN-Counter extends G-Counter to support both increments and decrements
  • Internally maintains two G-Counters: one for positive (increments) and one for negative (decrements)
  • Merge operation merges both counters separately using max strategy
  • Final value is positive counter minus negative counter
  • Use cases: shopping cart quantities, bank balances, any counter that needs to go up and down
5. How does Last-Write-Wins (LWW) Register handle concurrent updates?

Expected answer points:

  • LWW-Register holds a single value with an associated timestamp and node_id
  • When merging, the write with the latest timestamp wins
  • Ties are broken by node_id (higher ID wins)
  • Simple to implement but loses data by design when concurrent writes occur
  • Does not preserve both values in conflict - one value disappears permanently
6. Explain the OR-Set (Observed-Remove Set) and how it handles concurrent adds and removes.

Expected answer points:

  • OR-Set allows adding and removing elements with unique tags
  • Each (tag, value) pair can be added and removed independently
  • Maintains two sets: adds (tag -> object mapping) and removes (set of removed tags)
  • Merge is union of adds minus any tags in removes
  • Concurrent adds of same value with different tags are all preserved
  • Concurrent add and remove of same tag results in removal winning
7. What is tombstone accumulation in OR-Set and why is it a problem?

Expected answer points:

  • Tombstones are tags stored in the removes set when elements are deleted
  • These tombstone entries never get cleaned up in basic OR-Set implementations
  • After many add/remove cycles, the removes set grows without bound
  • Example: add/remove a user 100 times across 5 nodes = 500 tombstone entries
  • Especially painful in memory-constrained environments
  • Advanced variants like RGA use garbage collection or mergeable delete flags instead
8. How does RGA (Replicated Growable Array) handle concurrent inserts at the same position?

Expected answer points:

  • RGA is a CRDT for ordered sequences like text or lists
  • Each element has a timestamp and unique ID
  • Insert operations specify a position using the ID of the preceding element
  • Concurrent inserts at the same position are ordered by ID to ensure determinism
  • Merge combines all nodes from both replicas and re-sorts by timestamp then ID
  • Solves the concurrent edit problem in collaborative text editing
9. What are the trade-offs between CRDTs and Operational Transformation (OT)?

Expected answer points:

  • CRDTs require no coordination; OT often requires a central server for correctness
  • CRDTs use simple data structures; OT uses complex transformation functions
  • CRDTs do not need a server; OT systems often depend on one
  • CRDT memory grows with history; OT can be more compact
  • CRDTs are usually the better choice for new systems due to simplicity
  • OT was historically used by Google Docs; CRDTs used by Figma and others
10. What are the fundamental limitations of CRDTs?

Expected answer points:

  • Memory overhead: CRDTs maintain metadata (timestamps, tags, per-node state) - a G-Counter with 1000 nodes needs 1000 counters
  • Not all data types have CRDT forms: cannot build a CRDT for a set with strong consistency contains() guarantees
  • LWW-Register loses data by design: concurrent writes mean one value disappears
  • Timestamps are unreliable: clocks skew across machines; Hybrid Logical Clocks help but are not perfect
  • CRDTs do not solve data loss - they make it predictable but still present
11. What is the 2P-Set (Two-Phase Set) and when would you use it?

Expected answer points:

  • 2P-Set is stricter than OR-Set: you can add freely, but once removed, an element cannot be re-added
  • Two phases: add phase (elements go to AddSet) and remove phase (elements go to RemoveSet/tombstones)
  • Element is present only when in AddSet and not in RemoveSet
  • Good fit: versioned configuration where deprecation is permanent
  • Use cases: feature flags, configuration keys, any domain where items are deprecated forever
12. How do Hybrid Logical Clocks (HLC) improve timestamp reliability in CRDTs?

Expected answer points:

  • Physical clocks (NTP-synced) can skew or jump, making timestamps unreliable
  • HLC combines physical time with logical counters to create causally consistent timestamps
  • HLC guarantees that if event A causally precedes event B, then HLC(A) < HLC(B)
  • Even when physical clocks diverge, the logical component maintains ordering
  • Useful for LWW-Register and other timestamp-based CRDTs to ensure deterministic merges
  • Does not eliminate clock issues entirely but provides bounded guarantee
13. How does Automerge represent and merge nested JSON structures?

Expected answer points:

  • Automerge stores documents as operation trees with timestamp and actor ID per op
  • When merging, computes transitive closure of all ops and applies in deterministic order
  • Actor ID breaks ties when timestamps are equal
  • Uses structural sharing to avoid copying entire documents on every change
  • Only the path to a modified field is copied; sibling branches are shared
  • Makes CRDT documents efficient even when histories grow long
14. What is the Cart-Document CRDT pattern and how does its merge work?

Expected answer points:

  • Models shopping cart as a document with items, each having product_id, quantity, and timestamp
  • Supports concurrent adds, removes, and quantity updates from multiple devices
  • Merge takes max quantity per product_id so additions from different devices are preserved
  • Other fields use timestamp-based last-write-wins
  • Handles the case where you add from phone and remove from desktop correctly
15. Which production systems use CRDTs and for what purposes?

Expected answer points:

  • Figma: real-time collaboration with CRDTs for each edit that merges without conflicts
  • Riak: distributed database using CRDTs for certain data types
  • Cassandra: lightweight transactions use CRDT-like conflict resolution
  • WhatsApp: message synchronization for offline messages that merge without conflicts
  • Amazon shopping cart: reportedly uses CRDTs for replication across nodes
  • Yjs: JavaScript CRDT library used by many collaborative apps
16. What is vector clock vs CRDT and when would you use each?

Expected answer points:

  • Vector clocks track causality by maintaining a vector of counters (one per node)
  • Vector clocks detect conflicts but do not resolve them - they just identify concurrent events
  • CRDTs are data structures with built-in merge semantics that resolve conflicts automatically
  • Use vector clocks when you need to detect causality and make application-specific merge decisions
  • Use CRDTs when you want automatic conflict-free merging without application logic
  • Some systems combine both: vector clocks for causality tracking, CRDTs for data
17. Explain delta-based CRDTs and their advantage over state-based CRDTs.

Expected answer points:

  • State-based CRDTs send the entire state on merge, which grows with all history
  • Delta-based CRDTs only send the delta (change) since last synchronization
  • Delta mutation (m) produces delta state d such that state + d = new state
  • Reduces network bandwidth significantly for large CRDTs with long histories
  • Requires maintaining delta logs to compute deltas from full state
  • Commonly used in mobile/offline-first applications where bandwidth is constrained
18. What is the relationship between eventual consistency and CRDTs?

Expected answer points:

  • Eventual consistency guarantees that if no new updates are made, all replicas will eventually converge
  • CRDTs provide strong eventual consistency (SEC): convergence plus the guarantee that any two valid merges produce the same state
  • Traditional eventual consistency requires conflict resolution logic; CRDTs make conflicts mathematically impossible
  • CRDTs enable coordination-free updates: any node can accept writes without waiting for others
  • The "strong" in SEC means the convergence property holds regardless of merge order
19. How do CRDTs handle network partitions and what are the trade-offs?

Expected answer points:

  • CRDTs handle partitions gracefully: any node can continue accepting writes
  • When partitions heal, replicas merge using CRDT semantics - no data is lost (except LWW which may lose)
  • Trade-off: divergent histories accumulate during partition, creating more merge work
  • Memory grows faster during partitions because more concurrent operations accumulate
  • No need for CAP theorem trade-off decisions - CRDTs are always available and eventually consistent
  • However, CRDTs do not guarantee strong consistency (CP) - reads may return stale data
20. What are garbage collection strategies for CRDT tombstones?

Expected answer points:

  • Tombstones (removal markers) accumulate indefinitely in basic OR-Set and 2P-Set
  • Hybrids approach: periodically compact tombstones using a generation/epoch number
  • G-Counting: maintain per-element add and remove counts, GC when adds = removes for an element
  • Message-stability based: GC tombstones once all known replicas have acknowledged the operation
  • Sliding-window: only keep tombstones within a bounded window of recent operations
  • Absence-based GC: if an element is present in all replicas and no pending operations exist, remove tombstones

Further Reading

Academic Papers

  • Shapiro et al. (2011) - “A Comprehensive Study of Convergent and Commutative Replicated Data Types” - The foundational paper defining CvRDTs and CmRDTs
  • Bieniusa et al. (2012) - “Brief Announcement: The Case for Fast Update Propagation in CRDTs” - Analyzes merge performance characteristics
  • Bailis et al. (2014) - “Coordination Avoidance in Database Systems” - Relates CRDT concepts to database consistency

Books

  • “Understanding Distributed Systems” by Roberto Vitillo - Chapter on consistency models covers CRDTs
  • “Designing Data-Intensive Applications” by Martin Kleppmann - Chapter on distributed data includes CRDT comparison

Libraries and Implementations

Specification

  • CRDT Standards Track - IETF draft on CRDT specifications for interoperability
  • Hybrid Logical Clocks (HLC) - for more reliable timestamp ordering
  • Vector Clocks - for causality tracking without automatic merge
  • Operational Transformation (OT) - alternative approach used by Google Docs
  • Event Sourcing - patterns for CRDT state management

Conclusion

CRDTs represent a powerful mathematical framework for building distributed systems where eventual consistency is acceptable. They eliminate the need for coordination by encoding merge semantics directly into the data structure itself, making them ideal for collaborative applications, offline-first systems, and distributed databases that must remain available during network partitions.

The choice of CRDT type depends on your operation semantics: use G-Counters for simple increment-only counters, PN-Counters when you need decrement capability, LWW-Registers for last-write-wins semantics, and OR-Sets or RGA for collections. For complex nested structures, Cart-Document CRDTs or JSON CRDTs provide the most flexibility at the cost of higher storage overhead.

In production, CRDTs power applications like conflict-free collaborative editing (Figma, Notion), distributed databases (Riak, AntidoteDB), messaging systems (WhatsApp, Discord), and distributed ledgers. Understanding the mathematical properties of your chosen CRDT—and its limitations—is essential for building systems that behave correctly under all network conditions.

Category

Related Posts

Asynchronous Replication: Speed and Availability at Scale

Learn how asynchronous replication works in distributed databases, including eventual consistency implications, lag monitoring, and practical use cases where speed outweighs strict consistency.

#distributed-systems #replication #eventual-consistency

Multi-Paxos

Multi-Paxos extends basic Paxos to achieve consensus on sequences of values, enabling practical replicated state machines for distributed systems and databases.

#distributed-systems #consensus #paxos

Paxos Consensus Algorithm

Paxos is a family of consensus algorithms for achieving agreement in distributed systems despite failures, pioneered by Leslie Lamport.

#distributed-systems #consensus #paxos