Synchronous Replication: Data Consistency Across Nodes
Learn about synchronous replication patterns in distributed databases, including Quorum-based replication and write-ahead log shipping for zero RPO.
Introduction
Synchronous replication is a replication strategy where the primary (leader) node waits for confirmation that writes have been successfully applied to a quorum of replicas before acknowledging the write to the client. This ensures that data is durable across multiple nodes before the write is considered complete, making it ideal for workloads where zero data loss is non-negotiable.
Unlike asynchronous replication, which returns immediately after the local write, synchronous replication introduces latency proportional to the round-trip time between the primary and its replicas. In exchange, it provides strong consistency guarantees and eliminates scenarios where a successful write is silently lost due to a node failure.
This document covers how synchronous replication works under the hood, the different protocols and configurations available (including quorum-based replication and semi-synchronous replication), and the practical trade-offs you need to understand before deploying it in production.
Core Concepts
Synchronous replication blocks the commit operation until replicas confirm they have written the data. The primary sends the transaction to replicas, waits for acknowledgment, then returns success to the client.
sequenceDiagram
participant Client
participant Primary
participant Replica1
participant Replica2
Client->>Primary: BEGIN; UPDATE...; COMMIT
Primary->>Replica1: Send WAL entry
Primary->>Replica2: Send WAL entry
Replica1-->>Primary: ACK received
Replica2-->>Primary: ACK received
Primary-->>Client: COMMIT confirmed
The protocol varies. Some systems use write-ahead log (WAL) shipping. PostgreSQL uses WAL; MySQL Group Replication uses certification.
The Cost: Latency
Every synchronous write waits for a network round trip to at least one replica. In a single-datacenter setup, this latency might be 1-5ms. Cross-datacenter latency jumps to 20-100ms or more depending on geography.
# Synchronous replication adds latency
start = time.time()
result = db.execute("UPDATE accounts SET balance = ? WHERE id = ?", new_balance, account_id)
# This blocks until replica confirms
end = time.time()
print(f"Write latency: {end - start}ms") # Typically 5-50ms with sync repl
This latency is the reason most systems default to async. For critical data, though, the durability guarantee is worth the cost.
Quorum-Based Replication
Modern synchronous replication often uses quorum-based approaches. With a quorum, you write to multiple nodes but only wait for a majority to confirm.
graph TD
subgraph Quorum Write
Client1[Client] --> Write[Write to N nodes]
Write --> Q{Quorum?}
Q -->|Write confirmed| Success[Success to Client]
end
subgraph "3 Node Cluster"
N1[(Node 1)]
N2[(Node 2)]
N3[(Node 3)]
end
Write --> N1
Write --> N2
Write --> N3
N1 --> Q
N2 --> Q
N3 --> Q
With 3 nodes, you need 2 confirmations (majority). With 5 nodes, you need 3. The quorum provides fault tolerance: you can lose floor(N/2) nodes and still have a majority to confirm writes.
# Quorum calculation
def required_quorum(total_nodes):
return (total_nodes // 2) + 1
# 3 nodes require 2 acks
print(required_quorum(3)) # 2
# 5 nodes require 3 acks
print(required_quorum(5)) # 3
# 7 nodes require 4 acks
print(required_quorum(7)) # 4
Google Spanner uses this approach. So does CockroachDB. Both provide strong consistency across globally distributed nodes.
Write-Your-Write Consistency
Synchronous replication gives you write-your-write consistency by default. If a client writes data and then immediately reads it, the read will see that write. No special handling needed.
-- With synchronous replication, this always works
BEGIN;
UPDATE users SET email = 'new@example.com' WHERE id = 1;
COMMIT;
-- Immediate read returns the new email
SELECT email FROM users WHERE id = 1;
-- Returns: 'new@example.com'
With async replication, that immediate read might hit a replica that has not yet received the update. You would see the old email.
Single-Leader vs Multi-Leader
Synchronous replication works with both single-leader and multi-leader topologies.
Single-leader is simpler. All writes go to one primary. The primary coordinates synchronous replication to followers. PostgreSQL streaming replication in synchronous mode uses this topology.
Multi-leader allows writes to any node. Each leader synchronously replicates to other leaders. This is more complex because concurrent writes to the same row can conflict.
graph TD
subgraph Single Leader
L1[Leader] --> F1[Follower 1]
L1 --> F2[Follower 2]
end
subgraph Multi Leader
L3[Leader A] <--> L4[Leader B]
L4 <--> L5[Leader C]
L5 <--> L3
end
Most production systems use single-leader synchronous replication. Multi-leader conflicts are genuinely hard to handle correctly.
Configuring Synchronous Replication
PostgreSQL
-- Enable synchronous replication
ALTER SYSTEM SET synchronous_commit = on;
ALTER SYSTEM SET synchronous_standby_names = 'replica1,replica2';
-- Verify status
SELECT client_addr, state, sent_lsn, write_lsn,
flush_lsn, replay_lsn,
sent_lsn - replay_lsn AS replication_lag
FROM pg_stat_replication;
MySQL Group Replication
-- Configure group replication as synchronous
SET GLOBAL group_replication_single_primary_mode = OFF;
SET GLOBAL group_replication_enforce_update_everywhere_checks = ON;
START GROUP_REPLICATION;
CockroachDB
CockroachDB uses the Raft consensus algorithm for synchronous replication. By default, it writes to 3 nodes and waits for 2 confirmations.
-- CockroachDB uses Range replication by default
-- 3-way synchronous replication is built in
SHOW EXPERIMENTAL_RANGES FROM TABLE users;
Semi-Synchronous Replication
Pure synchronous replication can block forever if a replica goes down. Semi-synchronous replication offers a middle ground: wait for at least one replica, but do not block if none respond.
graph TD
P[Primary] --> R1[Replica 1]
P --> R2[Replica 2]
R1 -->|confirms| P
P -->|after 1 ack| Client[Client notified]
R2 -.->|timeout| P
PostgreSQL supports this via the synchronous_commit = remote_apply setting. MySQL has similar functionality with semi-sync replication plugins.
When Synchronous Replication Makes Sense
Synchronous replication is not always the answer. Use it when:
- Financial transactions: Ledgers, payments, anything where data loss is unacceptable
- Inventory systems: You cannot oversell because a replica was behind
- Regulatory requirements: Some compliance frameworks mandate zero data loss (RPO = 0)
- Critical metadata: User accounts, permissions, access control lists
# Example: Synchronous replication for financial transactions
def transfer_funds(from_account, to_account, amount):
with db.transaction():
# These writes confirm synchronously
db.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
amount, from_account)
db.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
amount, to_account)
# Transaction only commits when replicas confirm
Synchronous Replication Failure Flow
Understanding what happens when components fail is critical for designing resilient systems.
flowchart TD
Start[Write Request] --> Primary[Primary receives write]
Primary --> Send1{Send to Replica 1}
Primary --> Send2{Send to Replica 2}
Send1 -->|ACK received| Wait2{Wait for Replica 2}
Send2 -->|ACK received| Wait1{Wait for Replica 1}
Send1 -->|timeout| Retry1{Retry timeout?}
Send2 -->|timeout| Retry2{Retry timeout?}
Retry1 -->|exceeded| Block[Primary blocks write]
Retry2 -->|exceeded| Block
Wait2 -->|both acks| Success[Write confirmed to client]
Wait1 -->|both acks| Success
Block -->|Replica recovers| Resume[Resume waiting]
Block -->|replica down long| FailWrite[Write fails - transaction rejected]
Failure scenarios:
| Scenario | Behavior | Impact |
|---|---|---|
| One replica times out | Primary blocks until timeout exceeds threshold | Write latency increases |
| Both replicas timeout | Write fails, transaction rejected | Client must retry |
| Primary fails before commit | No data loss (no commit until replicas confirm) | Automatic failover to replica |
| Network partition | Primary cannot reach quorum | Primary blocks or fails |
Common Pitfalls
-
Configuring sync with only one replica: If you have synchronous replication configured but only one replica, losing that replica blocks all writes. Always have at least two replicas for synchronous mode.
-
Mixing sync and async modes: Some databases let you configure which replicas use sync vs async. Mixing modes creates confusing behavior where some data is durable and some is not.
-
Ignoring latency during peak load: During high write throughput, synchronous replication latency compounds. Test under load, not just idle conditions.
-
Not testing failover: If you have never tested synchronous failover, you do not know if it works correctly. Schedule regular failover drills.
-
Geographic distribution: Synchronous replication across continents is usually impractical. If you need strong consistency with geographic distribution, consider synchronizing to a nearby replica and async-replicating to distant ones.
Split-Brain Prevention
Synchronous replication helps prevent split-brain scenarios. If the primary fails and a replica promotes, that replica has all committed data because writes only confirmed after replicas received them.
But you still need quorum-based failover logic. Multiple replicas might think the primary is down simultaneously. Use consensus-based leader election to ensure only one replica promotes.
graph TD
M[Monitor] --> P1{Primary alive?}
P1 -->|no| E[Election]
E --> Q{Quorum agrees?}
Q -->|yes| P2[Promote replica]
Q -->|no| E
P2 --> NM[New Primary]
Quick Recap
- Synchronous replication blocks until replicas confirm writes
- Latency cost: 1-50ms per write depending on topology
- Quorum-based replication provides fault tolerance with majority confirmation
- Best for: financial data, inventory, anything requiring zero data loss
- Monitor replication lag, connection counts, and disk usage
- Test failover procedures before you need them
For more on replication strategies, see Database Replication for a broader overview. To understand the consistency trade-offs, read Consistency Models. If you are evaluating replication against other availability patterns, see Availability Patterns.
Trade-Off Comparison: Sync vs Semi-Sync vs Async
| Dimension | Synchronous | Semi-Synchronous | Asynchronous |
|---|---|---|---|
| Durability Guarantee | Full (all replicas confirm) | Partial (at least one) | Minimal (primary only) |
| Write Latency | Highest (waits for all) | Medium (waits for one) | Lowest (no wait) |
| Blocking Risk | High (any replica down) | Low (only primary down) | None |
| Data Loss Risk | None | Minimal | Possible |
| Availability | Lower | Medium | Highest |
| Typical RPO | Zero | Near-zero | Lag-dependent |
| Cross-DC Support | Poor (high latency) | Moderate | Excellent |
Monitoring Synchronous Replication
Watch these metrics:
-- PostgreSQL: Check replication health
SELECT client_addr, state, sent_lsn, write_lsn,
flush_lsn, replay_lsn,
replay_lag,
flush_lag,
write_lag
FROM pg_stat_replication;
-- MySQL: Group Replication status
SELECT MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE
FROM performance_schema.replication_group_members;
Key metrics to track:
| Metric | Warning | Critical |
|---|---|---|
| Replication lag | > 1s | > 10s |
| Replica state | — | off “streaming” |
| Disk usage | > 70% | > 85% |
| Connection count | > 80% max | > 95% max |
Interview Questions
Expected answer points:
- Synchronous replication blocks the commit operation until all replicas confirm they have written the data
- Async replication returns success to the client immediately without waiting for replica confirmations
- The key trade-off is latency (sync) vs potential data loss (async)
- Sync provides zero RPO; async RPO depends on replication lag
Expected answer points:
- Each write must be sent to replicas and the primary must wait for acknowledgments before confirming to the client
- Network round-trip time to at least one replica is added to every synchronous write
- Single-datacenter latency: 1-5ms typical
- Cross-datacenter latency: 20-100ms or more depending on geography
- Latency compounds during high write throughput periods
Expected answer points:
- Writes are sent to all nodes but only a majority (quorum) must confirm
- Quorum size = floor(N/2) + 1 where N is total nodes
- With 3 nodes, 2 confirmations needed; with 5 nodes, 3 needed
- Provides fault tolerance: can lose floor(N/2) nodes and still have a majority
- Examples: Google Spanner, CockroachDB use quorum-based sync replication
Expected answer points:
- Write-your-write consistency means if a client writes data and immediately reads it, the read will see that write
- Synchronous replication confirms writes only after replicas acknowledge receipt
- No special client-side handling required
- With async replication, immediate reads might hit a replica that has not yet received the update
Expected answer points:
- Single-leader: simpler topology, all writes go to one primary which coordinates replication to followers
- Multi-leader: writes can go to any node, each leader replicates to other leaders
- Multi-leader complexity: concurrent writes to the same row can conflict
- Most production systems use single-leader synchronous replication
- PostgreSQL streaming replication in synchronous mode uses single-leader topology
Expected answer points:
- Enable with: ALTER SYSTEM SET synchronous_commit = on;
- Specify standby servers: ALTER SYSTEM SET synchronous_standby_names = 'replica1,replica2';
- Monitor using pg_stat_replication view checking client_addr, state, sent_lsn, write_lsn, flush_lsn, replay_lsn
- Calculate replication lag: sent_lsn - replay_lsn
Expected answer points:
- Semi-synchronous waits for at least one replica to confirm, but does not block if no response
- Addresses the problem of pure sync blocking forever if a replica goes down
- PostgreSQL supports via synchronous_commit = remote_apply setting
- MySQL has similar functionality with semi-sync replication plugins
- Trade-off: slightly lower durability guarantee in exchange for better availability
Expected answer points:
- Primary blocks the write until timeout threshold is exceeded
- If timeout exceeds threshold, the primary retries or fails the write
- Write latency increases during replica timeout
- If both replicas timeout, the transaction is rejected and client must retry
- When replica recovers, normal operation resumes
Expected answer points:
- Writes only confirm after replicas receive them
- If primary fails and a replica promotes, that replica has all committed data
- However, quorum-based failover logic is still needed
- Multiple replicas might think primary is down simultaneously
- Consensus-based leader election ensures only one replica promotes
Expected answer points:
- Replication lag: warning > 1s, critical > 10s
- Replica state: should be "streaming"
- Disk usage on replicas: warning > 70%, critical > 85%
- Connection count: warning > 80% max, critical > 95% max
- LSN positions: sent_lsn, write_lsn, flush_lsn, replay_lsn in PostgreSQL
Expected answer points:
- Configuring sync with only one replica: losing that replica blocks all writes
- Mixing sync and async modes: creates confusing behavior where some data is durable and some is not
- Ignoring latency during peak load: sync replication latency compounds under load
- Not testing failover: regular failover drills are essential
- Geographic distribution: sync across continents is usually impractical
Expected answer points:
- CockroachDB uses the Raft consensus algorithm for synchronous replication
- Default configuration writes to 3 nodes and waits for 2 confirmations
- Uses Range replication (data is split into ranges)
- Each range has its own Raft group for consistency
- Provides strong consistency across globally distributed nodes
Expected answer points:
- Synchronous replication provides zero RPO
- No data loss because writes only commit after replicas confirm
- Semi-synchronous provides near-zero RPO
- Async RPO is lag-dependent (could be seconds to hours)
- RPO = 0 is required for financial transactions, inventory systems, and some compliance frameworks
Expected answer points:
- If primary cannot reach quorum, it blocks or fails writes
- The system becomes unavailable for writes until partition heals
- This is the availability trade-off for strong consistency
- During a partition, clients may receive timeouts or errors
- Read operations may still work depending on configuration
Expected answer points:
- When write latency tolerance is very low
- When data loss is acceptable (non-critical data)
- When geographic distribution spans continents (latency impractical)
- When highest availability is required (sync blocking is unacceptable)
- When write throughput is extremely high and cost/speed trade-off does not justify sync
Expected answer points:
- MySQL Group Replication uses certification-based approach, not WAL shipping
- MySQL can run in single-primary or multi-primary mode
- Configure multi-primary: SET GLOBAL group_replication_single_primary_mode = OFF;
- Enable update everywhere: SET GLOBAL group_replication_enforce_update_everywhere_checks = ON;
- PostgreSQL uses streaming replication with WAL shipping
Expected answer points:
- Synchronous replication favors Consistency over Availability in the CAP theorem
- During network partitions, sync replication systems become unavailable rather than inconsistent
- Async replication favors Availability over Consistency
- This is why sync replication is chosen for critical data where data loss is unacceptable
Expected answer points:
- Required quorum formula: (total_nodes // 2) + 1
- 3 nodes require 2 confirmations
- 5 nodes require 3 confirmations
- 7 nodes require 4 confirmations
- Majority requirement ensures at most one partition can have quorum
Expected answer points:
- No data loss occurs because the commit did not complete
- Transaction is rejected since replicas did not confirm
- Client receives failure and must retry
- Automatic failover to a replica that has all committed data occurs
- This is a key advantage over async replication
Expected answer points:
- Financial transactions: ledgers, payments, anything where data loss is unacceptable
- Inventory systems: cannot oversell because a replica was behind
- Regulatory requirements: compliance frameworks that mandate zero data loss (RPO = 0)
- Critical metadata: user accounts, permissions, access control lists
- Any scenario where the cost of data loss outweighs the latency cost of synchronous writes
Further Reading
- Database Replication - Broader overview of replication strategies
- Consistency Models - Understanding consistency trade-offs
- Availability Patterns - Replication vs other availability patterns
- Google Spanner Architecture - Google’s quorum-based sync replication paper
- CockroachDB Raft Documentation - Raft consensus in CockroachDB
Synchronous replication is not free, but for data that cannot be lost, it is worth the latency cost. The key is knowing which data truly needs it and configuring your system accordingly.
Conclusion
Synchronous replication is the right choice when consistency cannot be compromised and your network is fast and reliable. It guarantees that every write is durable on a quorum of replicas before acknowledging the client, eliminating ghost reads and split-brain scenarios at the cost of increased write latency. In practice, most production systems use semi-synchronous replication as a pragmatic middle ground—guaranteeing at least one replica is in sync while keeping latency manageable.
Use synchronous replication for financial transactions, inventory systems, and any workload where stale reads would cause data corruption or business loss. Avoid it for geo-distributed deployments with high network latency, write-heavy workloads that need global performance, or systems where availability must be maintained even during network partitions.
Monitoring synchronous replication requires tracking replica lag (even in sync mode), acknowledgment round-trip times, quorum health, and failover timing. When a synchronous replica falls behind or fails, the system must either pause writes or fall back to asynchronous mode—design your alerting and runbooks accordingly.
Category
Related Posts
Asynchronous Replication: Speed and Availability at Scale
Learn how asynchronous replication works in distributed databases, including eventual consistency implications, lag monitoring, and practical use cases where speed outweighs strict consistency.
Google Spanner: Globally Distributed SQL at Scale
Google Spanner architecture combining relational model with horizontal scalability, TrueTime API for global consistency, and F1 database implementation.
Consistency Models in Distributed Systems: Complete Guide
Learn about strong, weak, eventual, and causal consistency models. Understand read-your-writes, monotonic reads, and picking the right model for your system.