Google Spanner: Globally Distributed SQL at Scale

Google Spanner architecture combining relational model with horizontal scalability, TrueTime API for global consistency, and F1 database implementation.

published: reading time: 22 min read author: GeekWorkBench

Google Spanner: Globally Distributed SQL at Scale

Google Spanner solves a genuinely hard problem: managing petabytes of data across data centers worldwide while keeping the relational model that SQL developers rely on. It launched around 2007 as an internal project, influenced by what engineers called “Bigtable 2”, and became Cloud Spanner in 2017. The selling point was SQL plus horizontal scalability without throwing away ACID guarantees.


Architecture Overview

Spanner arranges data in layers that blend traditional database concepts with distributed systems patterns.

At the center is the directory, a B-tree-like structure that serves as the unit of data placement. Directories can move between servers and zones dynamically, which sounds complicated but happens transparently to running transactions.

The hierarchy: databases contain tables, tables contain directories, directories contain key-value pairs. Each directory falls under a Paxos group controlling replication.

graph TD
    A[Cloud Spanner Instance] --> B[Database 1]
    A --> C[Database 2]
    B --> D[Table: Users]
    B --> E[Table: Orders]
    D --> F[Directory: User Data]
    E --> G[Directory: Order Data]
    F --> H[Paxos Group - 3+ Replicas]
    G --> I[Paxos Group - 3+ Replicas]
    H --> J[Replica 1 - US East]
    H --> K[Replica 2 - US West]
    H --> L[Replica 3 - Europe]

Spanner stores data in tablets (inherited from Bigtable thinking). Each tablet spans a range of primary keys and is managed by a Paxos group. Paxos ensures all replicas agree before writes are acknowledged.


TrueTime: Global Consistency Through Clock Uncertainty

TrueTime is Spanner’s most distinctive feature. It uses GPS receivers and atomic clocks in each data center to produce globally consistent timestamps.

TrueTime does not claim clocks are perfectly synchronized. Instead, it makes clock uncertainty explicit as a bounded interval. Call TT.now() and you get back a TTinterval with earliest and latest possible timestamps:

# TrueTime returns a time interval, not a point
tt = TT.now()
# tt.earliest = timestamp that could be in the past
# tt.latest = timestamp that could be in the future
# The actual time is somewhere within this interval

The uncertainty bound sits around 1-7 milliseconds. Spanner leverages these intervals for external consistency: if transaction T1 commits before T2 starts in real-time, then T1’s timestamp is less than T2’s. That is the guarantee that matters in practice.

This means globally consistent reads without a single time source synchronizing everything. Read at a specific timestamp from anywhere on Earth, and Spanner returns results consistent with that moment.

TrueTime needs specialized hardware, which is why direct competitors are rare. But the conceptual model has influenced other systems to adopt “clock uncertainty” thinking.


Schema and SQL Support

Spanner gives you a relational schema with SQL. Unlike NoSQL systems that offer query capabilities as an afterthought, Spanner includes:

  • Standard SQL with JOINs, aggregations, subqueries
  • Inline primary keys and foreign keys
  • Interleaved tables for parent-child relationships
  • Generated columns and indexes
-- Spanner supports standard SQL with relational features
CREATE TABLE Users (
  UserId STRING(64) NOT NULL,
  Email STRING(256),
  CreatedAt TIMESTAMP,
) PRIMARY KEY (UserId);

CREATE TABLE Orders (
  OrderId STRING(64) NOT NULL,
  UserId STRING(64),
  Total DECIMAL(10,2),
  Status STRING(32),
) PRIMARY KEY (OrderId);

-- Foreign key support
ALTER TABLE Orders ADD CONSTRAINT FK_UserOrders
  FOREIGN KEY (UserId) REFERENCES Users(UserId);

-- Create index for common query pattern
CREATE INDEX idx_orders_user_status ON Orders(UserId, Status);

The INTERLEAVE IN clause co-locates related rows physically, cutting the need for distributed joins. When you interleave child rows with their parent, Spanner puts them in the same tablet. Queries run without network round-trips.

Partition Key Design

Choosing a good primary key matters more in Spanner than in a single-node database. Since data is range-partitioned by key, a poorly chosen key creates hotspots that degrade performance.

Avoid sequential keys for high-write tables. Auto-incrementing integers cause all writes to hit a single tablet. UUIDs or hashed keys spread load across nodes:

-- Bad: sequential key creates hotspots
CREATE TABLE Events (
    event_id INT64 AUTO_INCREMENT,
    payload STRING(MAX),
    PRIMARY KEY (event_id)
);

-- Good: uniformly distributed key
CREATE TABLE Events (
    event_id STRING(36) NOT NULL,
    payload STRING(MAX),
    PRIMARY KEY (event_id)
) PRIMARY KEY (event_id);

For read-heavy tables where you query by user ID or customer ID most often, use that ID as the leading key. For write-heavy tables generating many rows per user, consider a composite key with a suffix to distribute writes.

Interleaved Tables

Interleaved tables model parent-child relationships where child rows are co-located with their parent on the same storage node. This improves read performance when you frequently fetch a parent and its children together.

CREATE TABLE Orders (
    order_id STRING(36),
    customer_id STRING(36),
    total DECIMAL,
    created_at TIMESTAMP,
    PRIMARY KEY (order_id)
) PRIMARY KEY (order_id);

CREATE TABLE OrderItems (
    order_id STRING(36),
    item_id STRING(36),
    product_id STRING(36),
    quantity INT64,
    PRIMARY KEY (order_id, item_id)
) INTERLEAVE IN PARENT Orders ON DELETE CASCADE;

With INTERLEAVE IN PARENT Orders ON DELETE CASCADE, deleting an order automatically deletes its items. The real win is that fetching an order with its items stays in a single tablet read, avoiding a distributed query.

Trade-offs: Interleaving shines when you almost always fetch parent and child together. But if you often query children independently, it backfires: all items for a given parent land in the same tablet, which becomes a hotspot for popular parents. Deleting the parent also gets pricier since Spanner has to delete all the children first.


Transaction Handling

Spanner handles two transaction types: read-write and read-only.

Read-write transactions run two-phase commit (2PC) over Paxos. Nodes prepare first, promising to commit if asked. Then the coordinator tells them to commit or abort. This gives atomicity across distributed nodes, but adds latency.

Read-only transactions skip locks. They take a timestamp bound and read from consistent snapshots. No blocking, no contention.

Spanner also supports partitioned transactions for high-contention workloads, splitting large writes across servers for better throughput.

sequenceDiagram
    participant C as Coordinator
    participant P1 as Participant 1
    participant P2 as Participant 2

    Note over C,P2: Two-Phase Commit Protocol

    C->>P1: Prepare
    C->>P2: Prepare
    P1-->>C: Promise to Commit
    P2-->>C: Promise to Commit

    Note over C: Coordinator decides COMMIT

    C->>P1: Commit
    C->>P2: Commit
    P1-->>C: Committed
    P2-->C: Committed

Replication and Consistency

Spanner uses Paxos-based replication with configurable replica counts (typically 3-5), replica placement across regions for disaster recovery, and witness replicas that participate in Paxos without storing data.

The Paxos leader handles all reads and writes, but read leases let followers serve reads locally for a period. This cuts latency for read-heavy workloads.

By default, Spanner provides external consistency, stronger than serializability. If T1 commits before T2 starts in real-time, T1 happens before T2. Most applications need exactly this guarantee.


Direct Connections and Performance

Google recommends direct connections over the Proxy. Direct connections use gRPC and are faster, especially for apps running in Google Cloud.

Rough numbers:

  • Write latency: 10-50ms depending on distance and replica setup
  • Read latency: 1-10ms for local replica reads
  • Throughput scales horizontally as you add nodes

Each node handles about 10,000 QPS reads or 2,000 QPS writes. Your mileage varies with workload patterns.


Use Cases and Trade-offs

Spanner works well for:

  • Globally distributed databases needing strong consistency
  • Financial systems requiring ACID transactions across regions
  • Multi-tenant SaaS needing isolation and scalability
  • Inventory or order management with consistent cross-region updates

Shortcomings:

  • Cost: Not cheap compared to self-managed databases
  • TrueTime dependency: Needs Google’s hardware for full benefit
  • Latency floor: Even local reads have 1-2ms minimum
  • Query optimization: Less mature than PostgreSQL or MySQL

Production Failure Scenarios

FailureImpactMitigation
Zone outage (single AZ failure)Spanner keeps serving from other zones; TrueTime bounds how stale reads can getMulti-zone deployment; Spanner replicates synchronously across zones
Leader election during high write loadShort unavailability or latency spike while a new leader takes overTrueTime uncertainty caps the maximum outage window; consider allowing leaderless reads
TrueTime hardware failureTrueTime uncertainty grows until you replace the hardware; reads can return slightly stale dataMonitor clock synchronization closely; replace TrueTime hardware quickly when errors surface
Schema change during heavy loadOnline schema changes hit limits on very large tablesPlan schema changes for low-traffic periods; be careful with ALTER TABLE on big datasets
Cross-region replication lagNon-local replicas see higher read latencyRoute read/write traffic to regional endpoints; globally consistent reads inherently need cross-region latency
Network partitionIsolated region cannot reach the replica majoritySpanner stays available in the majority partition; isolated region serves cached data until the partition heals
Transaction conflict rate spikeMore retries and latency when concurrent transactions touch overlapping key rangesPrefer read-only transactions where feasible; structure your key schema to reduce write contention

Common Pitfalls / Anti-Patterns

Choose Spanner When

  • You need global distribution with writes happening in multiple geographic regions simultaneously
  • Your application requires strict serializability or external consistency across regions
  • You want horizontal scalability without abandoning SQL and relational modeling
  • You need managed operations with automatic sharding, replication, and failover
  • Compliance requires data locality controls with regional redundancy
  • Your team lacks database operations expertise and needs a hands-off operational model

Avoid Spanner When

  • Your workload is single-region and does not need global distribution
  • You are cost-sensitive and can self-manage PostgreSQL/MySQL with read replicas
  • You have high-frequency short transactions where the latency floor matters (e.g., real-time bidding)
  • You need advanced query optimization features found in mature relational databases
  • Your data fits on a single node or requires minimal horizontal scaling
  • You require open-source flexibility without vendor lock-in

Decision Framework

flowchart TD
    A[Need Horizontal Scale?] --> |Yes| B[Need Global Distribution?]
    A --> |No| E[Consider PostgreSQL or MySQL]
    B --> |Yes| C[Need Strict Consistency?]
    B --> |No| F[Consider DynamoDB or Cassandra]
    C --> |Yes| D[Can Afford Premium Pricing?]
    C --> |No| F
    D --> |Yes| G[Spanner is a Strong Fit]
    D --> |No| H[Consider CockroachDB or Aurora Global]


Quick Recap Checklist

  • Spanner arranges data in directories (B-tree-like structures) grouped into Paxos groups for replication
  • TrueTime uses GPS and atomic clocks to produce bounded clock uncertainty intervals, enabling globally consistent reads
  • Schema supports standard SQL with foreign keys, interleaved tables, generated columns, and indexes
  • Read-write transactions use two-phase commit over Paxos; read-only transactions use timestamp-bound snapshots
  • Primary key design matters: avoid sequential keys for write-heavy workloads to prevent hotspots
  • Interleaved tables co-locate parent-child rows but can create hotspots for popular parents
  • External consistency (stronger than serializable) comes from TrueTime’s commit-wait rule
  • Spanner works best for globally distributed OLTP, financial systems, and multi-tenant SaaS
  • Cost scales linearly with nodes: storage ($0.17-0.25/GB/mo), compute ($0.045-0.09/vCPU-hour)

Interview Questions

1. What problem does Google Spanner solve, and how does it differ from traditional distributed databases?

Expected answer points:

  • Spanner solves the problem of globally distributed SQL: petabytes of data across data centers with relational model and ACID guarantees
  • Unlike early distributed databases that sacrificed consistency or the relational model, Spanner keeps both
  • Key differentiator: TrueTime hardware enables external consistency without a single synchronization point
2. Explain how TrueTime works and why its bounded uncertainty model matters for distributed consistency.

Expected answer points:

  • TrueTime uses GPS receivers and atomic clocks in each data center to generate globally consistent timestamps
  • It returns a time interval (TTinterval) with earliest and latest possible timestamps, not a single point
  • The uncertainty bound is roughly 1-7 milliseconds
  • This enables external consistency: if T1 commits before T2 starts in real time, T1's timestamp is less than T2's
  • Without bounded uncertainty, you need a single synchronized time source (which fails across continents)
3. What is the commit-wait rule in Spanner, and how does it relate to TrueTime?

Expected answer points:

  • The commit-wait rule: a transaction must wait until its commit timestamp is in the past (past the TrueTime uncertainty bound)
  • This guarantees that any transaction reading at timestamp T sees all transactions that committed before T
  • Wait time is bounded by the maximum clock uncertainty (typically under 10ms)
  • This is what gives Spanner external consistency without synchronized clocks
4. How does Spanner's directory-based data model work, and how does it relate to Paxos groups?

Expected answer points:

  • Directories are B-tree-like structures that serve as the unit of data placement
  • Directories can move between servers and zones dynamically without affecting running transactions
  • Each directory falls under a Paxos group controlling replication across 3-5 replicas
  • Hierarchy: databases → tables → directories → key-value pairs
5. What is the difference between read-write and read-only transactions in Spanner?

Expected answer points:

  • Read-write transactions run two-phase commit (2PC) over Paxos: nodes prepare first, then coordinator tells them to commit or abort
  • Read-only transactions skip locks entirely and read from consistent snapshots at a timestamp bound
  • Read-only transactions have no blocking or contention, making them much faster for consistent reads
  • Spanner also supports partitioned transactions for high-contention workloads
6. Why is primary key design critical in Spanner, and what patterns should you avoid?

Expected answer points:

  • Data is range-partitioned by primary key, so poorly chosen keys create hotspots that degrade performance
  • Sequential keys (like auto-incrementing integers) cause all writes to hit a single tablet
  • UUIDs or hashed keys distribute load across nodes for write-heavy workloads
  • For read-heavy tables, use the most frequently queried ID as the leading key
  • For write-heavy tables with many rows per user, consider composite keys with a suffix to spread writes
7. What are interleaved tables, when should you use them, and what are the trade-offs?

Expected answer points:

  • Interleaved tables model parent-child relationships where child rows are co-located with their parent on the same storage node
  • Benefit: fetching a parent with its children stays in a single tablet read, avoiding distributed queries
  • ON DELETE CASCADE automatically cleans up child rows when the parent is deleted
  • Trade-off: if you frequently query children independently, all children of a parent land in the same tablet, creating a hotspot
  • Deleting the parent becomes expensive because Spanner must delete all children first
8. How does Spanner's Paxos-based replication work, and what is the role of the Paxos leader?

Expected answer points:

  • Paxos ensures all replicas agree before writes are acknowledged
  • Typically 3-5 replicas are placed across regions for disaster recovery
  • The Paxos leader handles all reads and writes
  • Read leases let followers serve reads locally for a period, reducing latency for read-heavy workloads
  • Witness replicas participate in Paxos without storing data
9. What does external consistency mean in Spanner, and how is it stronger than serializability?

Expected answer points:

  • External consistency: if transaction T1 commits before T2 starts in real-time, then T1 happens before T2
  • This is stronger than serializability (which only guarantees equivalent to some serial order) because it preserves real-time ordering
  • Most applications need exactly this guarantee: actions that happened before appear before to all observers
  • Achieved through TrueTime's commit-wait rule
10. What are Spanner's main limitations and when should you avoid using it?

Expected answer points:

  • Cost: premium pricing compared to self-managed databases
  • TrueTime dependency: needs Google's hardware for full benefit (competitors use HLC instead)
  • Latency floor: even local reads have 1-2ms minimum due to Paxos overhead
  • Query optimization less mature than PostgreSQL or MySQL
  • Avoid when: workload is single-region, cost-sensitive, has high-frequency short transactions, or data fits on a single node
11. How does Spanner's two-phase commit work over Paxos, and what latency implications does it have?

Expected answer points:

  • Coordinator asks all participant nodes to prepare: they promise to commit if asked
  • Once a majority (Paxos group) promises, coordinator decides COMMIT
  • Coordinator tells all participants to commit, and they acknowledge
  • This adds latency compared to single-node commits: round trips between distributed nodes
  • Cross-region writes can take 10-50ms depending on distance and replica setup
12. What is the F1 database, and what does it demonstrate about Spanner's production readiness?

Expected answer points:

  • F1 is Google's ad business database that was migrated from Bigtable to Spanner
  • It validates Spanner can handle a real mission-critical workload: thousands of tables, complex joins, OLTP alongside analytical queries
  • Strict SLAs for latency and availability had to be met during and after migration
  • Proves Spanner is production-hardened, not just a research prototype
13. How does Spanner handle zone outages, and what guarantees does it provide during partial failures?

Expected answer points:

  • Spanner keeps serving from other zones when a single AZ fails; replicas in other zones continue operating
  • TrueTime bounds how stale reads can get during outage periods
  • Network partition: isolated region cannot reach replica majority, so Spanner stays available in the majority partition
  • Isolated region serves cached data until partition heals
14. How does Spanner's pricing work, and what are the main cost components?

Expected answer points:

  • Three main components: storage ($0.17-0.25/GB/month), compute ($0.045-0.09/vCPU-hour), and networking egress
  • Storage: regional ~$0.17/GB, multi-region ~$0.25/GB
  • Compute: regional ~$0.045/vCPU-hour, multi-region ~$0.09/vCPU-hour
  • Egress charges apply when data leaves Google Cloud infrastructure
  • Costs scale linearly with nodes, so you can start small and scale up
15. Compare Spanner with CockroachDB and Aurora on consistency, deployment, and global distribution.

Expected answer points:

  • Spanner: external consistency via TrueTime (GPS + atomic clocks), fully managed cloud, native multi-region
  • CockroachDB: serializability via HLC (no special hardware), self-hosted or cloud, native multi-region
  • Aurora: session + eventual consistency, AWS managed only, read replicas with Aurora Global for cross-region
  • Write latency: Spanner 10-50ms cross-region, CockroachDB 20-100ms, Aurora 5-20ms single region
  • Spanner has the strongest consistency but highest price; CockroachDB offers similar semantics with open-source option
16. What happens during a TrueTime hardware failure, and how does it affect read behavior?

Expected answer points:

  • When TrueTime hardware fails, clock uncertainty grows until the hardware is replaced
  • Reads can return slightly stale data while uncertainty is elevated
  • Spanner continues operating but consistency guarantees weaken temporarily
  • Mitigation: monitor clock synchronization closely and replace TrueTime hardware quickly when errors surface
17. What are the best practices for schema changes in Spanner, especially on large tables?

Expected answer points:

  • Add one column at a time rather than batching multiple changes
  • Avoid large DEFAULT values that increase write amplification
  • Use NOT NULL only when truly required, since it forces a full table rewrite
  • Schedule large schema changes during low-traffic windows when possible
  • Online schema changes via ALTER TABLE have no downtime but can take time on very large tables
18. What causes transaction conflict rate spikes in Spanner, and how should you mitigate them?

Expected answer points:

  • More retries and latency when concurrent transactions touch overlapping key ranges
  • Hotspots from poor primary key design or popular parent keys in interleaved tables exacerbate conflicts
  • Mitigation: prefer read-only transactions where feasible (no locks, no conflicts)
  • Structure key schema to reduce write contention: spread writes across key space
  • Consider partitioned transactions for large batch operations
19. How does Spanner's read lease mechanism work, and how does it improve read latency?

Expected answer points:

  • The Paxos leader handles all reads and writes by default
  • Read leases allow follower replicas to serve reads locally for a bounded period
  • During the lease period, clients can read from the nearest follower without contacting the leader
  • This cuts latency significantly for read-heavy workloads where slight staleness is acceptable
  • Lease duration is managed by Paxos; followers must renew leases to continue serving local reads
20. What factors determine whether to choose Spanner over PostgreSQL with read replicas for a given workload?

Expected answer points:

  • Choose Spanner when: you need writes in multiple geographic regions simultaneously, strict serializability across regions, or automatic horizontal scaling without application-level sharding
  • Choose PostgreSQL with read replicas when: workload is single-region, cost-sensitive, or does not need global distribution
  • Spanner's managed operations eliminate database administration overhead but at premium pricing
  • PostgreSQL offers advanced query optimization and more mature ecosystem
  • Decision framework: need horizontal scale + global distribution + strict consistency + budget = Spanner; otherwise PostgreSQL

Further Reading


Operational Guidance

Backup and Restore

Spanner handles backups automatically with no user intervention. On-demand backups are available through the Cloud Console or the spanner_backup.sh script. You can restore to any point within the last 7 days using point-in-time recovery.

Cross-region backups need a manual setup: export Spanner data to Avro or Parquet in Cloud Storage, then copy files to your target region. Budget engineering time for this if you have strict data residency requirements.

Schema Changes

Spanner supports online schema changes via ALTER TABLE statements. There is no downtime for schema changes, but they go through an internal schema upgrade process that can take time on very large tables.

# Online schema change with ALTER TABLE
operation = client.update_database_ddl(
    database_id,
    ['ALTER TABLE Users ADD COLUMN last_login TIMESTAMP']
)
operation.result()  # Waits for completion

Schema change best practices:

  • Add one column at a time rather than batching multiple changes
  • Avoid large DEFAULT values that increase write amplification
  • Use NOT NULL only when truly required, since it forces a full table rewrite
  • Schedule large schema changes during low-traffic windows when possible

Import and Export

Cloud Dataflow handles bulk import and export. Spanner exports to Avro or Parquet in Cloud Storage. For imports, point Dataflow at your Cloud Storage files and let Spanner do the distributed write.

# Cloud Dataflow export job configuration (pseudocode)
export_job = {
    "target_project": "my-project",
    "target_instance": "my-instance",
    "target_database": "my-db",
    "output_prefix": "gs://my-bucket/spanner-export/",
    "output_format": "AVRO",  # or PARQUET
    "batch_size": 10000
}

For small datasets, the Cloud Console provides a no-code import wizard. For production workloads, always use Dataflow with appropriate batch sizing.


Cost Modeling

Spanner pricing has three main components: storage, compute, and networking egress.

Storage

Storage is charged at approximately $0.17 per GB per month for regional storage. Multi-region storage costs more, around $0.25 per GB per month. Storage includes all data, indexes, and backup overhead.

Compute

Compute is priced per vCPU-hour. A regional configuration costs roughly $0.045 per vCPU-hour. Multi-region configurations (necessary for global write distribution) cost approximately $0.09 per vCPU-hour, roughly double the regional rate.

Networking Egress

Networking charges apply when data leaves Google Cloud infrastructure. Cross-region writes and reads within Spanner are generally cheap, but egress to the public internet or to other cloud providers adds up quickly.

Cost Comparison Example

ConfigurationStorage (1TB)Compute (16 vCPU)Estimated Monthly Cost
Single-region$170/month$26/month (regional)~$196/month
Multi-region (US)$170/month$52/month (multi-region)~$222/month
Multi-region (global)$250/month$104/month~$354/month

These numbers assume 24/7 operation at full compute capacity. Actual costs vary with utilization.

Cost Estimation Pseudocode

def estimate_monthly_spanner_cost(
    storage_gb: float,
    vcpus: int,
    region_type: str = "regional",  # "regional" | "multi-region-us" | "multi-region-global"
    egress_gb: float = 0
) -> dict:
    # Storage pricing per GB/month
    storage_prices = {
        "regional": 0.17,
        "multi-region-us": 0.20,
        "multi-region-global": 0.25
    }
    # Compute pricing per vCPU-hour
    compute_prices = {
        "regional": 0.045,
        "multi-region-us": 0.07,
        "multi-region-global": 0.09
    }
    # Egress pricing per GB (approximate)
    egress_rate = 0.12

    hours_per_month = 730  # Average month

    storage_cost = storage_gb * storage_prices[region_type]
    compute_cost = vcpus * compute_prices[region_type] * hours_per_month
    egress_cost = egress_gb * egress_rate

    return {
        "storage": round(storage_cost, 2),
        "compute": round(compute_cost, 2),
        "egress": round(egress_cost, 2),
        "total": round(storage_cost + compute_cost + egress_cost, 2)
    }

# Example: 1TB storage, 16 vCPU, multi-region US, 100GB egress
cost = estimate_monthly_spanner_cost(
    storage_gb=1024,
    vcpus=16,
    region_type="multi-region-us",
    egress_gb=100
)
# cost["total"] ~ $270/month

Plug your numbers into the Google Cloud Pricing Calculator for real estimates. One practical note: Spanner costs scale linearly with nodes, so you can start small for dev/test environments and scale up when you go to production.


Conclusion

Spanner combines SQL with horizontal scalability and strong consistency. It shows you do not have to throw away ACID guarantees to get NoSQL-like scaling. TrueTime hardware is the main barrier to competition, but the ideas have shaped how engineers think about distributed consistency.

For globally distributed apps needing strong consistency, Spanner is worth the cost. For others, studying its architecture teaches you the real trade-offs in distributed database design.

Category

Related Posts

Amazon DynamoDB: Scalable NoSQL with Predictable Performance

Deep dive into Amazon DynamoDB architecture, partitioned tables, eventual consistency, on-demand capacity, and the single-digit millisecond SLA.

#distributed-systems #databases #amazon

Google Chubby: The Lock Service That Inspired ZooKeeper

Explore Google Chubby's architecture, lock-based coordination, Paxos integration, cell hierarchy, and its influence on distributed systems design.

#distributed-systems #databases #google

Synchronous Replication: Data Consistency Across Nodes

Learn about synchronous replication patterns in distributed databases, including Quorum-based replication and write-ahead log shipping for zero RPO.

#distributed-systems #replication #consistency