Synchronous Communication: REST, gRPC, and When to Use Each
Explore synchronous communication patterns in microservices including REST APIs, gRPC, when to use each protocol, and their trade-offs.
Microservices do not automatically solve your problems. They just move the difficulty around. One of the first design decisions you face is how services communicate. Synchronous communication is the most straightforward approach: a client sends a request, waits for the reply, then continues. Here I will explore the two main protocols for synchronous communication, where each one makes sense, and how to avoid turning a simple request into a cascade failure.
Introduction
The idea is straightforward. Service A calls Service B. Service A stops and does nothing until Service B responds. Only then does Service A resume.
This is the request-response model that software had used for decades before anyone used the word “microservice.” The appeal is predictability. You know immediately whether an operation succeeded or failed. Your business logic stays linear and easy to follow.
The problem is coupling. Service A depends on Service B being available and responsive. If Service B slows down, Service A waits. If Service B fails, Service A fails. This tight coupling is why synchronous systems fail so spectacularly when things go wrong.
Synchronous communication works well when operations are fast, services are reliable, and you need immediate consistency. The moment any of those assumptions break, you start dealing with timeouts, retries, and cascading errors.
REST over HTTP: When to Use It
REST is nearly everywhere. HTTP is not going anywhere, and every tool, language, and framework speaks it. You can test a REST endpoint with curl in your terminal. The format is human-readable JSON. No code generation required.
This ubiquity is REST’s main advantage. For public APIs consumed by third-party developers, it is the obvious choice. An external team can integrate without installing special tooling or learning a new protocol. The documentation writes itself because REST endpoints map naturally to resource-based URL structures.
Browser-based clients work naturally with REST. Browsers understand HTTP methods and status codes without help. You do not need a proxy layer or special configuration. This matters as soon as you need to call services directly from browser JavaScript.
REST is also the better choice when schema flexibility matters. JSON accepts extra fields without breaking. You can evolve your API gradually without forcing all clients to update simultaneously. This matters in organizations where coordinating contract changes across teams takes time.
The tradeoff is that REST with plain JSON offers no compile-time checking. Rename a JSON field and you will not know something broke until runtime, probably in production, when a client sends the old field name and your code silently ignores it.
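A small sketch of that failure mode, using only the standard library. The payload, the `customer_id` field, and the `parse_order` helper are hypothetical names chosen for illustration. The point is that lenient JSON handling turns a renamed field into a silent `None`, while explicit validation at the service boundary surfaces it as an error.

```python
import json
from dataclasses import dataclass


@dataclass
class Order:
    customer_id: str
    total_cents: int


def read_customer_lenient(raw: str):
    # Typical lenient handling: a missing key quietly becomes None.
    return json.loads(raw).get("customer_id")


def parse_order(raw: str) -> Order:
    # Explicit validation: unknown fields are ignored, but a missing
    # required field raises instead of passing through silently.
    data = json.loads(raw)
    missing = [name for name in ("customer_id", "total_cents") if name not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return Order(customer_id=str(data["customer_id"]), total_cents=int(data["total_cents"]))


# The producer renamed customer_id to customerId; the consumer was not updated.
renamed_payload = '{"customerId": "c-42", "total_cents": 1999}'

print(read_customer_lenient(renamed_payload))  # None, and no error is raised

try:
    parse_order(renamed_payload)
except ValueError as exc:
    print(f"rejected at the boundary: {exc}")
```

Validation like this does not replace compile-time checks, but it moves the failure from "wrong data deep in business logic" to "clear error at the edge".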
gRPC: When to Use It
Google built gRPC because REST left certain problems unsolved. It uses HTTP/2 for transport and Protocol Buffers for serialization. The combination handles higher throughput than JSON over HTTP/1.1 and enables patterns REST cannot support.
For internal services you control end-to-end, gRPC shines. You define your API contract once in a Protocol Buffer file. Code generation produces client libraries for every language your services use. When you rename or remove a field, every consumer that regenerates its client code fails at compile time rather than at runtime. Your CI pipeline catches breaking changes before they reach production.
Protocol Buffers produce smaller messages than JSON. HTTP/2 multiplexes multiple requests over a single connection, removing the request-level head-of-line blocking that HTTP/1.1 suffers (blocking at the TCP layer can still occur when packets are lost). For high-throughput services handling millions of requests per second, these optimizations add up.
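To give a rough feel for the size difference, the sketch below hand-encodes a two-field message in the Protocol Buffers wire format and compares it with compact JSON. The message shape (field 1 as an integer `id`, field 2 as a string `status`) is a made-up example, and real services would use code generated by `protoc` rather than a hand-rolled encoder like this one.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    # Field key = (field_number << 3) | wire_type; wire type 0 is varint.
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    # Wire type 2 is length-delimited: key, byte length, then the bytes.
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


# Hypothetical message: id = 150, status = "ok"
proto_bytes = encode_int_field(1, 150) + encode_string_field(2, "ok")
json_bytes = json.dumps({"id": 150, "status": "ok"}, separators=(",", ":")).encode("utf-8")

print(len(proto_bytes), "bytes as protobuf:", proto_bytes.hex(" "))
print(len(json_bytes), "bytes as JSON:", json_bytes.decode())
```

Field names never appear on the wire, only field numbers, which is where most of the saving comes from in small messages.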
Bi-directional streaming is gRPC’s most powerful feature. A client can send a stream of requests while receiving a stream of responses. This works for real-time data pipelines, collaborative editing, and monitoring systems that push continuous updates. REST has no equivalent without workarounds like WebSockets or server-sent events.
The catch is browser support. You cannot call gRPC from browser JavaScript directly. You need gRPC-Web, a client library paired with a proxy such as Envoy, to translate between the browser’s limited HTTP semantics and gRPC’s full capabilities. This is a real constraint if browser clients are part of your architecture.
Comparing REST and gRPC
The two approaches make different trade-offs. Here is how they compare on the factors that matter:
| Aspect | REST | gRPC |
|---|---|---|
| Serialization | JSON (human-readable) | Protocol Buffers (binary, compact) |
| Transport | HTTP/1.1 or HTTP/2 | HTTP/2 only |
| Schema Enforcement | None built in (optional via OpenAPI or JSON Schema) | Strong (generated code) |
| Browser Support | Native | Limited (needs grpc-web proxy) |
| Streaming | Server-sent events, polling | Native bi-directional streaming |
| Tooling | Universal | Requires code generation |
| Debugging | Plain text in logs | Binary needs decoders |
| Contract Evolution | Flexible but risky | Versioned schemas |
REST wins on debugging. You can paste a request into your terminal and read the response directly. gRPC payloads are binary. You need tooling to decode them. Early in development when you are iterating quickly, this matters.
gRPC wins on safety. Schema enforcement catches entire classes of bugs that JSON allows. When your CI pipeline fails because someone renamed a Protocol Buffer field, that is a feature. Runtime surprises are harder to debug than compile-time failures.
Browser clients tip the balance toward REST. If you need to call services from web browsers, gRPC requires additional infrastructure. Plan for this constraint early.
When to Use / When Not to Use Synchronous Communication
Trade-off Table
| Scenario | Use Synchronous | Use Asynchronous Instead |
|---|---|---|
| Need immediate consistency | REST or gRPC | Message queues, events |
| Operations complete in < 100ms | REST or gRPC | Consider async overhead |
| Long-running operations (seconds+) | Avoid sync | Webhooks, callbacks, polling |
| Multiple services in a chain | Add timeouts, circuit breakers | Decompose or use async |
| Fault isolation required | Avoid deep chains | Fire-and-forget events |
| High availability requirement | Add resilience patterns | Inherently more available |
| Cross-service transactions | Avoid, use sagas | Use saga pattern |
When to Use REST
Use REST when:
- Building public APIs consumed by external developers
- Browser-based clients need direct service access
- JSON schema flexibility is needed for API evolution
- Human readability matters for debugging
- Rapid prototyping and iteration are priorities
- Team lacks experience with code generation tools
Avoid REST when:
- You need bi-directional streaming
- Compile-time type safety is critical
- Message size and performance are paramount
- Internal services share contracts that would benefit from schema enforcement
When to Use gRPC
Use gRPC when:
- You control both ends of internal service-to-service communication
- High throughput and low latency are requirements
- Bi-directional streaming is needed (real-time pipelines, collaborative editing)
- You want compile-time contract enforcement
- Multiple languages need consistent client libraries
Avoid gRPC when:
- Browser clients need to call services directly
- Human debugging in transit is important
- JSON-based legacy integration is required
- Team lacks familiarity with Protocol Buffers
Request-Response Patterns
Synchronous calls follow patterns that determine how your services interact.
Point-to-point is the simplest case. One service calls another, waits, and continues. Fast operations that do not involve multiple services work well with this pattern.
Chained requests span multiple services in sequence. Service A calls Service B, which calls Service C. Latency accumulates across each hop. If any service slows down, the entire chain slows. If any service fails, the failure propagates back up. Deep call chains are fragile.
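As a back-of-the-envelope illustration of why deep chains are fragile, the sketch below adds up hop latencies and multiplies hop availabilities. The numbers are invented, and the availability product assumes hops fail independently, which real systems only approximate.

```python
import math


def chain_latency_ms(hop_latencies_ms):
    # In a sequential chain, each hop's latency adds to the total.
    return sum(hop_latencies_ms)


def chain_availability(hop_availabilities):
    # The chain succeeds only if every hop succeeds (independence assumed).
    return math.prod(hop_availabilities)


# Hypothetical three-hop chain: A -> B -> C
hops_ms = [40, 60, 100]
availabilities = [0.999, 0.999, 0.999]

print(f"total latency: {chain_latency_ms(hops_ms)} ms")
print(f"combined availability: {chain_availability(availabilities):.4%}")
```

Three services that are each "three nines" combine to something below three nines, and every additional hop pushes the figure lower.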
Scatter-gather fans out to multiple services simultaneously. A request goes to Service B, C, and D at the same time. The caller waits for all responses. This reduces total latency compared to chaining, but requires more infrastructure to manage the fan-out and handle partial failures.
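A minimal scatter-gather sketch with `asyncio`, assuming the downstream calls are coroutines. `fake_service` is a stand-in for a real HTTP or gRPC call, and the service names and delays are illustrative. Each call gets its own timeout, and failures are collected instead of aborting the whole fan-out.

```python
import asyncio


async def fake_service(name: str, delay_s: float, fail: bool = False) -> dict:
    # Stand-in for a real network call.
    await asyncio.sleep(delay_s)
    if fail:
        raise RuntimeError(f"{name} unavailable")
    return {"service": name, "ok": True}


async def scatter_gather(calls: dict, per_call_timeout_s: float):
    async def guarded(coro):
        # Bound each call so one slow service cannot stall the whole request.
        return await asyncio.wait_for(coro, timeout=per_call_timeout_s)

    names = list(calls)
    outcomes = await asyncio.gather(
        *(guarded(calls[name]) for name in names),
        return_exceptions=True,  # collect failures instead of raising the first one
    )
    results, errors = {}, {}
    for name, outcome in zip(names, outcomes):
        if isinstance(outcome, Exception):
            errors[name] = repr(outcome)
        else:
            results[name] = outcome
    return results, errors


async def main():
    calls = {
        "B": fake_service("B", 0.01),
        "C": fake_service("C", 0.01, fail=True),  # fails outright
        "D": fake_service("D", 0.5),              # slower than the timeout
    }
    return await scatter_gather(calls, per_call_timeout_s=0.1)


results, errors = asyncio.run(main())
print("succeeded:", sorted(results))
print("failed:", sorted(errors))
```

The caller then decides whether a partial result is acceptable or whether the whole request should fail.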
Understanding these patterns helps you design APIs that match your reliability requirements. Not every operation belongs in a long chain.
Synchronous Failure Flow
Synchronous systems fail in predictable but dangerous ways. Here is what happens when Service C experiences latency.
```mermaid
sequenceDiagram
    participant C as Client
    participant SA as Service A
    participant SB as Service B
    participant SC as Service C
    participant DB as Database

    C->>SA: GET /order/123
    SA->>SB: Verify customer
    SB->>SC: Check credit limit
    SC-->>SB: (slow) Waiting...
    SB-->>SA: Timeout after 5s
    SA-->>C: 504 Gateway Timeout

    Note over SC: Service C recovers

    C->>SA: (retry) GET /order/123
    SA->>SB: Verify customer
    SB->>SC: Check credit limit
    SC-->>SB: Credit OK
    SB-->>SA: Customer OK
    SA->>DB: Fetch order
    DB-->>SA: Order data
    SA-->>C: 200 OK
```
In this cascade failure, Service C slows down and causes timeouts all the way back to the client. The client eventually retries and succeeds, but only after experiencing a failure.
Circuit Breaker Failure Flow
Circuit breakers prevent cascade failures by failing fast when a downstream service is unhealthy.
```mermaid
stateDiagram-v2
    [*] --> Closed: Normal operation
    Closed --> Open: Failure threshold exceeded
    Open --> HalfOpen: Timeout expired
    HalfOpen --> Closed: Probe succeeds
    HalfOpen --> Open: Probe fails

    state Closed {
        [*] --> Normal
        Normal --> HighLatency: Slow responses
        HighLatency --> Normal: Latency recovers
        HighLatency --> Failing: Failure threshold
        Failing --> Normal: Recovery succeeds
    }
```
Circuit breakers wrap synchronous calls and monitor failure rates. When failures exceed a threshold, the circuit opens and calls fail immediately without hitting the unhealthy service. After a cooldown period, a probe call tests whether the service has recovered.
Timeouts and Retry Considerations
Networks fail. Services crash. Load spikes cause timeouts. Your synchronous code must handle these cases explicitly.
Every synchronous call needs a timeout. Without one, a slow service blocks your service indefinitely. Setting timeouts requires knowing your SLAs and typical response times. Too short and you fail incorrectly. Too long and you defeat the purpose of failing fast.
Start conservative and adjust based on production data. Monitor your p99 response times. If p99 is 200ms, a 500ms timeout gives room for spikes without waiting forever on genuine failures.
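One way to turn that rule of thumb into code is to derive the timeout from observed latencies. This is a sketch under stated assumptions: the samples are synthetic, the function names are illustrative, and the 2.5x headroom factor simply mirrors the 200ms to 500ms example above rather than being a general recommendation.

```python
import statistics


def p99(samples_ms):
    # quantiles(n=100) returns 99 cut points; the last one is the 99th percentile.
    return statistics.quantiles(samples_ms, n=100)[-1]


def suggest_timeout_ms(samples_ms, headroom=2.5):
    # Leave room above p99 for spikes without waiting indefinitely on real failures.
    return p99(samples_ms) * headroom


# Synthetic latency samples: mostly fast, with a slow tail.
samples = [50] * 950 + [150] * 40 + [200] * 10

print(f"p99: {p99(samples):.1f} ms")
print(f"suggested timeout: {suggest_timeout_ms(samples):.1f} ms")
```

Recomputing this periodically from production metrics keeps the timeout aligned with how the dependency actually behaves.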
Retries recover from transient failures, but they amplify problems if not handled carefully. Exponential backoff prevents overwhelming a struggling service. Circuit breakers stop retry storms when a service is genuinely down.
Retries are not free. They consume resources on both sides. They can turn one service’s problem into a system-wide outage. When you retry, the same request may execute multiple times. Idempotency is essential—the operation must produce the same result regardless of how many times it runs.
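A common way to make a retried operation idempotent is an idempotency key that the client generates once and reuses on every retry. The sketch below is a hypothetical in-memory version; `PaymentService` and its fields are invented for illustration, and a production system would need a persistent store with an atomic check-and-set.

```python
import uuid


class PaymentService:
    """Hypothetical service that deduplicates requests by idempotency key."""

    def __init__(self):
        self._results = {}      # idempotency key -> stored result
        self.charges_made = 0   # how many times the side effect really ran

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        # A repeated key returns the stored result without charging again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_made += 1
        result = {"charge_id": f"ch_{self.charges_made}", "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result


service = PaymentService()
key = str(uuid.uuid4())  # generated once by the client, reused on every retry

first = service.charge(key, 1999)
retry = service.charge(key, 1999)  # e.g. the first response was lost to a timeout

print(first == retry, service.charges_made)
```

The retry returns the original result and the customer is charged once, regardless of how many times the request is delivered.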
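```python
# Timeout and retry example
import httpx
from tenacity import retry, stop_after_attempt, wait_exponential


# Up to 3 attempts, with exponential waits clamped between 1s and 10s.
# As written, tenacity retries on any exception, including 4xx responses;
# real code would usually narrow this to transient errors.
@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=1, max=10))
async def call_service_with_retry(url: str) -> dict:
    # The 5-second timeout applies to each attempt, not to the whole retry sequence.
    async with httpx.AsyncClient(timeout=5.0) as client:
        response = await client.get(url)
        response.raise_for_status()
        return response.json()
```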
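Exponential backoff is often combined with random jitter so that many clients retrying at once do not stay synchronized. The following is a minimal sketch of the "full jitter" variant written against the standard library only; the function name, defaults, and injected `rng` parameter are illustrative choices, not part of any library API.

```python
import random


def backoff_delays(attempts: int, base_s: float = 1.0, cap_s: float = 10.0, rng=random):
    """Return one randomized delay per retry attempt (full jitter)."""
    delays = []
    for attempt in range(attempts):
        # Exponential ceiling: base, 2x base, 4x base, ... clamped at the cap.
        ceiling = min(cap_s, base_s * (2 ** attempt))
        # Full jitter: pick uniformly between zero and the ceiling.
        delays.append(rng.uniform(0, ceiling))
    return delays


# A seeded generator keeps the demo repeatable; production code would not seed.
for attempt, delay in enumerate(backoff_delays(5, rng=random.Random(42)), start=1):
    print(f"retry {attempt}: sleep {delay:.2f}s")
```

Spreading retries across the whole interval trades a slightly longer average wait for a much smoother load on the recovering service.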
Circuit Breaker Implementation
```python
import asyncio
import time
from enum import Enum


class CircuitState(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, timeout=60, recovery_timeout=30):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.timeout = timeout  # per-call timeout in seconds
        self.recovery_timeout = recovery_timeout  # seconds to stay open before probing
        self.failure_count = 0
        self.last_failure_time = None
        self.state = CircuitState.CLOSED

    async def call(self, func, *args, **kwargs):
        if self.state == CircuitState.OPEN:
            if time.monotonic() - self.last_failure_time > self.recovery_timeout:
                # Cooldown elapsed: let a probe call through.
                self.state = CircuitState.HALF_OPEN
            else:
                raise CircuitOpenError("Circuit breaker is OPEN")
        try:
            # Enforce the per-call timeout so a hung dependency counts as a failure.
            result = await asyncio.wait_for(func(*args, **kwargs), timeout=self.timeout)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self):
        self.failure_count = 0
        self.state = CircuitState.CLOSED

    def _on_failure(self):
        self.failure_count += 1
        self.last_failure_time = time.monotonic()
        # A failed probe in HALF_OPEN reopens the circuit immediately.
        if self.state == CircuitState.HALF_OPEN or self.failure_count >= self.failure_threshold:
            self.state = CircuitState.OPEN
```
Point-to-Point Client Implementation
```python
import asyncio
from typing import Optional

import httpx


class ServiceClient:
    def __init__(self, base_url: str, timeout: float = 5.0, max_retries: int = 3):
        if max_retries < 1:
            raise ValueError("max_retries must be at least 1")
        self.base_url = base_url.rstrip("/")
        self.timeout = timeout
        self.max_retries = max_retries

    async def get(self, path: str, params: Optional[dict] = None) -> dict:
        url = f"{self.base_url}/{path.lstrip('/')}"
        async with httpx.AsyncClient(timeout=self.timeout) as client:
            for attempt in range(self.max_retries):
                is_last_attempt = attempt == self.max_retries - 1
                try:
                    response = await client.get(url, params=params)
                    response.raise_for_status()
                    return response.json()
                except httpx.TimeoutException:
                    if is_last_attempt:
                        raise
                except httpx.HTTPStatusError as e:
                    # Retry server errors only; a 4xx response will not succeed on retry.
                    if e.response.status_code < 500 or is_last_attempt:
                        raise
                # Back off before the next attempt: 0.5s, 1s, 2s, ...
                await asyncio.sleep(0.5 * 2**attempt)

    async def post(self, path: str, json: Optional[dict] = None) -> dict:
        # POST is not retried here: it is not safe to repeat unless the API is idempotent.
        url = f"{self.base_url}/{path.lstrip('/')}"
        async with httpx.AsyncClient(timeout=self.timeout) as client:
            response = await client.post(url, json=json)
            response.raise_for_status()
            return response.json()
```
Common Pitfalls / Anti-Patterns
Synchronous communication couples services by availability and latency. That coupling has costs.
High latency operations do not fit synchronous patterns. If an operation takes seconds to complete, blocking the caller is impractical. A user staring at a spinner for thirty seconds is a bad experience. Asynchronous patterns—initiate the operation, poll for completion, or receive a callback—work better for long-running operations.
Cascade failures spread when one service failure propagates to others. Service A calls Service B, which calls Service C. If Service C slows down, Service B waits. Service A times out. Users see errors across the system even though only one service has a problem. Without circuit breakers and bulkheads, synchronous systems amplify failures.
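A bulkhead caps how many calls to one dependency may be in flight, so a slow service cannot absorb every worker. The sketch below uses an `asyncio.Semaphore` and rejects excess calls immediately instead of queueing them. The `Bulkhead` class and `slow_dependency` function are hypothetical names for illustration.

```python
import asyncio


class Bulkhead:
    """Limit concurrent calls to one dependency; reject when full."""

    def __init__(self, max_concurrent: int):
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self.rejected = 0

    async def run(self, func, *args):
        # Fail fast when all slots are taken rather than piling up waiters.
        if self._semaphore.locked():
            self.rejected += 1
            raise RuntimeError("bulkhead full")
        async with self._semaphore:
            return await func(*args)


async def slow_dependency(i: int) -> int:
    await asyncio.sleep(0.05)  # stand-in for a slow downstream call
    return i


async def main():
    bulkhead = Bulkhead(max_concurrent=2)  # created inside the running loop
    outcomes = await asyncio.gather(
        *(bulkhead.run(slow_dependency, i) for i in range(5)),
        return_exceptions=True,
    )
    return bulkhead, outcomes


bulkhead, outcomes = asyncio.run(main())
print("completed:", [o for o in outcomes if not isinstance(o, Exception)])
print("rejected:", bulkhead.rejected)
```

With one bulkhead per dependency, trouble in Service C consumes at most its own slots and leaves capacity for calls that do not involve it.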
Distributed transactions across multiple services are notoriously difficult with synchronous calls. When an operation spans services and must be atomic, synchronous rollback is messy. Asynchronous saga patterns handle this better, though they introduce their own complexity.
Loose coupling sometimes matters more than simplicity. If services need to evolve independently, adding a message broker decouples release cycles. Service A does not need to know when Service B deploys a new version. Asynchronous events let services communicate without direct knowledge of each other.
Evaluate these factors before defaulting to synchronous communication. The simplicity of request-response has hidden costs in the wrong scenarios.
Synchronous Request Flow
Here is what a synchronous request looks like in practice.
```mermaid
sequenceDiagram
    participant C as Client
    participant G as API Gateway
    participant S1 as Service A
    participant S2 as Service B
    participant DB as Database

    C->>G: HTTP Request
    G->>S1: Forward Request
    S1->>DB: Query Data
    DB-->>S1: Return Results
    S1->>S2: Call Service B
    S2-->>S1: Return Response
    S1-->>G: HTTP Response
    G-->>C: Return to Client
```
Each arrow in this diagram is a potential failure point and a source of latency. Monitoring helps identify where bottlenecks occur.
Production Failure Scenarios
Synchronous systems fail in predictable ways across different scales and contexts.
The Netflix Chaos Experiment: Netflix introduced controlled failure injection to expose brittleness. One missing timeout on a dependent service caused entire application regions to go offline. The fix was simple: enforce timeouts everywhere. The discovery that one unguarded call could cascade through hundreds of instances changed how they thought about synchronous dependencies entirely.
The AWS DynamoDB Latency Spike (2012): A single-digit percentage increase in latency on a foundational storage service caused request timeouts across AWS. Applications that had configured aggressive timeouts failed immediately. Applications with proper timeout strategies and retry backoff recovered gracefully. The difference was not the underlying failure but whether services trusted the downstream system blindly or protected themselves explicitly.
The 2019 GitHub outage: A scheduled maintenance window triggered a cascading failure when a dependent service took longer than expected to restart. Synchronous health checks waiting for full recovery blocked request processing. The lesson: synchronous initialization sequences create single points of failure even for services that appear independent.
The Banking API Timeout Cascade: A payment processing API exposed to third-party developers introduced stricter timeout enforcement after a partner service degraded. Partner applications with hardcoded 30-second timeout expectations began failing in ways that propagated to their downstream customers. The synchronous contract between systems had silently assumed infinite patience.
The IoT Device Storm: A firmware update caused IoT devices to begin sending heartbeat requests synchronously every second instead of every 30 seconds. Backend services receiving 30x normal request volume began timing out on unrelated operations. The synchronous nature of the health checks meant the recovery required coordinated updates across hundreds of thousands of devices before load normalized.
Quick Recap
```mermaid
graph LR
    Client -->|HTTP Request| Gateway
    Gateway -->|REST/gRPC| ServiceA
    ServiceA -->|Sync Call| ServiceB
    ServiceB -->|Response| ServiceA
    ServiceA -->|Response| Gateway
    Gateway -->|HTTP Response| Client
```
Key Points
- Synchronous communication provides immediate consistency but creates tight coupling
- REST offers human-readable debugging and universal browser support
- gRPC provides type safety, bi-directional streaming, and compile-time contract enforcement
- Always configure timeouts to prevent blocking on slow or failed services
- Circuit breakers prevent cascade failures from spreading through the call chain
- Retries amplify problems if not combined with idempotency and backoff
- Correlation IDs enable tracing requests through the entire call chain
When to Choose Synchronous
- Operations complete in under 100ms with predictable latency
- You need immediate consistency between services
- The call chain is shallow (2-3 services maximum)
- Your team can manage resilience patterns consistently
- Debugging simplicity outweighs loose coupling benefits
Production Checklist
# Synchronous Communication Production Readiness
- [ ] Timeouts configured for all outbound calls
- [ ] Retry logic with exponential backoff implemented
- [ ] Circuit breakers protecting downstream calls
- [ ] Correlation IDs propagated through call chains
- [ ] Health check endpoints on all services
- [ ] Structured logging with latency metrics
- [ ] Alerting configured for timeout and error rates
- [ ] Graceful degradation patterns in place
- [ ] Load shedding when downstream services are slow
- [ ] Request budgets limiting retry amplification
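The last checklist item, a request budget that limits retry amplification, can be sketched as a simple counter: retries are allowed only while they stay below a fixed percentage of total requests. The `RetryBudget` class, the 10% figure, and the always-allow-one floor are illustrative assumptions; production implementations typically track a sliding time window rather than lifetime totals.

```python
class RetryBudget:
    """Allow retries only up to a percentage of the requests seen so far."""

    def __init__(self, percent: int = 10, min_retries: int = 1):
        self.percent = percent
        self.min_retries = min_retries
        self.requests = 0
        self.retries = 0

    def record_request(self) -> None:
        self.requests += 1

    def can_retry(self) -> bool:
        # Integer arithmetic avoids floating-point rounding at the boundary.
        allowed = max(self.min_retries, self.requests * self.percent // 100)
        if self.retries < allowed:
            self.retries += 1
            return True
        return False


# Simulate an outage: 100 requests all fail and each one asks to retry.
budget = RetryBudget(percent=10)
granted = 0
for _ in range(100):
    budget.record_request()
    if budget.can_retry():
        granted += 1

print(f"retries granted: {granted} of 100 failed requests")
```

During a full outage the downstream service sees at most about 10% extra traffic from retries, instead of the 2x or 3x load that per-request retry loops can generate.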
Observability Hooks
Synchronous systems require explicit observability instrumentation. Unlike async systems where failures are queued, sync failures are immediate and visible.
Request Correlation
Every synchronous request should carry a correlation ID through the call chain.
```python
import uuid
from contextvars import ContextVar
from typing import Optional

import httpx

# Holds the correlation ID for the current request context.
correlation_id: ContextVar[str] = ContextVar("correlation_id", default="")


async def correlated_get(url: str, headers: Optional[dict] = None) -> httpx.Response:
    cid = correlation_id.get()
    if not cid:
        # No ID yet: this call starts a new chain.
        cid = str(uuid.uuid4())
        correlation_id.set(cid)
    # Pass the ID downstream so every service logs the same value.
    request_headers = {**(headers or {}), "X-Correlation-ID": cid}
    async with httpx.AsyncClient() as client:
        return await client.get(url, headers=request_headers)
```
Key Metrics to Track
| Metric | Purpose | Alert Threshold |
|---|---|---|
| Request latency p50/p95/p99 | Baseline performance | p99 > SLA |
| Error rate by endpoint | Service health | > 1% for 5min |
| Timeout rate | Downstream health | > 10% |
| Circuit breaker state | Resilience activation | OPEN state |
| Retry rate | Transient failures | > 20% |
Logging Structured Data
```python
import time
from contextvars import ContextVar

import structlog

logger = structlog.get_logger()

# In a real module this is the ContextVar from the request correlation example;
# it is repeated here so the snippet runs on its own.
correlation_id: ContextVar[str] = ContextVar("correlation_id", default="")


async def logged_service_call(service: str, operation: str, func, *args, **kwargs):
    start = time.perf_counter()
    # Use a distinct local name: assigning to `correlation_id` here would shadow
    # the ContextVar and raise UnboundLocalError.
    cid = correlation_id.get()
    logger.info(
        "service_call_started",
        service=service,
        operation=operation,
        correlation_id=cid,
    )
    try:
        result = await func(*args, **kwargs)
        duration = time.perf_counter() - start
        logger.info(
            "service_call_completed",
            service=service,
            operation=operation,
            duration_ms=int(duration * 1000),
            correlation_id=cid,
        )
        return result
    except Exception as e:
        duration = time.perf_counter() - start
        logger.error(
            "service_call_failed",
            service=service,
            operation=operation,
            duration_ms=int(duration * 1000),
            error=str(e),
            correlation_id=cid,
        )
        raise
```
Health Check Pattern
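```python
import httpx
from fastapi import FastAPI
from fastapi.responses import JSONResponse

app = FastAPI()

# Placeholder registry of downstream base URLs, keyed by service name.
# Populate it from configuration.
downstream_services: dict = {}


@app.get("/health")
async def health_check():
    checks = {}
    healthy = True
    # Check downstream services with a short timeout so the probe itself cannot hang.
    async with httpx.AsyncClient(timeout=2.0) as client:
        for service_name, service_url in downstream_services.items():
            try:
                response = await client.get(f"{service_url}/health")
                response.raise_for_status()  # a 5xx reply counts as down
                # response.elapsed is a timedelta; convert it to milliseconds.
                latency_ms = response.elapsed.total_seconds() * 1000
                checks[service_name] = {"status": "up", "latency_ms": round(latency_ms, 1)}
            except Exception as e:
                checks[service_name] = {"status": "down", "error": str(e)}
                healthy = False
    body = {"status": "healthy" if healthy else "unhealthy", "checks": checks}
    # Return 503 when unhealthy so load balancers and orchestrators can act on it.
    return JSONResponse(content=body, status_code=200 if healthy else 503)
```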
Interview Questions
What are the key differences between REST and gRPC?
Expected answer points:
- REST uses JSON serialization while gRPC uses Protocol Buffers (binary format)
- REST typically runs over HTTP/1.1 or HTTP/2, while gRPC requires HTTP/2 exclusively
- Protocol Buffers produce smaller messages and offer stronger schema enforcement
- REST payloads are human-readable; gRPC requires tooling to decode binary format
How does a circuit breaker work, and what are its states?
Expected answer points:
- Circuit breakers monitor failure rates on downstream service calls
- When failures exceed a threshold, the circuit "opens" and calls fail immediately without hitting the unhealthy service
- After a cooldown period, a probe call tests whether the service has recovered
- Three states: Closed (normal operation), Open (failing fast), Half-Open (testing recovery)
When would you choose REST over gRPC?
Expected answer points:
- When browser clients need to call services directly (gRPC requires grpc-web proxy)
- When human debugging in transit is important (JSON is readable, binary needs decoders)
- When JSON schema flexibility matters for gradual API evolution
- When teams lack experience with Protocol Buffers and code generation tools
- When debugging during early development iterations takes priority over performance
What is exponential backoff, and why does it matter for retries?
Expected answer points:
- Exponential backoff increases wait time between retry attempts (e.g., 1s, 2s, 4s, 8s)
- It prevents overwhelming a struggling service with rapid retry storms
- Without backoff, retries can amplify a problem and turn one service's issue into a system-wide outage
- Combined with jitter, it prevents synchronized retry waves from multiple clients
When should you prefer asynchronous communication over synchronous calls?
Expected answer points:
- Long-running operations where blocking the caller for seconds is impractical
- High availability requirements where loose coupling matters more than simplicity
- Cross-service transactions requiring atomicity (sagas handle this better async)
- Services that need to evolve independently without coordinating release cycles
- Deep call chains where latency accumulates across multiple hops
What is the difference between the point-to-point and scatter-gather patterns?
Expected answer points:
- Point-to-point: one service calls another directly, waits, and continues (simplest pattern)
- Scatter-gather: a request fans out to multiple services simultaneously and waits for all responses
- Scatter-gather reduces total latency compared to chaining but requires more infrastructure
- Scatter-gather must handle partial failures when some services respond while others fail
Why is idempotency important when retrying requests?
Expected answer points:
- When retrying, the same request may execute multiple times on the server
- If the operation is not idempotent, duplicate retries cause incorrect results (e.g., double charges)
- Idempotent operations produce the same result regardless of how many times they execute
- Techniques include deduplication keys, conditional updates, and checking state before acting
How does HTTP/2 multiplexing benefit gRPC compared with HTTP/1.1?
Expected answer points:
- HTTP/1.1 suffers from head-of-line blocking where each request must complete before the next starts
- HTTP/2 multiplexes multiple requests over a single connection simultaneously
- This eliminates the queuing bottleneck for high-throughput services handling millions of requests per second
- Protocol Buffers' smaller message sizes compound this advantage for bandwidth-constrained environments
What is a correlation ID, and why is it useful?
Expected answer points:
- A correlation ID is a unique identifier attached to a request and propagated through the entire call chain
- It enables tracing a single request as it moves through multiple services
- When a failure occurs, logs can be searched by correlation ID to reconstruct the full request path
- Without correlation IDs, debugging distributed synchronous failures is significantly harder
What are the trade-offs of setting timeouts too short or too long?
Expected answer points:
- Too short: services fail incorrectly during legitimate slow operations; causes unnecessary retry traffic
- Too long: slow services block callers indefinitely; defeats the purpose of failing fast
- Proper timeout requires knowing your SLAs and typical response times (monitor p99)
- Start conservative and adjust based on production data rather than guessing upfront
How would you design a resilient synchronous call chain?
Expected answer points:
- Combine timeouts, retries with exponential backoff, and circuit breakers in layers
- Implement bulkheads to isolate failures and prevent them from affecting unrelated operations
- Use fallback responses where possible (cached data, default values, degraded functionality)
- Apply load shedding when downstream services are slow to prevent system overload
- Propagate correlation IDs for distributed tracing across all calls
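The fallback point above can be sketched as a thin wrapper that serves the last known good value when a live call fails. `CachedFallback` and its `max_age_s` parameter are hypothetical names, and whether stale data is acceptable at all depends on the operation; this is an illustration, not a general-purpose cache.

```python
import time


class CachedFallback:
    """Serve the last successful value when the live call fails."""

    def __init__(self, max_age_s: float):
        self.max_age_s = max_age_s
        self._cache = {}  # key -> (value, stored_at)

    def call(self, key, fetch):
        try:
            value = fetch()
        except Exception:
            cached = self._cache.get(key)
            # Degrade gracefully only if a sufficiently fresh value exists.
            if cached and time.monotonic() - cached[1] <= self.max_age_s:
                return cached[0], "stale-cache"
            raise
        self._cache[key] = (value, time.monotonic())
        return value, "live"


def failing_fetch():
    raise RuntimeError("pricing service unavailable")


fallback = CachedFallback(max_age_s=60)
print(fallback.call("price:sku-1", lambda: 1999))   # live call succeeds and is cached
print(fallback.call("price:sku-1", failing_fetch))  # live call fails, cached value served
```

Returning the source alongside the value lets the caller decide whether to flag the response as degraded.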
What problems arise when long-running operations are handled synchronously?
Expected answer points:
- Long-running operations block caller threads, exhausting thread pools under high load
- Users experience poor UX with spinners or loading indicators for seconds or minutes
- Deep chains amplify latency: 5 services at 200ms each means 1 second total latency minimum
- Better alternatives: async patterns (webhooks, callbacks, polling), or break into smaller synchronous chunks
- If sync is unavoidable, use separate thread pools with bounded queues to prevent cascade effects
How do HTTP/1.1, HTTP/2, and HTTP/3 differ, and how does that affect service communication?
Expected answer points:
- HTTP/1.1: head-of-line blocking forces sequential requests; each connection handles one request at a time
- HTTP/2: multiplexes multiple requests over single connection; removes head-of-line blocking but still has TCP-level issues
- HTTP/3 (QUIC): runs over UDP, eliminates TCP handshake overhead, handles packet loss better (streams recover independently)
- gRPC uses HTTP/2 exclusively; REST works across all HTTP versions
- HTTP/3 reduces connection establishment time with 0-RTT resumption, improving mobile network performance
What role does an API gateway play in synchronous communication?
Expected answer points:
- API gateways aggregate multiple service calls into a single request, reducing client round-trips
- They handle cross-cutting concerns: authentication, authorization, rate limiting, request logging
- Gateways can transform protocols (REST to gRPC) and massage payloads for version compatibility
- They provide a single entry point for monitoring and enforcing API contracts
- Risk: gateway becomes a single point of failure; use redundancy and circuit breakers at the gateway level
How do you handle partial failures in a scatter-gather call?
Expected answer points:
- When fanning out to multiple services, some may succeed while others fail
- Timeout strategy: wait for all responses, or use a threshold (e.g., require 2 of 3 responses)
- Implement cancellation: stop waiting for stragglers once minimum threshold is met
- Consider partial results: return what succeeded and let caller decide how to handle failures
- Use separate circuit breakers per downstream call so one failure does not affect others
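The threshold and cancellation points above can be sketched with `asyncio.as_completed`: collect results as they arrive, stop once enough have succeeded, and cancel the stragglers. The `replica` coroutine, the names, and the delays are stand-ins for real calls, and the "2 of 4" quorum is an arbitrary example.

```python
import asyncio


async def replica(name: str, delay_s: float, fail: bool = False) -> str:
    # Stand-in for a real downstream call.
    await asyncio.sleep(delay_s)
    if fail:
        raise RuntimeError(f"{name} failed")
    return name


async def first_n_successes(coros, needed: int, overall_timeout_s: float):
    tasks = [asyncio.ensure_future(c) for c in coros]
    successes = []
    try:
        for finished in asyncio.as_completed(tasks, timeout=overall_timeout_s):
            try:
                successes.append(await finished)
            except asyncio.TimeoutError:
                break      # overall deadline hit: return what we have
            except Exception:
                continue   # one failed call does not end the fan-out
            if len(successes) >= needed:
                break      # quorum reached: stop waiting
    finally:
        # Cancel stragglers so they do not keep holding connections.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
    return successes


async def main():
    calls = [
        replica("A", 0.01),
        replica("B", 0.02, fail=True),
        replica("C", 0.03),
        replica("D", 1.0),  # straggler, cancelled once the quorum is met
    ]
    return await first_n_successes(calls, needed=2, overall_timeout_s=0.5)


quorum = asyncio.run(main())
print(quorum)
```

The caller receives fewer than `needed` results when the deadline passes first, so it still has to decide whether a partial answer is usable.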
Why might a team choose REST for internal services even though gRPC performs better?
Expected answer points:
- Debugging ease: JSON in logs is readable without special tooling; binary payloads require decoders
- Tooling ecosystem: curl, Postman, browser dev tools work immediately vs. code generation complexity
- Team familiarity: less onboarding friction when everyone understands REST conventions
- HTTP/1.1 compatibility: REST works with proxies and load balancers that don't support HTTP/2
- Performance gains from gRPC may not justify the operational complexity for low-throughput services
What security concerns apply to synchronous service-to-service calls?
Expected answer points:
- Services trust each other by default; compromised service can attack other services directly
- Use mTLS (mutual TLS) to authenticate both ends of the connection
- Implement service-to-service authorization (not just authentication) to limit blast radius
- Rate limiting between services prevents one service from overwhelming another
- Correlation IDs enable audit trails but must not expose sensitive data in headers
How does synchronous communication affect connection and thread pool usage?
Expected answer points:
- Each synchronous call holds a connection (or thread) while waiting for downstream response
- Deep call chains compound this: Service A holds connection, calls Service B which queries DB, multiplying pool pressure
- Connection pool exhaustion causes cascading failures across services
- Async I/O allows more concurrent requests per thread; blocking I/O requires larger pools or fewer concurrent calls
- Load testing with realistic concurrency patterns is essential before production deployment
How does distributed tracing work in synchronous systems?
Expected answer points:
- Distributed tracing tracks a request across service boundaries using correlation IDs or trace context
- In sync systems, trace context propagates inline through each call; async uses message headers
- Tools like Jaeger, Zipkin, AWS X-Ray visualize the full call chain and identify bottlenecks
- Trace data helps determine if latency is at the network, service processing, or downstream call level
- Sampling strategies balance observability overhead against storage costs for high-volume systems
Which metrics matter most for monitoring synchronous communication?
Expected answer points:
- p99 latency: captures the tail latency that drives users away; p50 hides outliers
- Error rate by endpoint: distinguishes temporary blips from systematic failures
- Timeout rate: early warning that downstream services are degrading
- Circuit breaker state transitions: indicates resilience patterns activating (should not be frequent)
- Retry rate: high retry rates signal underlying instability; exceeding 20% warrants investigation
Further Reading
- RESTful API Design — Principles for designing intuitive REST APIs
- API Gateway Patterns — How API gateways handle routing, authentication, and rate limiting
- Circuit Breaker Pattern — Preventing cascade failures in distributed systems
- Resilience Patterns — Comprehensive guide to building fault-tolerant systems
Conclusion
Synchronous communication is not obsolete. It is the right tool for problems where simplicity and immediate consistency matter more than loose coupling. REST remains the standard for external APIs and browser-facing services. gRPC delivers performance and type safety for internal service communication where you control both ends.
Most organizations end up using both. External-facing APIs in REST, internal gRPC for service-to-service calls. The key is matching the protocol to the constraints of each interaction.
Build resilience into synchronous systems from the start. Timeouts, retries, and circuit breakers are not optional add-ons. Without them, small failures become large outages.