GraphQL vs REST: Choosing the Right API Paradigm

Compare GraphQL and REST APIs, understand when to use each approach, schema design, queries, mutations, and trade-offs between the two paradigms.

published: March 22, 2026 reading time: 36 min read author: GeekWorkBench

Introduction

GraphQL came out of Facebook in 2012 and reached open source in 2015. The pitch: instead of multiple endpoints returning fixed data structures, you get one endpoint where the client asks for exactly what it needs.

That sounds simple, but it changes how you think about APIs. REST is resource-oriented — you fetch what the server decides. GraphQL is query-oriented — the client decides. Each model has strengths and trade-offs.

graph LR
    A[Client] -->|"POST /graphql"| B[GraphQL Server]
    B --> C[Schema]
    C --> D[Resolvers]
    D --> E[Data Sources]

The client specifies exactly what data it needs in the query. The server returns exactly that data, nothing more.
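As a rough illustration of that idea (not how a real GraphQL executor works internally), the response can be thought of as the query's selection applied to the server's data. The `selectFields` helper and the sample `user` object below are hypothetical names used only for this sketch.

```javascript
// Hypothetical helper: copy only the fields named in `selection` from `source`.
// `selection` maps a field name to `true` (leaf) or to a nested selection.
function selectFields(source, selection) {
  if (Array.isArray(source)) {
    return source.map((item) => selectFields(item, selection));
  }
  const result = {};
  for (const [field, sub] of Object.entries(selection)) {
    const value = source[field];
    if (sub === true || value === null || value === undefined) {
      result[field] = value === undefined ? null : value;
    } else {
      result[field] = selectFields(value, sub);
    }
  }
  return result;
}

// Sample data the server holds (made up for this example)
const user = {
  id: "123",
  name: "Ada",
  email: "ada@example.com",
  posts: [{ id: "1", title: "Hello", body: "..." }],
};

// Equivalent in spirit to: { user { name posts { title } } }
const response = selectFields(user, { name: true, posts: { title: true } });
console.log(JSON.stringify(response));
```

Fields that were not selected (`id`, `email`, `body`) never leave the server, which is the property the rest of this article relies on.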

Core Concepts

REST: Multiple Endpoints

REST uses different endpoints for different resources:

# REST: Multiple endpoints
GET /users/123
GET /users/123/posts
GET /users/123/followers

Each endpoint returns a fixed data structure. If you need a user’s posts with follower counts, you might need multiple requests or get more data than necessary.

GraphQL: Single Endpoint

GraphQL uses a single endpoint with flexible queries:

# GraphQL: Single endpoint, flexible query
POST /graphql

query {
  user(id: 123) {
    name
    posts {
      title
    }
    followersCount
  }
}

One request gets exactly the data you need.

Query Patterns

Queries

REST Queries

# REST: Get user and their posts
GET /users/123
GET /users/123/posts

GraphQL Queries

# GraphQL: One request, precise data
query GetUserWithPosts($userId: ID!) {
  user(id: $userId) {
    name
    email
    posts {
      title
      createdAt
    }
  }
}

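GraphQL lets the server resolve sibling fields of a query concurrently: resolvers that return promises can overlap, so independent fields do not wait on each other. Top-level mutation fields are the exception and run one after another, in the order written.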
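The difference between concurrent and serial field execution can be sketched without any GraphQL library. The helper names below (`makeResolver`, `resolveConcurrently`, `resolveSerially`) are hypothetical; a real executor does considerably more, but the scheduling idea is the same: query fields may overlap, while top-level mutation fields run one at a time.

```javascript
// Hypothetical resolvers: each returns a promise and records when it starts and ends.
const log = [];
const makeResolver = (name, ms) => async () => {
  log.push(`start ${name}`);
  await new Promise((resolve) => setTimeout(resolve, ms));
  log.push(`end ${name}`);
  return name;
};

// Query-style execution: start every field, then wait for all of them.
async function resolveConcurrently(fields) {
  const entries = Object.entries(fields);
  const values = await Promise.all(entries.map(([, resolve]) => resolve()));
  return Object.fromEntries(entries.map(([key], i) => [key, values[i]]));
}

// Mutation-style execution: run fields one at a time, in order.
async function resolveSerially(fields) {
  const result = {};
  for (const [key, resolve] of Object.entries(fields)) {
    result[key] = await resolve();
  }
  return result;
}
```

With the concurrent version, total latency is roughly that of the slowest field; with the serial version it is the sum of all fields.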

Mutation Patterns

Mutations

REST Mutations

# REST: Create, update, delete with different HTTP methods
POST /users
PUT /users/123
DELETE /users/123

GraphQL Mutations

# GraphQL: Mutations are explicit
mutation CreateUser($input: CreateUserInput!) {
  createUser(input: $input) {
    id
    name
    email
  }
}

mutation UpdateUser($id: ID!, $input: UpdateUserInput!) {
  updateUser(id: $id, input: $input) {
    id
    name
    email
  }
}

mutation DeleteUser($id: ID!) {
  deleteUser(id: $id)
}

Subscription Architecture

GraphQL Subscriptions Deep Dive

GraphQL has native Subscriptions for real-time updates, unlike REST which relies on polling or webhooks.

Subscription Protocol

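Subscriptions usually run over WebSockets, although the GraphQL specification does not mandate a transport and Server-Sent Events are also used. The client subscribes once, and the server pushes updates: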

sequenceDiagram
    Client->>Server: subscription operation (WebSocket)
    Server-->>Client: Confirm subscription
    Server->>Database: Watch for changes
    Database-->>Server: New data event
    Server-->>Client: Push: { data: { postCreated: {...} } }
    Server->>Database: Continue watching
    Database-->>Server: Another event
    Server-->>Client: Push: { data: { postCreated: {...} } }

Schema Definition

type Subscription {
  postCreated: Post!
  postUpdated(id: ID!): Post!
  userJoined(roomId: ID!): User!
}

type Mutation {
  createPost(input: CreatePostInput!): Post!
  updatePost(id: ID!, input: UpdatePostInput!): Post!
  joinRoom(roomId: ID!): Room!
}

type Query {
  posts: [Post!]!
}

Subscription Resolver Implementation

import { PubSub } from 'graphql-subscriptions';

const pubsub = new PubSub();

// Define event names as constants
const POST_CREATED = 'POST_CREATED';
const POST_UPDATED = 'POST_UPDATED';

// Resolvers
const resolvers = {
  Subscription: {
    postCreated: {
      subscribe: () => pubsub.asyncIterator([POST_CREATED]),
    },
    postUpdated: {
      subscribe: (_, { id }) => pubsub.asyncIterator(`${POST_UPDATED}_${id}`),
    },
  },
  Mutation: {
    // Resolvers that use await must be declared async
    createPost: async (_, { input }, { pubsub }) => {
      const post = await db.posts.create(input);

      // Publish the event to all subscribers
      pubsub.publish(POST_CREATED, { postCreated: post });

      return post;
    },
    updatePost: async (_, { id, input }, { pubsub }) => {
      const post = await db.posts.update(id, input);

      // Publish to specific post subscribers
      pubsub.publish(`${POST_UPDATED}_${id}`, { postUpdated: post });

      return post;
    },
  },
};

Subscription Production Patterns

Filtering Subscriptions

Not all clients should receive all updates. Filter by authorization, room membership, or other criteria:

type Subscription {
  # Only gets events the user is authorized to see
  documentUpdated(documentId: ID!): Document!
}

const resolvers = {
  Subscription: {
    documentUpdated: {
      subscribe: async function* (_, { documentId }, context) {
        // Check authorization
        if (!await context.user.canView(documentId)) {
          throw new Error('Not authorized');
        }

        // Create async generator that yields when matching events
        const eventEmitter = context.documentEvents.filter(
          event => event.documentId === documentId
        );

        for await (const event of eventEmitter) {
          yield { documentUpdated: event.document };
        }
      },
    },
  },
};
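The filtering step on its own can be sketched as a small async generator that wraps any async iterable. The names `filterEvents`, `sampleEvents`, and `collect` are hypothetical and stand in for a real pub/sub iterator; this is a minimal sketch, not a production subscription layer.

```javascript
// Wrap any async iterable and yield only the events that pass `predicate`.
async function* filterEvents(source, predicate) {
  for await (const event of source) {
    if (predicate(event)) {
      yield event;
    }
  }
}

// Hypothetical event source standing in for a real pub/sub iterator.
async function* sampleEvents() {
  yield { documentId: "a", title: "first" };
  yield { documentId: "b", title: "second" };
  yield { documentId: "a", title: "third" };
}

// Small helper that drains an async iterable into an array.
async function collect(iterable) {
  const out = [];
  for await (const item of iterable) out.push(item);
  return out;
}

// Usage: only events for document "a" reach the subscriber.
collect(filterEvents(sampleEvents(), (e) => e.documentId === "a")).then((events) =>
  console.log(events.map((e) => e.title)),
);
```

The graphql-subscriptions package also ships a `withFilter` helper that serves the same purpose for its own iterators.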

Production Subscription Architecture

graph TB
    Client1[WebSocket Client] --> LB[Load Balancer]
    Client2[WebSocket Client] --> LB
    Client3[WebSocket Client] --> LB
    LB --> Server1[GraphQL Server 1]
    LB --> Server2[GraphQL Server 2]
    Server1 --> RedisPubSub[Redis Pub/Sub]
    Server2 --> RedisPubSub
    RedisPubSub --> MessageBroker[Redis / RabbitMQ]
    MessageBroker --> Server1
    MessageBroker --> Server2

For multi-server deployments, use Redis Pub/Sub or a message broker:

import { RedisPubSub } from "graphql-redis-subscriptions";

const pubsub = new RedisPubSub({
  connection: {
    host: process.env.REDIS_HOST,
    port: 6379,
    retryStrategy: (times) => Math.min(times * 50, 2000),
  },
});

// Use Redis-backed pub/sub for horizontal scaling
const resolvers = {
  Subscription: {
    postCreated: {
      subscribe: () => pubsub.asyncIterator([POST_CREATED]),
    },
  },
};

Subscription Gotchas and Mitigations

Issue	Problem	Mitigation
Memory leaks	Subscriptions hold server resources indefinitely	Cap subscriptions per connection; close idle connections; implement client heartbeat
Reconnection storms	Clients reconnect in burst after outage	Implement exponential backoff; deduplicate on reconnect
Authorization drift	User loses access but keeps subscription	Re-validate authorization periodically; emit “kicked” event
Query complexity	Subscription queries are as complex as regular queries	Apply same complexity limits; consider simplified subscription payloads

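// The PubSub class from graphql-subscriptions has no built-in subscription
// limit or event TTL, so enforce limits in application code.
// Sketch: cap subscriptions per connection (connectionState is your own
// per-WebSocket object, e.g. { activeSubscriptions: 0 }).
const MAX_SUBSCRIPTIONS_PER_CONNECTION = 100;

function trackSubscription(connectionState) {
  if (connectionState.activeSubscriptions >= MAX_SUBSCRIPTIONS_PER_CONNECTION) {
    throw new Error("Too many subscriptions on this connection");
  }
  connectionState.activeSubscriptions += 1;

  // Call the returned function when the subscription completes or the socket closes
  return () => {
    connectionState.activeSubscriptions -= 1;
  };
}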
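For the reconnection-storm row in the table above, a client-side backoff schedule might look like the following. This is a sketch under assumptions: the function name `reconnectDelay` and the default values are made up for illustration, and the "full jitter" strategy is one common choice among several.

```javascript
// Delay before reconnect attempt number `attempt` (0-based), in milliseconds.
// `random` is injectable so the function can be tested deterministically.
function reconnectDelay(attempt, { baseMs = 500, maxMs = 30000, random = Math.random } = {}) {
  // Double the ceiling on every attempt, up to maxMs
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  // Full jitter: pick a random delay below the ceiling so clients spread out
  return Math.floor(random() * ceiling);
}

// Usage: schedule the next reconnect attempt
// setTimeout(connect, reconnectDelay(attemptNumber));
```

Randomizing the delay matters as much as growing it: without jitter, every client that dropped at the same moment retries at the same moment.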

When to Use Subscriptions vs Polling vs Webhooks

Approach	Latency	Scalability	Use Case
Subscriptions	Instant	Medium	UI updates, live collaboration
Polling	Poll interval	High	Infrequent updates, simple clients
Webhooks	Near-instant	High	Cross-service communication
Server-Sent Events	Near-instant	Medium	One-way server push, simpler than WS

For most GraphQL use cases: subscriptions for real-time UI, REST webhooks for cross-service events.

Persisted Queries

Persisted Queries & Query Whitelisting

By default, GraphQL accepts any query string sent by clients. This flexibility is powerful but creates security and performance problems. Persisted queries solve both.

The Problem with Dynamic Queries

Every GraphQL request sends the full query string:

# Every request - even identical ones - sends the full query
POST /graphql
{
  "query": "query GetUser($id: ID!) { user(id: $id) { name email posts { title } } }",
  "variables": { "id": "123" }
}

This means:

Security: Attackers can send complex or malicious queries
Performance: Server parses and validates the same queries repeatedly
Bandwidth: Large queries consume unnecessary network overhead

How Persisted Queries Work

Instead of sending the full query, clients send a hash that references a pre-registered query:

# Instead of full query...
POST /graphql
{ "query": "{ user(id: 123) { name email } }" }

# Client sends query ID (SHA-256 hash)
POST /graphql
{ "extensions": { "persistedQuery": { "version": 1, "sha256Hash": "a1b2c3d4e5f6..." } } }

Server looks up the hash, executes the pre-validated query.
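The lookup flow can be sketched in a few lines. This is a simplified model, assuming an in-memory `Map` as the registry and an injected `hashFn`; it also includes the fallback used by Apollo's automatic persisted queries, where a client resends the hash together with the full query after a miss. `PERSISTED_QUERY_NOT_FOUND` is the code Apollo uses for that miss, while `HASH_MISMATCH` is an illustrative label.

```javascript
// Minimal sketch of server-side persisted-query resolution.
// `registry` is a Map of hash -> query text; `hashFn` hashes a query string.
function resolvePersistedQuery(request, registry, hashFn) {
  const hash = request.extensions?.persistedQuery?.sha256Hash;

  if (!hash) {
    // Plain request: use the query text as sent
    return { query: request.query };
  }
  if (request.query === undefined) {
    // Hash only: look it up, or ask the client to resend with the full query
    return registry.has(hash)
      ? { query: registry.get(hash) }
      : { error: "PERSISTED_QUERY_NOT_FOUND" };
  }
  // Hash + query: verify they match before storing the pair
  if (hashFn(request.query) !== hash) {
    return { error: "HASH_MISMATCH" };
  }
  registry.set(hash, request.query);
  return { query: request.query };
}
```

In strict allowlist mode the last branch would be removed, so that only hashes registered at build time are ever accepted.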

Apollo Server Implementation

import { ApolloServer } from "@apollo/server";
import { InMemoryLRUCache } from "@apollo/utils.keyvaluecache";

// Apollo Server supports Automatic Persisted Queries (APQ) out of the box:
// 1. The client sends only the SHA-256 hash of the query.
// 2. If the hash is unknown, the server answers PERSISTED_QUERY_NOT_FOUND.
// 3. The client retries with hash + full query, and the server caches the pair.
const server = new ApolloServer({
  typeDefs,
  resolvers,

  persistedQueries: {
    // Where hash -> query pairs are stored. Use a shared KeyValueCache
    // (for example Redis-backed) when running more than one instance.
    cache: new InMemoryLRUCache(),
    ttl: 3600, // seconds before a cached query expires
  },
});

// APQ on its own is a bandwidth optimization, not an allowlist: any client
// can still register a new query. For strict allowlisting, see
// "Query Whitelisting (Strict Mode)".

Building the Query Registry

// Build-time: generate registry from client queries
// (run as part of your CI/CD build)

import fs from "node:fs";
import { createHash } from "node:crypto";
import { globSync } from "glob"; // glob v9+; older versions expose glob.sync
import { parse, print } from "graphql";

const QUERIES_DIR = "./src/queries";
const REGISTRY_FILE = "./query-registry.json";

function buildQueryRegistry() {
  const registry = {};

  // Find all .graphql files
  const queryFiles = globSync(`${QUERIES_DIR}/**/*.graphql`);

  for (const file of queryFiles) {
    const content = fs.readFileSync(file, "utf-8");
    const document = parse(content); // throws on invalid syntax, failing the build

    // Hash the printed document. The client must hash the exact same string,
    // so keep the printing/normalization step identical on both sides.
    const query = print(document);
    const hash = createHash("sha256").update(query).digest("hex");

    // Collect operation names for logging and per-operation rate limiting
    const operationNames = document.definitions
      .filter((definition) => definition.kind === "OperationDefinition")
      .map((definition) => definition.name?.value || "anonymous");

    registry[hash] = { operationNames, file, query };
  }

  // Write registry to file (upload to CDN/deployment)
  fs.writeFileSync(REGISTRY_FILE, JSON.stringify(registry, null, 2));

  console.log(`Registered ${Object.keys(registry).length} queries`);
}

buildQueryRegistry();

Client-Side Query Integration

Client-Side Integration

// Apollo Client 3
import { ApolloClient, InMemoryCache, HttpLink } from "@apollo/client";
import { createPersistedQueryLink } from "@apollo/client/link/persisted-queries";
import { sha256 } from "crypto-hash";

// Sends the SHA-256 hash first and falls back to hash + full query
// when the server does not recognize the hash
const persistedLink = createPersistedQueryLink({ sha256 });

const httpLink = new HttpLink({ uri: "/graphql" });

const client = new ApolloClient({
  cache: new InMemoryCache(),
  // The persisted-query link must come before the terminating HTTP link
  link: persistedLink.concat(httpLink),
});

Query Whitelisting (Strict Mode)

For maximum security, reject any query not in your pre-approved registry:

// Server-side: only allow registered queries
import { createHash } from "node:crypto";
import { GraphQLError } from "graphql";

// queryRegistry: Map of hash -> entry, loaded from the build-time registry
const server = new ApolloServer({
  // ... other config

  plugins: [
    {
      async requestDidStart() {
        return {
          // Runs after parsing and validation, before any resolver executes.
          // `source` is the query text, whether it was sent in full or
          // looked up from the persisted-query cache.
          async didResolveOperation({ source }) {
            const hash = createHash("sha256").update(source).digest("hex");

            // CRITICAL: reject everything that is not in the registry
            if (!queryRegistry.has(hash)) {
              throw new GraphQLError("Query is not in the allowlist", {
                extensions: { code: "QUERY_NOT_ALLOWED" }, // illustrative code name
              });
            }
          },
        };
      },
    },
  ],
});

Benefits Summary

Benefit	Without Persisted Queries	With Persisted Queries
Security	Full query injection risk	Only pre-approved queries run
Parse overhead	Every request parsed	Parsed once at build time
Network	Full query string each request	64-char hash only
CDN caching	Not cacheable	Persisted queries can use GET
Rate limiting	Hard to fingerprint	Per-query rate limits possible

When to Use

Use Case	Recommendation
Public API	Required—prevent abuse
Mobile apps	Highly recommended—bandwidth savings
Internal tools	Optional—still helps with parsing overhead
Development	Skip—full queries for flexibility

Data Fetching and N+1 Problem

Data Fetching

Overfetching and Underfetching

REST often leads to overfetching (getting more data than needed) or underfetching (needing multiple requests):

# REST: Gets more data than needed
GET /users/123
# Returns: { id, name, email, created_at, updated_at, profile_url, bio, ... }

# REST: Multiple requests for related data
GET /users/123          # User info
GET /users/123/posts    # User's posts
GET /posts/456/comments # Comments for a specific post

GraphQL solves both problems:

# GraphQL: Exact data
query {
  user(id: 123) {
    name # Only what you need
    posts {
      title # Only what you need
    }
  }
}

N+1 Problem

GraphQL can suffer from the N+1 problem: fetching a list of users, then making a separate request for each user’s posts:

# This could trigger many database queries
query {
  users {
    name
    posts {
      title # Triggers query for each user's posts
    }
  }
}

DataLoader solves this by batching requests.

DataLoader Patterns Deep Dive

The N+1 problem is GraphQL’s most notorious performance pitfall. DataLoader is Facebook’s official solution—a batching and caching library that coalesces multiple requests into fewer database queries.

DataLoader Deep Dive

How DataLoader Works

DataLoader queues the individual load() calls made while resolvers run within a single tick of the event loop, then dispatches them together as one call to your batch function.

import DataLoader from "dataloader";

// Create a batch function that fetches users by IDs
const userLoader = new DataLoader(async (ids) => {
  // This runs once for all pending user lookups
  const users = await db.users.findMany({ where: { id: { in: ids } } });

  // DataLoader expects results in the same order as input IDs
  const userMap = new Map(users.map((u) => [u.id, u]));
  return ids.map((id) => userMap.get(id) || null);
});

// In your resolver
const resolvers = {
  User: {
    posts: (user, args, context) => context.postLoader.load(user.id),
  },
  Post: {
    author: (post, args, context) => context.userLoader.load(post.authorId),
  },
};
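To make the batching mechanism concrete, here is a minimal sketch of the idea. It is not the DataLoader implementation (which also caches, deduplicates keys, and schedules more carefully); `createBatcher` is a hypothetical name used only for illustration.

```javascript
// Minimal batching sketch: collect keys requested in the same tick,
// then call `batchFn` once with all of them.
function createBatcher(batchFn) {
  let queue = [];

  return function load(key) {
    return new Promise((resolve, reject) => {
      queue.push({ key, resolve, reject });
      if (queue.length === 1) {
        // First key of this tick: schedule a single dispatch
        queueMicrotask(async () => {
          const batch = queue;
          queue = [];
          try {
            const values = await batchFn(batch.map((item) => item.key));
            // Results must come back in the same order as the keys
            batch.forEach((item, i) => item.resolve(values[i]));
          } catch (error) {
            batch.forEach((item) => item.reject(error));
          }
        });
      }
    });
  };
}

// Usage: three load() calls made synchronously produce one batch call
const batchCalls = [];
const loadUser = createBatcher(async (ids) => {
  batchCalls.push(ids);
  return ids.map((id) => ({ id, name: `user-${id}` }));
});
```

Resolvers keep calling `load(id)` one key at a time, and the batching stays invisible to them. That is why DataLoader can be added to an existing schema without changing resolver signatures.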

Batching vs Caching

DataLoader provides two distinct benefits:

// Batching: posts for many users requested in the same query
// Query: { users { posts { title } } }
// Only ONE batch call to posts for ALL users
const postLoader = new DataLoader(async (userIds) => {
  // Single query: SELECT * FROM posts WHERE authorId IN (...)
  const allPosts = await Post.findMany({ authorId: { in: userIds } });

  // Group by authorId
  const postsByAuthor = allPosts.reduce((acc, post) => {
    (acc[post.authorId] ||= []).push(post);
    return acc;
  }, {});

  return userIds.map((id) => postsByAuthor[id] || []);
});

// Caching: same key requested multiple times while resolving one query
// Query: { posts { author { name } } }
// When several posts share an author, userLoader.load(authorId) fetches that
// user once; later calls with the same id are served from the loader's cache

Common DataLoader Gotchas

1. Cache misses from inconsistent key types

// Problem: DataLoader stores results in a Map, so "123" and 123 are
// different keys: the same user is fetched and cached twice
const loader = new DataLoader((keys) => batchLoad(keys));

loader.load("123"); // string key
loader.load(123); // number key: separate cache entry, second database hit

// Fix: normalize keys to one type
const normalizedLoader = new DataLoader(
  (keys) => batchLoad(keys.map((k) => String(k))),
  { cacheKeyFn: (k) => String(k) },
);

2. Memoization vs database freshness

// DataLoader memoizes for the lifetime of the loader instance, so create a
// new loader per request to avoid serving stale data across requests.
// To cache across requests, put a TTL cache in front of the batch function
// (in-process Map shown here; use Redis/Memcached to share across servers).

const createTtlLoader = (batchFn, ttlMs = 60000) => {
  const cache = new Map(); // key -> { value, timestamp }

  return new DataLoader(
    async (keys) => {
      const now = Date.now();

      // Keys whose cached entry is missing or expired
      const uncachedKeys = keys.filter((key) => {
        const entry = cache.get(key);
        return !entry || now - entry.timestamp >= ttlMs;
      });

      // Batch load only the uncached keys and store the results
      if (uncachedKeys.length > 0) {
        const loaded = await batchFn(uncachedKeys);
        loaded.forEach((value, i) => {
          cache.set(uncachedKeys[i], { value, timestamp: now });
        });
      }

      // Return values in the same order as the requested keys
      return keys.map((key) => cache.get(key).value);
    },
    // Disable DataLoader's own memoization; the TTL cache replaces it
    { cache: false },
  );
};

3. Handling partial failures in batches

// Some IDs fail, some succeed - handle gracefully
const userLoader = new DataLoader(async (ids) => {
  const users = await User.findMany({ where: { id: { in: ids } } });
  const userMap = new Map(users.map((u) => [u.id, u]));

  return ids.map((id) => {
    const user = userMap.get(id);
    if (!user) {
      // Return Error for missing, not null
      // This preserves the error in GraphQL response
      return new Error(`User ${id} not found`);
    }
    return user;
  });
});

DataLoader with Different Data Sources

// REST API as a data source
const remoteServiceLoader = new DataLoader(async (ids) => {
  const responses = await Promise.all(
    ids.map((id) => fetch(`/api/users/${id}`).then((r) => r.json())),
  );
  return responses;
});

// MongoDB with aggregation pipeline
const mongoLoader = new DataLoader(async (objectIds) => {
  const results = await User.aggregate([
    { $match: { _id: { $in: objectIds } } },
    {
      $lookup: {
        from: "posts",
        localField: "_id",
        foreignField: "authorId",
        as: "posts",
      },
    },
  ]);

  const resultMap = new Map(results.map((r) => [r._id.toString(), r]));
  return objectIds.map((id) => resultMap.get(id.toString()) || null);
});

Error Handling

Error Handling Overview

REST Errors

REST uses HTTP status codes:

HTTP/1.1 404 Not Found
Content-Type: application/json

{"error": "User not found"}

GraphQL Errors

GraphQL servers conventionally return 200 OK even when the operation fails, as long as the HTTP request itself was well formed. Errors are reported in the response body:

{
  "data": null,
  "errors": [
    {
      "message": "User not found",
      "locations": [{ "line": 3, "column": 5 }],
      "path": ["user"]
    }
  ]
}

This is controversial. Some prefer HTTP status codes for errors.
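On the client, this means the HTTP status alone is not enough to detect failure. A small helper along these lines is a common pattern; `unwrapGraphQLResponse` is a hypothetical name, and the policy it encodes (throw only when no data came back) is one reasonable choice rather than a standard.

```javascript
// Hypothetical client helper: inspect `errors` even when the HTTP status is 200.
function unwrapGraphQLResponse(body) {
  const errors = body.errors ?? [];
  if (errors.length > 0 && (body.data === null || body.data === undefined)) {
    // Nothing usable came back: surface the first error
    throw new Error(errors[0].message);
  }
  // Full or partial success: return the data together with any field-level errors
  return { data: body.data, errors };
}

// Usage with a partial response: one field failed, the rest is usable
const partial = unwrapGraphQLResponse({
  data: { user: { name: "Ada", posts: null } },
  errors: [{ message: "Posts service unavailable", path: ["user", "posts"] }],
});
console.log(partial.errors.length);
```

Partial responses are a distinctive GraphQL feature: a REST endpoint either succeeds or fails as a whole, whereas a GraphQL response can carry both usable data and a list of field-level errors.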

Caching Strategies

Effective caching is critical for performance in both REST and GraphQL, but each requires different strategies.

REST Caching

REST works well with HTTP caching:

GET /users/123
Cache-Control: max-age=3600
ETag: "v1"

CDNs, browser caches, and libraries like React Query handle REST caching well.
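The server side of ETag-based revalidation can be sketched as a pure function. This is a simplification under assumptions: `conditionalGet` is a hypothetical helper, header names are assumed to be lowercased, and real `If-None-Match` handling also has to deal with lists of tags and weak validators.

```javascript
// Decide between 304 Not Modified and 200 OK for a conditional GET.
// `currentEtag` is whatever version tag the server computed for the resource.
function conditionalGet(requestHeaders, currentEtag, body) {
  const headers = { ETag: currentEtag, "Cache-Control": "max-age=3600" };
  if (requestHeaders["if-none-match"] === currentEtag) {
    // The client's copy is still valid: send headers only, no body
    return { status: 304, headers, body: null };
  }
  return { status: 200, headers, body };
}

// Usage: first request gets the body, revalidation gets a 304
const first = conditionalGet({}, '"v1"', { id: 123, name: "Ada" });
const second = conditionalGet({ "if-none-match": first.headers.ETag }, '"v1"', { id: 123, name: "Ada" });
console.log(first.status, second.status);
```

Because the cache key is simply the URL, every intermediary between client and server can apply this logic without knowing anything about the API.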

GraphQL Caching

GraphQL POST requests are harder to cache by default. Solutions:

Normalized caching with Apollo Client or Relay
Persisted queries that become GET requests
Response caching at the CDN level

Client-Side Caching Strategies

Server-side caching for GraphQL is tricky because POST requests with dynamic queries don’t benefit from URL-based caching. Client-side caching becomes essential.

Apollo Client Cache

Apollo Client 3 uses a normalized in-memory cache with automatic cache updates:

import { ApolloClient, InMemoryCache, makeVar } from '@apollo/client';

// Reactive variables for local state
export const cartItemsVar = makeVar<string[]>([]);

// Configure normalized cache
const cache = new InMemoryCache({
  typePolicies: {
    // Customize field-level read/write
    User: {
      fields: {
        // Automatically merge paginated posts
        posts: {
          // Ignore pagination arguments so every page of a given user's
          // posts is merged into one cached list
          keyArgs: false,
          merge(existing = { items: [] }, incoming) {
            // Cursor-based pagination merge: append the new page
            return {
              ...incoming,
              items: [...existing.items, ...incoming.items],
            };
          },
        },
        // Real-time: update cache when subscription fires
        notifications: {
          merge(existing, incoming) {
            return incoming; // Replace on each notification
          },
        },
      },
    },
    Query: {
      fields: {
        searchUsers: {
          // Cache results separately per search string, ignoring other arguments
          keyArgs: ['query'],
        },
      },
    },
  },
});

// `link` is the Apollo Link chain (for example an HttpLink) defined elsewhere
const client = new ApolloClient({ cache, link });

Cache Normalization

Apollo Client normalizes responses by default: every object that has a `__typename` and an `id` (or `_id`) is stored once under a key such as `User:123`. Use `keyFields` in `typePolicies` when a type is identified by some other field:

const cache = new InMemoryCache({
  typePolicies: {
    // The defaults, written out explicitly: stored as User:<id>, Post:<id>
    User: { keyFields: ["id"] },
    Post: { keyFields: ["id"] },
    // Hypothetical type identified by a field other than `id`
    Product: { keyFields: ["sku"] },
  },
});

// Now cache handles updates automatically
// If Post with id:123 is updated via mutation,
// all queries showing that post update automatically
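What normalization does to a response can be sketched in a few lines. This is an illustration of the idea only, not Apollo Client's implementation; `normalize` is a hypothetical helper, and the `{ __ref }` pointer shape mirrors the one Apollo uses in its cache.

```javascript
// Flatten a nested response into a map of "Typename:id" -> entity,
// replacing nested entities with { __ref } pointers.
function normalize(value, store = {}) {
  if (Array.isArray(value)) {
    return value.map((item) => normalize(item, store));
  }
  if (value === null || typeof value !== "object") {
    return value;
  }
  const fields = {};
  for (const [key, child] of Object.entries(value)) {
    fields[key] = normalize(child, store);
  }
  if (value.__typename && value.id !== undefined) {
    const cacheId = `${value.__typename}:${value.id}`;
    // Merge so that fields fetched by different queries accumulate
    store[cacheId] = { ...store[cacheId], ...fields };
    return { __ref: cacheId };
  }
  return fields;
}

const store = {};
const root = normalize(
  {
    user: {
      __typename: "User",
      id: "1",
      name: "Ada",
      posts: [{ __typename: "Post", id: "7", title: "Hello" }],
    },
  },
  store,
);
console.log(Object.keys(store));
```

Because `Post:7` exists exactly once in the store, a mutation that returns an updated `Post:7` changes what every query referencing it sees.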

Advanced Caching Patterns

Relay Cursor Connections

For paginated data, Relay Connections spec provides standardized pagination:

# Standardized connection pagination
query {
  user(id: "123") {
    postsConnection(first: 10, after: "cursor123") {
      edges {
        node {
          id
          title
        }
        cursor
      }
      pageInfo {
        hasNextPage
        endCursor
      }
    }
  }
}

// Apollo Client ships a field policy for Relay-style connections
import { relayStylePagination } from "@apollo/client/utilities";

const cache = new InMemoryCache({
  typePolicies: {
    User: {
      fields: {
        // Merges `edges` and `pageInfo` from successive pages into one list
        postsConnection: relayStylePagination(),
      },
    },
  },
});

// Pagination query
const { data, fetchMore } = useQuery(POSTS_QUERY, {
  variables: { first: 10 },
});

// Load more: the field policy above merges the new page into the cache,
// so no updateQuery callback is needed
const loadMore = () => {
  return fetchMore({
    variables: {
      after: data.user.postsConnection.pageInfo.endCursor,
    },
  });
};
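On the server, a resolver for such a connection field has to turn `first` and `after` into a slice plus `pageInfo`. The sketch below does this over an in-memory array. The helper names and the index-based cursor encoding are assumptions made for illustration; production services usually encode a stable sort key rather than an index.

```javascript
// Opaque cursors: base64 of the item's index in the full list
// (an assumption for this sketch).
const encodeCursor = (index) => Buffer.from(`idx:${index}`).toString("base64");
const decodeCursor = (cursor) =>
  Number(Buffer.from(cursor, "base64").toString("utf8").slice(4));

// Build a Relay-style connection for `items` given `first` and optional `after`.
function paginate(items, { first, after }) {
  const start = after ? decodeCursor(after) + 1 : 0;
  const slice = items.slice(start, start + first);
  const edges = slice.map((node, i) => ({ node, cursor: encodeCursor(start + i) }));
  return {
    edges,
    pageInfo: {
      hasNextPage: start + first < items.length,
      endCursor: edges.length > 0 ? edges[edges.length - 1].cursor : null,
    },
  };
}

// Usage: walk through five items two at a time
const titles = ["a", "b", "c", "d", "e"];
const page1 = paginate(titles, { first: 2 });
const page2 = paginate(titles, { first: 2, after: page1.pageInfo.endCursor });
console.log(page2.edges.map((edge) => edge.node));
```

Keeping cursors opaque lets the server change its pagination strategy later without breaking clients, which only ever pass back the `endCursor` they were given.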

Cache Invalidation Patterns

GraphQL cache invalidation is trickier than REST because the server doesn’t know what’s cached:

// Pattern 1: Cache as source of truth - mutate cache directly
const [createPost] = useMutation(CREATE_POST_MUTATION, {
  update: (cache, { data: { createPost } }) => {
    // Read existing query
    const existing = cache.readQuery({ query: USER_POSTS_QUERY });

    // Write new data to cache
    cache.writeQuery({
      query: USER_POSTS_QUERY,
      data: {
        user: {
          ...existing.user,
          posts: [createPost, ...existing.user.posts],
        },
      },
    });
  },
});

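// Pattern 2: Evict and refetch
const [deletePost] = useMutation(DELETE_POST_MUTATION, {
  update: (cache, { data: { deletePost } }) => {
    // Evict the deleted post from the cache, then drop dangling references.
    // Assumes the mutation returns the deleted post's id.
    cache.evict({ id: cache.identify({ __typename: "Post", id: deletePost.id }) });
    cache.gc();
  },
  // Or refetch the affected queries from the server
  refetchQueries: [{ query: USER_POSTS_QUERY }],
  // Wait for the refetch to finish before the mutation promise resolves
  awaitRefetchQueries: true,
});

// Pattern 3: Local-only fields with the @client directive
import { gql, makeVar } from "@apollo/client";

// Reactive variable holding the local state
const isOnlineVar = makeVar(false);

const cache = new InMemoryCache({
  typePolicies: {
    User: {
      fields: {
        // Local-only field: read from the reactive variable, never from the server
        isOnline: { read: () => isOnlineVar() },
      },
    },
  },
});

// @client goes on the field in the query, not in the schema
const GET_USER = gql`
  query GetUser($id: ID!) {
    user(id: $id) {
      name
      isOnline @client
    }
  }
`;

// Update local state directly
isOnlineVar(true);
// Queries that read isOnline re-render automatically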

Cache Performance Tips

Pattern	Use When	Benefit
`keyArgs: ['field']`	Multiple similar queries	Cache reused across components
`merge(existing, incoming)`	Pagination	Append to lists correctly
`read` function	Transform data	Format dates, combine fields client-side
`@client` directive	Local-only state	No server roundtrip for UI state
`cache-and-network` fetch policy	Real-time updates	Show cached immediately, update from server

Comparison Table

Aspect	REST	GraphQL
Data fetching	Multiple requests	Single request
Data shape	Fixed per endpoint	Client specifies
Typing	Documentation	Schema enforced
Caching	HTTP caching	Custom caching
Learning curve	Lower	Higher
Tooling	Mature	Evolving
Error handling	HTTP status codes	200 + error body
Overfetching	Common	Avoided

Combining Both

You do not have to choose one exclusively. Some teams use REST for simple operations and GraphQL for complex data requirements.

# REST for simple operations
GET /health
GET /config

# GraphQL for complex data
POST /graphql

Schema Federation

As GraphQL usage scales across multiple teams, a single monolithic schema becomes unwieldy. Schema Federation and Schema Stitching are approaches for composing distributed GraphQL schemas.

Schema Federation (Apollo)

Federation decomposes a schema into independent subgraphs that can be developed and deployed separately:

graph TB
    Client[GraphQL Client] --> Gateway[Apollo Gateway]
    Gateway --> Users[Users Subgraph]
    Gateway --> Orders[Orders Subgraph]
    Gateway --> Products[Products Subgraph]
    Users --> UsersDB[(Users DB)]
    Orders --> OrdersDB[(Orders DB)]
    Products --> ProductsDB[(Products DB)]

// products subgraph: owns the Product entity
// src/subgraphs/products.ts
import gql from 'graphql-tag';
import { buildSubgraphSchema } from '@apollo/subgraph';

const productsTypeDefs = gql`
  type Product @key(fields: "id") {
    id: ID!
    name: String!
    price: Float!
  }

  extend type Query {
    product(id: ID!): Product
    productsOnSale: [Product!]!
  }
`;

const productsResolvers = {
  Product: {
    // Called when another subgraph references a Product by its key
    __resolveReference: (reference) => getProductById(reference.id),
  },
  Query: {
    product: (_, { id }) => getProductById(id),
    productsOnSale: () => getProductsOnSale(),
  },
};

export const productsSchema = buildSubgraphSchema({
  typeDefs: productsTypeDefs,
  resolvers: productsResolvers,
});

// users subgraph: owns User and Review, and adds `reviews` to Product
// src/subgraphs/users.ts
import gql from 'graphql-tag';
import { buildSubgraphSchema } from '@apollo/subgraph';

const usersTypeDefs = gql`
  # Product is owned by the products subgraph; this subgraph contributes reviews
  extend type Product @key(fields: "id") {
    id: ID! @external
    reviews: [Review!]!
  }

  type Review @key(fields: "id") {
    id: ID!
    rating: Int!
    comment: String!
    author: User!
  }

  type User @key(fields: "id") {
    id: ID!
    name: String!
    reviews: [Review!]!
  }
`;

const usersResolvers = {
  Product: {
    // `product` is the { __typename, id } reference supplied by the gateway;
    // getReviewsByProductId is a hypothetical data-access helper
    reviews: (product) => getReviewsByProductId(product.id),
  },
};

export const usersSchema = buildSubgraphSchema({
  typeDefs: usersTypeDefs,
  resolvers: usersResolvers,
});

Schema Stitching (Legacy)

Stitching merges schemas at the boundary level, often with custom logic:

import { makeExecutableSchema } from "@graphql-tools/schema";
import { stitchSchemas } from "@graphql-tools/stitch";
import { delegateToSchema } from "@graphql-tools/delegate";

const usersSchema = makeExecutableSchema({
  typeDefs: usersTypeDefs,
  resolvers: usersResolvers,
});

const ordersSchema = makeExecutableSchema({
  typeDefs: ordersTypeDefs,
  resolvers: ordersResolvers,
});

// Merge schemas
const stitchedSchema = stitchSchemas({
  subschemas: [usersSchema, ordersSchema],

  // Add the cross-schema field at the gateway level
  typeDefs: `
    extend type User {
      orders: [Order!]!
    }
  `,

  // Custom resolvers for cross-schema references
  resolvers: {
    User: {
      orders: {
        // Make sure the user's id is fetched even if the client did not ask for it
        selectionSet: "{ id }",
        resolve(user, args, context, info) {
          return delegateToSchema({
            schema: ordersSchema,
            operation: "query",
            fieldName: "ordersByUser",
            args: { userId: user.id },
            context,
            info,
          });
        },
      },
    },
  },
});

Federation vs Stitching

Aspect	Federation	Stitching
Architecture	Gateway + autonomous subgraphs	Centralized merged schema
Schema ownership	Teams own their types	Central team owns merged schema
Deployment	Independent subgraph deploys	Full redeploy on changes
Query planning	Gateway routes to subgraphs	Stitching layer plans queries
@key directive	Entities shared across subgraphs	Type merging for shared types
Production maturity	Battle-tested at scale (Netflix, Airbnb)	More complex, less common now

When to Use Federation

Federation shines when:

Multiple teams own different domain areas
Services are already microservices
Independent deployment is important
You want clear ownership boundaries

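# Example: User service owns User type, can reference Product
type User @key(fields: "id") {
  id: ID!
  name: String!
  # Products this user has purchased - reference to Product entity.
  # No @requires here: that directive is only for fields that depend on
  # @external fields owned by another subgraph.
  purchasedProducts: [Product!]!
}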

Hybrid Pattern: REST + GraphQL Federation

The gateway only talks to GraphQL subgraphs, so a REST API joins the graph by being wrapped in a thin subgraph whose resolvers call the REST endpoints:

import { ApolloGateway, IntrospectAndCompose } from "@apollo/gateway";

const gateway = new ApolloGateway({
  // Composes the supergraph by introspecting each subgraph at startup
  supergraphSdl: new IntrospectAndCompose({
    subgraphs: [
      { name: "users", url: "http://users-service/graphql" },
      { name: "products", url: "http://products-service/graphql" },
      // Thin GraphQL wrapper whose resolvers call the inventory REST API
      { name: "inventory", url: "http://inventory-service/graphql" },
    ],
  }),
});

When to Choose REST vs GraphQL

When to Choose REST

REST works well when:

Your API is simple with predictable data requirements
You need HTTP caching
You are building public APIs consumed by many clients
Your team is familiar with REST
You need simple documentation (just list endpoints)

Examples: simple CRUD applications, public APIs for third-party developers

When to Choose GraphQL

GraphQL works well when:

Clients need different data shapes
Mobile apps need minimal data transfer
The domain is complex, with many related entities
Frontend teams iterate rapidly
You want strong typing and schema validation

Examples: Mobile apps, complex dashboards, microservices with varying client needs

Trade-off Analysis

Decision Framework

Factor	REST	GraphQL
Data Fetching	Multiple endpoints; over/underfetching common	Single endpoint; exact data shapes
Caching	Native HTTP caching; CDN-friendly	Custom client-side caching; persisted queries for CDN
Type Safety	No enforced schema; OpenAPI optional	Strongly typed schema; self-documenting
Learning Curve	Simpler for beginners; familiar HTTP concepts	Steeper; requires understanding of queries and resolvers
Tooling	Mature ecosystem; standard HTTP debugging	GraphQL-specific IDEs; introspection for discovery
Performance	Predictable; caching reduces load	N+1 risk without DataLoader; complex query optimization
Real-time	Polling or webhooks; not native	Native subscriptions via WebSockets
Security	Endpoint-based rate limiting; familiar patterns	Query complexity limits; introspection control
Schema Evolution	Versioned APIs; breaking changes managed	Additive changes; @deprecated directive
Team Ownership	Endpoint-per-team; clear boundaries	Schema ownership; federation for scaling

Coexistence Pattern

Many organizations use both. A pragmatic approach:

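// api/index.js
import express from "express";
import { graphql } from "graphql";
// Local modules below are assumed to exist in your project
import { schema } from "./schema";
import { usersRestRouter, healthCheckRouter } from "./routes";
import { graphqlMiddleware } from "./graphql-middleware";

const app = express();

// REST for simple, stable resources
app.use("/api/users", usersRestRouter);
app.use("/api/health", healthCheckRouter);

// GraphQL for complex, client-driven data
app.use("/api/graphql", graphqlMiddleware);

// Bridge: a REST endpoint executes a GraphQL operation internally
app.get("/api/users/:id/summary", async (req, res) => {
  const result = await graphql({
    schema,
    source: `query UserSummary($id: ID!) {
      user(id: $id) { name email role }
    }`,
    variableValues: { id: req.params.id },
  });
  // GraphQL reports failures in the body, so translate them for REST clients
  if (result.errors) {
    return res.status(500).json({ errors: result.errors.map((e) => e.message) });
  }
  res.json(result.data);
});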

When to Migrate

Signal	Action
Mobile app requests 15+ REST calls per screen	Consider GraphQL
Frontend team blocked by backend endpoint pace	GraphQL gives frontend autonomy
Complex data graphs with varying client needs	GraphQL shines here
Heavy HTTP caching requirements	REST likely sufficient
Simple CRUD with predictable data shapes	REST is probably fine
Third-party public API	REST for stability and caching

Production Failure Scenarios

Failure	Impact	Mitigation
Query complexity explosion	Server overwhelmed; possible DoS	Implement query depth limiting; set complexity budgets
N+1 query problem	Database flooded with queries; slow responses	Use DataLoader for batching; optimize resolvers
Schema introspection abuse	Information leakage; enumeration attacks	Disable introspection in production; restrict access
Subscription memory leaks	Server memory grows; eventual crash	Set subscription limits; implement timeouts
Mutation race conditions	Data inconsistency between related operations	Implement optimistic locking; use transactions
Persisted query abuse	With automatic persisted queries, attackers register expensive queries	Use a build-time allowlist; validate persisted query hashes; rate limit
Error masking with 200 OK	Errors hidden; debugging difficult	Return proper error status codes; use extensions

Observability Checklist

Metrics

Query rate by operation type (query/mutation/subscription)
Query complexity distribution
Request duration by operation and field
Error rate by error type
DataLoader batch sizes and cache hit ratio
Schema introspection requests
Subscription active count
Query depth and breadth distribution

Logs

Query requests with variables and complexity
Mutation requests with authorization context
DataLoader batch operations and cache misses
Error responses with path and locations
Schema change events
Subscription lifecycle events
Security events (introspection attempts, rate limit hits)

Alerts

Query complexity exceeds threshold
Error rate exceeds normal baseline
DataLoader cache hit ratio drops below 80%
Subscription count exceeds limits
Unusual introspection activity
Query depth spikes (possible attack)

Security Checklist

Disable schema introspection in production
Implement query complexity limits and depth limits
Use persisted queries to prevent abuse
Validate and sanitize all variable inputs
Implement proper authorization at resolver level
Log and monitor unusual query patterns
Rate limit queries per client
Protect against batching attacks (arrays of operations in one request, or one field repeated under many aliases)
Use query whitelisting for sensitive operations
Validate that mutations affect only intended fields
Implement request timeout at GraphQL layer
Do not expose internal error details in responses

Common Pitfalls / Anti-Patterns

Overusing GraphQL for Simple Cases

GraphQL adds complexity. Simple REST endpoints may be better.

# Overkill: GraphQL for simple, predictable data
query {
  healthCheck {
    status
  }
}

# Better: Simple REST endpoint
GET /health

Ignoring N+1 Queries

GraphQL makes N+1 problems easy to create.

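# Problem: a naive posts resolver runs one query per user
query {
  users {
    name
    posts {
      title
    } # N queries for N users, plus 1 for the list
  }
}

# Better: the query stays the same; the fix is server-side.
# A DataLoader-backed posts resolver batches all user IDs into one query.
query {
  users {
    name
    posts {
      title
    } # Single batched query
  }
}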

Not Implementing Proper Error Handling

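By convention, GraphQL over HTTP returns 200 OK even when the response contains errors, so clients and monitoring cannot rely on the status code alone.

// Problem: Error masked as success, with no machine-readable code
{
  "data": { "user": null },
  "errors": [{ "message": "Not authorized" }]
}

// Better: Add an error code in extensions (and use a non-2xx status when the whole request fails)
{
  "data": { "user": null },
  "errors": [{
    "message": "Not authorized",
    "path": ["user"],
    "extensions": { "code": "UNAUTHORIZED" }
  }]
}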

Exposing Schema Internals

Introspection can reveal your entire schema.

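// Disable introspection in production
import { ApolloServer } from "apollo-server";

const isProduction = process.env.NODE_ENV === "production";

const server = new ApolloServer({
  schema, // your executable schema, built elsewhere
  introspection: !isProduction, // off in production
  playground: !isProduction, // Apollo Server 2 only; the option was removed in v3
});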

Interview Questions

Fundamentals

1. What is the N+1 problem in GraphQL, and how does DataLoader solve it?

The N+1 problem occurs when fetching a list of items triggers separate requests for each item's related data. For example, querying 100 users and their posts makes 1 query for users plus 100 queries for posts.

DataLoader solves this by:

Batching: Collecting all requested IDs during query execution, then loading them in a single database query
Caching: Memoizing results within a request to avoid duplicate loads
Queuing: Resolvers call loader.load(id); DataLoader coalesces every load made in the same tick of the event loop into one batch
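To make batching and caching concrete, here is a minimal sketch of the idea. TinyLoader is a hypothetical, simplified stand-in for the dataloader package (the real library schedules batches more carefully and has more options), and fetchPostsByUserIds is a made-up data source.

```javascript
// Minimal stand-in for the `dataloader` package: batches and memoizes loads.
// TinyLoader and fetchPostsByUserIds are hypothetical names for illustration.
class TinyLoader {
  constructor(batchFn) {
    this.batchFn = batchFn; // (keys) => Promise of values in the same order
    this.cache = new Map(); // key -> Promise (memoization within a request)
    this.queue = []; // pending { key, resolve, reject }
  }

  load(key) {
    if (this.cache.has(key)) return this.cache.get(key);
    const promise = new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // Schedule one dispatch for everything queued in this tick.
      if (this.queue.length === 1) queueMicrotask(() => this.dispatch());
    });
    this.cache.set(key, promise);
    return promise;
  }

  async dispatch() {
    const batch = this.queue;
    this.queue = [];
    try {
      const values = await this.batchFn(batch.map((item) => item.key));
      batch.forEach((item, i) => item.resolve(values[i]));
    } catch (err) {
      batch.forEach((item) => item.reject(err));
    }
  }
}

// Fake data source that records each "database query" it receives.
const calls = [];
async function fetchPostsByUserIds(userIds) {
  calls.push(userIds);
  return userIds.map((id) => [`post-of-${id}`]);
}

const postsLoader = new TinyLoader(fetchPostsByUserIds);
// Three resolvers ask for posts in the same tick, one key repeated.
const pending = Promise.all([
  postsLoader.load(1),
  postsLoader.load(2),
  postsLoader.load(1),
]);
```

Once pending settles, calls holds a single batch for keys 1 and 2. The repeated key was served from the cache, so three loads became one query.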

2. Explain the difference between mutations and queries in GraphQL. Why are mutations typically named with verb patterns (createUser, updatePost) while queries use noun patterns (user, posts)?

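Queries are side-effect-free reads: they should not modify data. Mutations cause side effects and modify server state. Execution differs too: top-level query fields may resolve in parallel, while top-level mutation fields run serially, in the order written.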

Naming conventions:

Queries as nouns (user, posts) — you "ask for" data
Mutations as verbs (createUser, updatePost) — you "command" an action

This convention makes it immediately clear in tooling and documentation which operations modify state.

3. Why does GraphQL return HTTP 200 OK even for errors? What are the trade-offs?

The reasoning is that HTTP and GraphQL live on separate layers. A query might partially succeed—some fields resolve, others don't. Returning 200 with both data and errors preserves that nuance.

It trips up traditional monitoring, though. HTTP-aware tools expect a non-2xx status for failures, not a 200 with an error body. And authorization errors masquerading as successes are genuinely unsettling.

Best practice: stick error codes in extensions so clients can actually detect failures.
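Since the status code cannot be trusted, clients and monitoring need to inspect the body. A rough sketch of that check follows; classifyResponse is a hypothetical helper, not part of any library, and the error codes are whatever your server chooses to emit.

```javascript
// Classify a GraphQL response body regardless of the HTTP status code.
// classifyResponse is a hypothetical helper for illustration.
function classifyResponse(body) {
  const errors = body.errors ?? [];
  if (errors.length === 0) return { status: "ok", codes: [] };
  const codes = errors.map((e) => e.extensions?.code ?? "UNKNOWN");
  // Data alongside errors means some fields resolved: a partial success.
  const hasData =
    body.data != null && Object.values(body.data).some((v) => v !== null);
  return { status: hasData ? "partial" : "failed", codes };
}

const partial = classifyResponse({
  data: { user: { name: "Ada", email: null } },
  errors: [
    { message: "Not authorized", path: ["user", "email"], extensions: { code: "FORBIDDEN" } },
  ],
});

const failed = classifyResponse({
  data: { user: null },
  errors: [{ message: "Not authorized", extensions: { code: "UNAUTHORIZED" } }],
});
```

Feeding the status and codes into metrics restores the failure signal that a plain HTTP status dashboard would miss.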

Design & Architecture

4. How would you implement real-time updates in GraphQL? Compare subscriptions vs polling vs webhooks.

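GraphQL subscriptions are usually served over WebSockets, although the spec does not mandate a transport and server-sent events also work. The client keeps a long-lived connection open and the server pushes data when relevant events occur.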

Comparison:

Subscriptions: Instant push, long-lived connections, best for UI updates
Polling: Simple, HTTP-based, predictable load—good for infrequent updates
Webhooks: HTTP callbacks to a URL the consumer registers; standard for server-to-server events, not for browsers or mobile clients

Many systems combine these for different use cases.

5. What is schema stitching and how does it differ from schema federation?

Schema stitching merges multiple schemas into one at a gateway layer. The gateway acts as a unified API surface.

Federation (Apollo) decomposes schema ownership—each subgraph owns its types and the gateway routes queries to the right subgraph. Federation is more scalable and team-friendly.

Stitching is older, more complex to maintain, and less commonly used in new projects.

6. Explain how persisted queries improve GraphQL security and performance.

Persisted queries replace full query strings with SHA-256 hashes sent from client to server.

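Security: When the server accepts only pre-registered hashes (an allowlist), arbitrary queries never execute, which blocks ad hoc expensive queries and introspection probing. Automatic persisted queries, where clients register queries at runtime, save bandwidth but do not provide this protection.

Performance: The network payload shrinks from the full query text to a 64-character hex hash, and the server can reuse the parsed, validated document for a known hash instead of parsing the query on every request.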
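A minimal sketch of the allowlist variant, assuming operations are registered at build time. The registry shape and the function names are illustrative, not a specific library's API.

```javascript
const { createHash } = require("node:crypto");

const sha256 = (text) => createHash("sha256").update(text).digest("hex");

// Build step: register every operation the clients ship with.
// The registry shape and function names are assumptions for illustration.
const registry = new Map();
function registerOperation(query) {
  const hash = sha256(query);
  registry.set(hash, query);
  return hash;
}

// Runtime: resolve a hash to its query, rejecting anything unregistered.
function lookupOperation(hash) {
  const query = registry.get(hash);
  if (!query) throw new Error("PERSISTED_QUERY_NOT_FOUND");
  return query;
}

const userQuery = "query GetUser($id: ID!) { user(id: $id) { name } }";
const userQueryHash = registerOperation(userQuery);
```

The client then sends userQueryHash plus variables instead of the query text, and lookupOperation refuses any hash that was not registered ahead of time.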

7. How does normalized caching in Apollo Client work, and when would you use it?

Apollo's normalized cache stores each entity once, keyed by its __typename and id (for example User:123), and queries hold references to those entries. Updating User:123 therefore updates every cached query that references that user, which prevents cache inconsistency.

Use normalized cache when:

Multiple components show the same entity
Mutations update entities that appear in cached queries
You want automatic cache updates without manual refetchQueries
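The idea fits in a few lines. This is a toy model of normalization, not Apollo Client's actual implementation; writeEntity and readRef are hypothetical names.

```javascript
// Toy normalized cache: entities live once, keyed by `${__typename}:${id}`.
// A simplified model for illustration, not Apollo Client's implementation.
const entities = new Map();

function cacheId(obj) {
  return `${obj.__typename}:${obj.id}`;
}

// Store an entity, merging new fields into any existing record.
function writeEntity(obj) {
  const key = cacheId(obj);
  entities.set(key, { ...entities.get(key), ...obj });
  return { __ref: key }; // queries hold references, not copies
}

function readRef(ref) {
  return entities.get(ref.__ref);
}

// Two different query results contain the same user.
const profileQuery = {
  user: writeEntity({ __typename: "User", id: "123", name: "Ada" }),
};
const postQuery = {
  author: writeEntity({ __typename: "User", id: "123", name: "Ada" }),
};

// A mutation response updates the entity once.
writeEntity({ __typename: "User", id: "123", name: "Ada Lovelace" });
```

Both profileQuery.user and postQuery.author now read back the new name, because they point at the same User:123 record.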

8. Describe the trade-offs between GraphQL and REST for a public API consumed by many third-party developers.

REST wins for public APIs because HTTP caching is built in, debugging is straightforward, and providers can rate-limit per endpoint. GraphQL gives clients flexibility to fetch exactly what they need with a self-documenting schema, but that flexibility comes at a cost: no built-in HTTP caching, a steeper learning curve, and harder debugging.

For diverse third-party consumers, REST is usually the safer bet. For internal tools or complex client-driven UIs, GraphQL pays off.

Advanced & Production

9. How would you secure a GraphQL API against query complexity attacks?

Stack several approaches:

Query depth limits stop deeply nested attacks
Complexity analysis assigns costs to fields—reject anything too expensive
Persisted queries mean only pre-registered operations run
Rate limiting per client prevents abuse
Timeouts kill queries that run too long
Turn off introspection in production so attackers can't enumerate your schema
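Depth limits and complexity budgets can be sketched on a simplified selection tree. The tree shape, field costs, and limits below are made-up assumptions; a real server would walk the parsed query AST, typically through a validation rule.

```javascript
// Selection tree in a simplified, hypothetical shape:
// { fieldName: subselection or null }. Real servers walk the query AST.
function depthOf(selection) {
  if (!selection) return 0;
  const childDepths = Object.values(selection).map(depthOf);
  return 1 + Math.max(0, ...childDepths);
}

// Sum per-field costs; unlisted fields cost 1. The costs are made-up numbers.
function costOf(selection, fieldCosts) {
  if (!selection) return 0;
  return Object.entries(selection).reduce(
    (total, [field, sub]) =>
      total + (fieldCosts[field] ?? 1) + costOf(sub, fieldCosts),
    0
  );
}

// Reject a query before execution if it exceeds either budget.
function checkQuery(selection, { maxDepth, maxCost, fieldCosts }) {
  const depth = depthOf(selection);
  const cost = costOf(selection, fieldCosts);
  if (depth > maxDepth) return { ok: false, reason: `depth ${depth} > ${maxDepth}` };
  if (cost > maxCost) return { ok: false, reason: `cost ${cost} > ${maxCost}` };
  return { ok: true, depth, cost };
}

const limits = { maxDepth: 3, maxCost: 20, fieldCosts: { posts: 5, followers: 10 } };
const shallow = { user: { name: null, posts: { title: null } } };
const nested = { user: { followers: { followers: { followers: { name: null } } } } };
```

With these limits, checkQuery(shallow, limits) passes while checkQuery(nested, limits) is rejected for depth before any resolver runs.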

10. What strategies would you use for GraphQL caching at the CDN and browser level?

CDN caching breaks with GraphQL because POST bodies aren't URL-keyed. Your options:

Persisted queries as GET requests—hash goes in the URL, CDN can cache
Response caching keyed by operation name plus variables hash
Some CDNs (Cloudflare, for example) can fingerprint GraphQL requests and cache accordingly

On the browser, Apollo or Relay handle client-side caching. Service workers add offline support. Normalized cache keeps shared entity data consistent across components.
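As a sketch of the first option, a client can turn a persisted query into a GET URL that a CDN can key on. The extensions payload follows the convention used by Apollo's automatic persisted queries; the endpoint and function name are assumptions.

```javascript
const { createHash } = require("node:crypto");

// Build a cacheable GET URL for a persisted query.
// The endpoint and function name are hypothetical.
function persistedQueryUrl(endpoint, query, operationName, variables) {
  const hash = createHash("sha256").update(query).digest("hex");
  const url = new URL(endpoint);
  url.searchParams.set("operationName", operationName);
  url.searchParams.set("variables", JSON.stringify(variables));
  url.searchParams.set(
    "extensions",
    JSON.stringify({ persistedQuery: { version: 1, sha256Hash: hash } })
  );
  return url.toString();
}

const getUserQuery = "query GetUser($id: ID!) { user(id: $id) { name } }";
const cacheableUrl = persistedQueryUrl(
  "https://api.example.com/graphql",
  getUserQuery,
  "GetUser",
  { id: "123" }
);
```

The same operation and variables always produce the same URL, which is what lets the CDN treat it like any other cacheable GET.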

11. How does DataLoader handle cache misses and partial failures in batches?

DataLoader queues up requests during execution, then fires one batched query when the loader runs. Each key either hits cache or misses and gets batch-loaded. Results map back to the original key order.

Partial failures are interesting—the batch function can return Error objects alongside successful results. DataLoader propagates the Error to whichever field requested the failed key, while everything else succeeds. Nice for granular error handling.
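The key-order mapping and per-key errors are easy to get wrong, so here is a small sketch of the part a batch function has to do. alignToKeys is a hypothetical helper, and the rows stand in for whatever one batched database query returns.

```javascript
// Map rows from one batched query back to the requested key order,
// filling gaps with Error objects. Names here are hypothetical.
function alignToKeys(keys, rows) {
  const byId = new Map(rows.map((row) => [row.id, row]));
  return keys.map((key) => byId.get(key) ?? new Error(`User ${key} not found`));
}

// A DataLoader batch function would return this array, wrapped in a Promise.
const aligned = alignToKeys(
  [3, 1, 2],
  [
    { id: 1, name: "Ada" },
    { id: 3, name: "Grace" },
  ]
);
```

The result has one entry per key in request order: Grace, Ada, then an Error for the missing key 2, which only fails the field that asked for it.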

12. Explain when you would choose cursor-based pagination vs offset-based pagination in GraphQL.

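Offset pagination (skip/limit) is dead simple but doesn't scale. To serve page N, the database typically reads and discards every row before the offset, roughly N times the page size. Insert or delete a row mid-browse and every later page shifts, so users see duplicates or miss items.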

Cursors use stable markers. Page 2 after page 1 stays consistent even with concurrent writes. This is why infinite scroll and social feeds use cursors.

Use offset for small admin UIs where jumping to page 5 directly is valuable. Use cursors for anything user-facing with large datasets.
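A minimal sketch of cursor pagination over an in-memory list, loosely following the Relay connection shape (edges, pageInfo). The cursor encoding and helper names are assumptions; in a database the filter would be a WHERE id > :afterId clause.

```javascript
// Relay-style cursor pagination over an in-memory list sorted by id.
// The opaque cursor just encodes the last seen id; names are illustrative.
const encodeCursor = (id) => Buffer.from(`id:${id}`).toString("base64");
const decodeCursor = (cursor) =>
  Number(Buffer.from(cursor, "base64").toString().slice(3));

function paginate(items, { first, after }) {
  const afterId = after ? decodeCursor(after) : -Infinity;
  // Equivalent to: WHERE id > :afterId ORDER BY id LIMIT :first + 1
  const slice = items.filter((item) => item.id > afterId).slice(0, first + 1);
  const page = slice.slice(0, first);
  return {
    edges: page.map((node) => ({ node, cursor: encodeCursor(node.id) })),
    pageInfo: {
      hasNextPage: slice.length > first,
      endCursor: page.length ? encodeCursor(page[page.length - 1].id) : null,
    },
  };
}

const posts = [1, 2, 3, 4, 5].map((id) => ({ id, title: `Post ${id}` }));
const page1 = paginate(posts, { first: 2 });
// A row inserted ahead of the cursor does not shift page 2.
posts.unshift({ id: 0, title: "Post 0" });
const page2 = paginate(posts, { first: 2, after: page1.pageInfo.endCursor });
```

Page 2 still starts at post 3 even though a row was inserted in between. With offset pagination the same insert would have repeated post 2.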

13. What are the key considerations when migrating from REST to GraphQL in a large organization?

A few things trip people up:

Teams need time to internalize GraphQL thinking—it's not just another REST
Your existing tooling (CI/CD, monitoring, API gateways) probably needs updates
Don't rewrite everything at once—run GraphQL alongside REST and migrate domain by domain
Schema ownership gets thorny fast: who can modify what? Set clear boundaries early
N+1 problems will surface. DataLoader isn't optional.
Security surface is different—introspection attacks, query complexity, different rate-limiting vectors

14. How would you handle file uploads in a GraphQL API?

GraphQL spec says nothing about files. You've got options:

Multipart requests via graphql-upload library
Base64 encoding (terrible—bloats the payload)
Pre-signed URLs (S3, GCS, whatever)—upload goes direct, you just pass the URL to GraphQL
REST for uploads, GraphQL for the mutation that references the result

Pre-signed URLs scale best. The heavy lifting happens outside your server.

15. Describe how you would implement authorization at the field level in GraphQL.

Do it in resolvers:

const resolvers = {
  User: {
    email: (user, args, context) => {
      const viewer = context.user; // undefined for unauthenticated requests
      if (!viewer || (viewer.id !== user.id && !viewer.isAdmin)) {
        return null;
      }
      return user.email;
    },
  },
};

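Always check at the resolver level; never trust the client to filter. Returning null for unauthorized fields instead of throwing keeps the rest of the response intact, but only if the field is nullable in the schema; a null in a non-null field raises an error that propagates to the nearest nullable parent.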

16. What is the relationship between GraphQL subscriptions and WebSockets? How do you handle subscription scaling in a distributed environment?

Subscriptions usually run over WebSockets, which provide bidirectional, long-lived connections. The client opens a WebSocket, sends a subscription query, and the server pushes updates when events occur.

For distributed scaling (multiple GraphQL servers):

Use Redis Pub/Sub or RabbitMQ to broadcast events across server instances
Each server subscribes to the message broker and forwards relevant events to its connected clients
Connection state (subscribed channels per client) must be tracked and cleaned up on disconnect
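The fan-out pattern can be sketched without a real broker. Below, a shared EventEmitter stands in for Redis Pub/Sub or RabbitMQ, and createInstance is a hypothetical model of one GraphQL server; it tracks one subscription per client for simplicity.

```javascript
const { EventEmitter } = require("node:events");

// Stand-in for Redis Pub/Sub or RabbitMQ: one shared emitter plays the broker.
const broker = new EventEmitter();

// One GraphQL server instance: forwards broker events to its own clients.
// createInstance and the client shape are assumptions for this sketch.
function createInstance() {
  const subscriptions = new Map(); // clientId -> { topic, handler }
  return {
    subscribe(clientId, topic, send) {
      const handler = (payload) => send(payload);
      subscriptions.set(clientId, { topic, handler });
      broker.on(topic, handler);
    },
    disconnect(clientId) {
      const sub = subscriptions.get(clientId);
      if (!sub) return;
      broker.off(sub.topic, sub.handler); // clean up to avoid leaks
      subscriptions.delete(clientId);
    },
    activeCount: () => subscriptions.size,
  };
}

const serverA = createInstance();
const serverB = createInstance();
const receivedA = [];
const receivedB = [];
serverA.subscribe("client-1", "postAdded", (p) => receivedA.push(p));
serverB.subscribe("client-2", "postAdded", (p) => receivedB.push(p));

broker.emit("postAdded", { id: 1 }); // reaches clients on both instances
serverB.disconnect("client-2");
broker.emit("postAdded", { id: 2 }); // only client-1 is still subscribed
```

The disconnect handler is the part that prevents the subscription memory leaks mentioned earlier: every listener registered on the broker is removed when its client goes away.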

17. How does GraphQL handle authentication and authorization at the API level?

GraphQL sits on top of HTTP, so authentication is typically HTTP-based:

JWT tokens in Authorization header—validated in middleware before GraphQL execution
Cookies for web apps; tokens for mobile/native clients
Authorization happens in resolvers—check context.user permissions per field

Field-level authorization: return null for unauthorized fields rather than throwing, so partial data still returns.

18. What are the advantages and disadvantages of schema-first vs code-first GraphQL development?

Schema-first: you write the SDL first, then implement resolvers against it (often with generated types). Pros: clear contract, easy review, self-documenting. Cons: resolver code may drift from the schema.

Code-first: you define types in code (using decorators, classes, or builder patterns) and schema is generated from them. Pros: type safety in your language, less duplication. Cons: schema is derived, not explicit.

Most teams end up with a hybrid—they write SDL for complex types but generate parts programmatically.

19. Explain the purpose and behavior of the `@deprecated` directive in GraphQL. How does it support schema evolution?

The @deprecated directive marks fields or enum values as obsolete:

type User {
  name: String!
  fullName: String @deprecated(reason: "Use 'name' instead")
  role: UserRole!
}

Introspection exposes the deprecation and its reason, so client tooling (IDEs, linters, code generators) can flag deprecated fields with warnings. This lets you:

Remove fields gradually—old clients see warnings, new clients use the replacement
Communicate breaking changes without immediately cutting off old clients
Keep schema backward compatible while guiding migration

20. How would you approach testing a GraphQL API? What tools and strategies do you use?

Test at multiple layers:

Unit tests: resolver logic with mock context
Integration tests: execute full queries against a test database
Schema tests: validate that schema transformations produce expected output

Tools:

graphql-testing-library for integration tests
@graphql-tools/mock for schema mocking
Apollo Server's executeOperation for running operations without an HTTP server
Load testing with k6 or artillery for performance
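At the unit level, a resolver is just a function, so it can be called directly with a hand-built context. A small sketch using Node's built-in assert module follows; the resolver mirrors the field-level authorization rule from question 15, and the sample data is made up.

```javascript
const { strictEqual } = require("node:assert");

// Resolver under test: only the user themselves or an admin sees the email.
const emailResolver = (user, _args, context) => {
  const viewer = context.user;
  if (!viewer || (viewer.id !== user.id && !viewer.isAdmin)) return null;
  return user.email;
};

// Unit tests call the resolver directly with a mock context.
const target = { id: "1", email: "ada@example.com" };

// The owner sees their own email.
strictEqual(emailResolver(target, {}, { user: { id: "1" } }), "ada@example.com");
// Another user does not.
strictEqual(emailResolver(target, {}, { user: { id: "2" } }), null);
// An admin does.
strictEqual(
  emailResolver(target, {}, { user: { id: "2", isAdmin: true } }),
  "ada@example.com"
);
// An unauthenticated request gets null instead of crashing.
strictEqual(emailResolver(target, {}, {}), null);
```

Tests like these run in milliseconds and need no server or database, which leaves the slower integration tests free to cover query execution end to end.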

Conclusion

REST and GraphQL make different trade-offs. REST is resource-oriented with fixed endpoints and built-in HTTP caching. GraphQL is query-oriented with flexible data fetching and strong typing.

REST works well for simple, predictable APIs where caching matters. GraphQL works well for complex, client-driven data requirements where you want to avoid overfetching. Neither is universally better.

For REST API design, see the RESTful API Design post. For versioning both types of APIs, see the API Versioning Strategies post.