Elasticsearch: Full-Text Search at Scale
Learn how Elasticsearch powers search at scale with inverted indexes, sharding, replicas, and its powerful Query DSL for modern applications.
Elasticsearch stores data in inverted indexes — the key data structure that makes full-text search fast across billions of documents. This post covers how it works: inverted indexes, shards, replicas, and the Query DSL for writing searches that actually perform.
Core Concepts
The Inverted Index
Elasticsearch stores data in Lucene indices. The core data structure is the inverted index — a mapping from terms to the documents containing those terms. When you search for a word, the engine looks it up and retrieves matching documents immediately. No full table scan. This is what makes Elasticsearch fast.
{
"inverted_index": {
"elasticsearch": [{ "doc_id": 1, "positions": [0, 3] }],
"tutorial": [{ "doc_id": 1, "positions": [1] }],
"full-text": [{ "doc_id": 2, "positions": [0] }]
}
}
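When you search for “elasticsearch tutorial,” Elasticsearch looks up both terms in the inverted index and combines their postings lists to find matching documents. The cost depends on the number of terms and matching documents, not on the size of the whole corpus, so no full scan is required.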
The index also stores metadata about each term: document frequency, term frequency, positions for phrase queries, and norms for field-length normalization. This metadata is what makes relevance scoring, phrase matching, and fuzzy search possible.
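To make the term-to-postings mapping concrete, here is a minimal Python sketch of an inverted index with positions. It assumes naive whitespace tokenization and lowercasing, which is far simpler than what Lucene does; the function names are hypothetical.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to {doc_id: [positions]} using naive whitespace tokenization."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    return index

def search_all_terms(index, query):
    """Return the sorted IDs of documents that contain every query term."""
    postings = [set(index.get(term, {})) for term in query.lower().split()]
    return sorted(set.intersection(*postings)) if postings else []

docs = {
    1: "elasticsearch tutorial for elasticsearch beginners",
    2: "full-text search basics",
}
index = build_inverted_index(docs)
print(index["elasticsearch"])                              # postings for one term
print(search_all_terms(index, "elasticsearch tutorial"))   # docs containing both terms
```

The lookup touches only the postings for the query terms, which is the property the section describes.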
Analyzer Pipeline
Before terms enter the inverted index, they pass through an analyzer consisting of three stages:
- Character filters remove HTML tags, convert characters, or apply language-specific normalizations.
- Tokenizer splits text into individual terms (tokens).
- Token filters lowercase terms, remove stop words, apply synonyms, and perform stemming.
{
"settings": {
"analysis": {
"analyzer": {
"my_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": ["lowercase", "snowball", "asciifolding"]
}
}
}
}
}
Choosing the right analyzer matters. The standard analyzer works fine for most English text. For domain-specific vocabularies, you might need custom analyzers with synonym filters or language-specific stemmers.
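The three analyzer stages can be sketched in plain Python. This is an illustration of the pipeline shape only: the stop-word list is a tiny made-up sample and naive_stem is a crude stand-in for a real stemmer such as Snowball.

```python
import html
import re

STOP_WORDS = {"a", "an", "and", "the", "of", "to"}  # tiny illustrative list

def char_filter(text):
    """Stage 1: strip HTML tags and decode entities before tokenization."""
    return html.unescape(re.sub(r"<[^>]+>", " ", text))

def tokenize(text):
    """Stage 2: split text into alphanumeric tokens."""
    return re.findall(r"[A-Za-z0-9]+", text)

def naive_stem(token):
    """Crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def analyze(text):
    """Stage 3 applies the token filters: lowercase, stop words, stemming."""
    tokens = [t.lower() for t in tokenize(char_filter(text))]
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return [naive_stem(t) for t in tokens]

print(analyze("<p>Searching the Indexes &amp; Shards</p>"))
```

The same function must run on both indexed text and query text; if the two sides are analyzed differently, terms stop matching.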
Sharding: Distributing Data Across Nodes
A single Elasticsearch node can store millions of documents, but eventually you will need more storage, CPU, or memory than one machine provides. Elasticsearch solves this with shards: horizontal partitions of your index.
When you create an index, you specify the number of primary shards. Each primary shard is an independent Lucene index that stores a subset of your documents.
{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1
}
}
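With three primary shards, Elasticsearch distributes documents roughly evenly across them. A document with ID doc123 is routed to a specific shard using a hash of its routing value, which defaults to the document ID: shard = hash(_id) % num_primary_shards. Because the result depends on the shard count, the number of primary shards cannot be changed in place after the index is created.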
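A short Python sketch of the routing formula follows. It uses zlib.crc32 purely as a stand-in hash for illustration, which is an assumption and not the hash function Elasticsearch uses; route_to_shard is a hypothetical name.

```python
import zlib

def route_to_shard(routing_value, num_primary_shards):
    """Pick a shard with hash(routing) % num_primary_shards.

    zlib.crc32 is a stand-in hash for illustration only.
    """
    digest = zlib.crc32(routing_value.encode("utf-8"))
    return digest % num_primary_shards

doc_ids = [f"doc{i}" for i in range(1000)]
counts = [0, 0, 0]
for doc_id in doc_ids:
    counts[route_to_shard(doc_id, 3)] += 1
print(counts)  # per-shard document counts

# The same ID can map to a different shard once the shard count changes.
moved = sum(route_to_shard(d, 3) != route_to_shard(d, 4) for d in doc_ids)
print(f"{moved} of {len(doc_ids)} documents would move going from 3 to 4 shards")
```

The second part shows why modulo routing ties an index to its original primary shard count: changing the divisor relocates most documents.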
Shard Routing Explained
graph TD
Client[Client] -->|"search request"| LB[Load Balancer]
LB --> Node1[Node 1]
Node1 -->|"coordination"| Coordinator[Coordinator]
Coordinator -->|"scatter"| Shard1[Primary Shard 1]
Coordinator -->|"scatter"| Shard2[Primary Shard 2]
Coordinator -->|"scatter"| Shard3[Primary Shard 3]
Shard1 -->|"gather"| Coordinator
Shard2 -->|"gather"| Coordinator
Shard3 -->|"gather"| Coordinator
Coordinator -->|"reduce"| Client
The node receiving the search request becomes the coordinator. It broadcasts the query to all relevant shards, collects results, and merges them into a single response. The scatter-gather pattern here is what lets you search across all shards in parallel.
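The scatter-gather flow can be sketched in a few lines of Python. This toy version scores by raw term count and visits shards sequentially, both simplifying assumptions; search_shard and coordinate are hypothetical names.

```python
import heapq

def search_shard(shard_docs, term, size):
    """Per-shard phase: score locally and return only the top `size` hits."""
    hits = [(text.split().count(term), doc_id) for doc_id, text in shard_docs.items()]
    hits = [hit for hit in hits if hit[0] > 0]
    return heapq.nlargest(size, hits)

def coordinate(shards, term, size):
    """Coordinator phase: scatter to every shard, gather, reduce to a global top `size`."""
    gathered = []
    for shard_docs in shards:  # sequential here; a real cluster queries shards in parallel
        gathered.extend(search_shard(shard_docs, term, size))
    return [doc_id for _, doc_id in heapq.nlargest(size, gathered)]

shards = [
    {"a": "search search index", "b": "index"},
    {"c": "search"},
    {"d": "search search search", "e": "shard"},
]
print(coordinate(shards, "search", 2))
```

Each shard returns at most `size` candidates, so the coordinator merges a small list per shard instead of every matching document.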
Replica Shards
Every primary shard can have replicas for fault tolerance and read scalability. Replicas are never allocated on the same node as their primary. If a node fails, Elasticsearch promotes replicas to primaries automatically. Adding replicas increases search throughput for read-heavy workloads, provided the cluster has nodes with spare capacity to host them; each replica also costs extra disk space and indexing work.
The Query DSL: Expressing Search Logic
Elasticsearch’s Query DSL is a JSON-based language for expressing complex searches. It separates leaf queries (match, term) from compound queries (bool). Use filter context for non-scoring queries to leverage caching. Use query context when relevance scoring matters.
The Bool Query
The bool query is the workhorse of Elasticsearch search. It supports four clauses:
- must: Documents must match (AND logic)
- should: Documents should match (OR logic)
- filter: Same as must but without scoring (faster)
- must_not: Documents must not match (NOT logic)
{
"query": {
"bool": {
"must": [{ "match": { "title": "elasticsearch" } }],
"filter": [
{ "range": { "publish_date": { "gte": "2024-01-01" } } },
{ "term": { "status": "published" } }
],
"should": [
{ "match": { "tags": "search" } },
{ "match": { "tags": "database" } }
],
"minimum_should_match": 1
}
}
}
The filter context is particularly important. Queries in filter context bypass scoring entirely, and Elasticsearch caches filter bitsets for reuse. If you filter by a static field like status, subsequent queries become faster.
Relevance Scoring
Elasticsearch uses BM25 (Okapi BM25) as its default similarity algorithm. BM25 considers term frequency saturation and field length normalization. A term appearing 10 times in a 100-word field scores roughly the same as appearing 5 times in a 50-word field.
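The saturation and length-normalization behavior can be checked with the textbook BM25 formula. The sketch below uses the common defaults k1 = 1.2 and b = 0.75; the corpus statistics are hypothetical, and Lucene's actual implementation may differ in constant factors.

```python
import math

def bm25_term_score(tf, doc_len, avg_doc_len, doc_count, doc_freq, k1=1.2, b=0.75):
    """Score one term in one field with the textbook BM25 formula."""
    idf = math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
    length_norm = 1 - b + b * (doc_len / avg_doc_len)
    return idf * (tf * (k1 + 1)) / (tf + k1 * length_norm)

# Hypothetical corpus statistics, chosen only for illustration.
stats = dict(avg_doc_len=75, doc_count=1000, doc_freq=50)

long_field = bm25_term_score(tf=10, doc_len=100, **stats)
short_field = bm25_term_score(tf=5, doc_len=50, **stats)
print(round(long_field, 3), round(short_field, 3))  # close, not identical

# Saturation: 10 occurrences score well under 10x a single occurrence.
once = bm25_term_score(tf=1, doc_len=75, **stats)
ten_times = bm25_term_score(tf=10, doc_len=75, **stats)
print(round(ten_times / once, 2))
```

The ratio in the last line can never exceed k1 + 1, which is the ceiling that term frequency saturation imposes.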
You can debug relevance with the explain parameter:
GET /my-index/_search
{
"explain": true,
"query": {
"match": { "content": "elasticsearch tutorial" }
}
}
The response shows exactly how each document scored, including term frequencies, inverse document frequencies, and field norms.
Designing Indices for Performance
Index design profoundly impacts search performance. A few principles:
Size your shards wisely. Shards between 10GB and 50GB work well. Too many small shards cause overhead; too few large shards cause memory pressure.
Use aliases for flexibility. Index aliases let you reindex without downtime:
POST /_aliases
{
"actions": [
{ "remove": { "index": "my-index-v1", "alias": "my-index" } },
{ "add": { "index": "my-index-v2", "alias": "my-index" } }
]
}
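If you script reindexing, the alias swap body is easy to generate. This Python sketch only builds the request body shown above; the `-vN` naming scheme and both function names are assumptions for illustration, and sending the request is left to whatever HTTP client you use.

```python
import json

def alias_swap_actions(alias, old_index, new_index):
    """Build the body for POST /_aliases that moves `alias` in a single request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }

def next_version(index_name):
    """my-index-v1 -> my-index-v2, assuming a hypothetical `-vN` naming scheme."""
    base, _, version = index_name.rpartition("-v")
    return f"{base}-v{int(version) + 1}"

old = "my-index-v1"
body = alias_swap_actions("my-index", old, next_version(old))
print(json.dumps(body, indent=2))
```

Both actions travel in one request, so readers of the alias never see a moment where it points at no index.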
Denormalize for read performance. Elasticsearch does not support joins like SQL. If you frequently query documents with nested objects, consider flattening the structure or using denormalization.
Capacity Estimation
Sizing an Elasticsearch cluster means estimating heap and document counts.
Heap allocation per node:
heap_per_node ≈
(shards_per_node × segment_overhead)
+ (indexing_buffer × 10-20% of heap)
+ (query_cache × 10% of heap)
+ (fielddata × 10-20% of heap)
+ OS reserve (1GB)
Elasticsearch heap is shared across all shards on a node. Segment overhead alone is roughly 25MB per shard. With 30 shards on one node, that is 750MB before you even touch indexing buffers or caches.
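The rough heap formula above can be turned into a small budgeting helper. This sketch takes the 25MB-per-shard figure and the 10% shares directly from the text as assumptions, and leaves out the OS reserve because that memory sits outside the JVM heap; heap_budget_mb is a hypothetical name.

```python
def heap_budget_mb(heap_mb, shards_per_node, per_shard_overhead_mb=25,
                   indexing_buffer_pct=0.10, query_cache_pct=0.10, fielddata_pct=0.10):
    """Break a node's heap into the rough components listed above and report headroom."""
    budget = {
        "segment_overhead": shards_per_node * per_shard_overhead_mb,
        "indexing_buffer": heap_mb * indexing_buffer_pct,
        "query_cache": heap_mb * query_cache_pct,
        "fielddata": heap_mb * fielddata_pct,
    }
    budget["headroom"] = heap_mb - sum(budget.values())
    return budget

budget = heap_budget_mb(heap_mb=4096, shards_per_node=30)
for name, mb in budget.items():
    print(f"{name:>17}: {mb:8.1f} MB")
```

A negative headroom means the shard count or cache shares do not fit the heap you picked.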
Docs per shard guidelines:
| Shard Size Target | Docs per Shard (approx) |
|---|---|
| 10GB shard | 10-30M docs (variable by doc size) |
| 30GB shard | 30-100M docs |
| 50GB shard | 50-150M docs |
Doc count is harder to predict than shard size. A 1KB doc and a 50KB doc land at very different doc counts even at the same shard size. Watch both.
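A back-of-the-envelope helper makes the spread visible. It assumes the average on-disk size per document is known, which in practice depends on mappings and compression; docs_per_shard is a hypothetical name.

```python
def docs_per_shard(shard_size_gb, avg_doc_kb):
    """Approximate how many documents fit in a shard of the given size."""
    return int(shard_size_gb * 1024 * 1024 / avg_doc_kb)

for avg_doc_kb in (1, 50):
    print(f"{avg_doc_kb:>2}KB docs in a 10GB shard: {docs_per_shard(10, avg_doc_kb):,}")
```

The two cases differ by a factor of fifty at the same shard size, which is why both numbers need monitoring.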
Example: 3 primary shards with 1 replica each, at 30GB per shard (about 90GB of primary data):
total shards = 3 primaries + 3 replicas = 6 shards
at 30GB/shard = 180GB on disk across the cluster
heap needed (upper bound, counting all 6 shards):
segment overhead: 6 × 25MB = 150MB
indexing buffer: 512MB (default, scales with heap)
query cache: 276MB (10% of ~2.7GB heap)
fielddata: 276MB
OS reserve: 1GB
total ≈ 2.2GB; budget 2.5GB minimum, 4GB recommended
The sweet spot is a heap just under 32GB per node, and no more than about half of the machine's RAM. Above roughly 32GB the JVM can no longer use compressed object pointers, so object references double in size and a slightly larger heap can hold less data than a 31GB one. A common production setup is a 31GB heap on a 64GB machine; the remaining RAM goes to the OS page cache, which Lucene uses heavily for file system caching.
Under the Hood: Segments, Translog, and Merge Policies
Elasticsearch is built on Apache Lucene, and understanding how Lucene stores your data at the file level demystifies many production behaviors — sudden latency spikes, disk usage patterns, and heap growth all trace back to segment internals.
Lucene Segments
Every shard contains one or more Lucene segments. A segment is a self-contained, immutable inverted index. When you index a document, it goes into a small in-memory buffer. On refresh, that buffer is flushed into a new segment file on disk. Searches hit all segments and merge results.
Segment immutability means writes never modify existing segments; new documents create new segments. This design enables lock-free writes but creates a problem: many small segments degrade search performance.
Segment Lifecycle
        index        flush         merge
doc  ->  buffer  ->  segment  ->  (merged)  ->  larger segment
  \                    /
   \------------------/
      translog backup
The refresh_interval controls how often in-memory buffers become searchable (default: 1 second). Between flushes, data lives in both the buffer and the translog.
The Translog
The translog is a write-ahead log that records every indexing operation before it is acknowledged, so the operation survives even though it has not yet been committed to a segment on disk. If a crash occurs, Elasticsearch replays the translog to recover un-flushed documents. The translog provides durability; searchability is governed separately by refresh, which is why Elasticsearch is near-real-time rather than real-time.
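The interplay of buffer, refresh, flush, and translog replay can be modeled with a toy class. This is a deliberately simplified sketch, not how Lucene or Elasticsearch store data; ToyShard and its methods are hypothetical.

```python
class ToyShard:
    """Toy model of buffer -> refresh -> segment with a translog for crash recovery."""

    def __init__(self):
        self.buffer = []      # indexed but not yet searchable
        self.segments = []    # immutable, searchable batches
        self.committed = 0    # number of segments safely on disk
        self.translog = []    # every operation since the last flush

    def index(self, doc):
        self.translog.append(doc)  # logged so it can be replayed after a crash
        self.buffer.append(doc)

    def refresh(self):
        if self.buffer:
            self.segments.append(tuple(self.buffer))  # tuple stands in for an immutable segment
            self.buffer = []

    def flush(self):
        self.refresh()
        self.committed = len(self.segments)  # everything so far is durable
        self.translog = []                   # so the log can be truncated

    def search(self, term):
        return [doc for segment in self.segments for doc in segment if term in doc]

    def crash_and_recover(self):
        self.segments = self.segments[: self.committed]  # uncommitted segments are lost
        self.buffer = list(self.translog)                # replay operations since last flush
        self.refresh()

shard = ToyShard()
shard.index("elasticsearch tutorial")
print(shard.search("tutorial"))   # empty: not visible until refresh
shard.refresh()
print(shard.search("tutorial"))   # visible after refresh
shard.flush()
shard.index("lucene segments")
shard.crash_and_recover()
print(shard.search("lucene"))     # recovered from the translog
```

Refresh controls visibility and flush controls durability; the translog covers the gap between the two.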
Translog truncation happens automatically after a successful flush. You can force a translog flush:
POST /my-index/_flush
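For write-heavy workloads, translog settings trade durability against indexing speed:
{
"index": {
"translog": {
"durability": "request",
"sync_interval": "5s",
"flush_threshold_size": "512mb"
}
}
}
With durability set to request (the default), the translog is fsynced before each write is acknowledged, so acknowledged writes survive a crash. Setting it to async fsyncs only every sync_interval, which speeds up indexing but can lose up to that interval of acknowledged writes. flush_threshold_size controls how large the translog may grow before a flush is triggered; a larger value means fewer flushes but a longer replay during recovery. The right values depend on your crash-recovery SLA.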
Merge Policies
When segments accumulate, Lucene runs a merge policy to coalesce small segments into larger ones. The default policy is TieredMergePolicy; LogByteSizeMergePolicy is an older alternative.
TieredMergePolicy targets a balanced segment size ladder. Elasticsearch exposes its knobs as index.merge.policy settings:
{
"index": {
"merge": {
"policy": {
"max_merge_at_once": 10,
"segments_per_tier": 10,
"max_merged_segment": "5gb"
}
}
}
}
Key parameters:
- max_merge_at_once: how many segments can be merged in a single merge round (default: 10)
- segments_per_tier: how many segments are allowed per tier before a merge is triggered (default: 10)
- max_merged_segment: caps the size of merged segments to prevent giant segments (default: 5gb)
LogByteSizeMergePolicy (simpler, older):
{
"index": {
"merge_policy": {
"type": "log_byte_size",
"min_merge_size": "2MB",
"max_merge_size": "10GB"
}
}
}
For write-heavy workloads (like log ingestion), set max_merge_at_once lower so merges don’t steal CPU from indexing. For read-heavy workloads, higher values mean fewer but larger segments — faster searches.
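To see how these knobs interact, here is a heavily simplified merge round in Python. It is not Lucene's TieredMergePolicy algorithm, only a sketch that reuses the parameter names from the text; merge_once and max_merged_mb are hypothetical.

```python
def merge_once(segment_sizes_mb, segments_per_tier=10, max_merge_at_once=10, max_merged_mb=5120):
    """One simplified merge round: if there are too many segments, combine the smallest."""
    if len(segment_sizes_mb) <= segments_per_tier:
        return segment_sizes_mb, False
    ordered = sorted(segment_sizes_mb)
    candidates = []
    for size in ordered[:max_merge_at_once]:
        if sum(candidates) + size > max_merged_mb:
            break  # respect the cap on merged segment size
        candidates.append(size)
    if len(candidates) < 2:
        return segment_sizes_mb, False
    remaining = ordered[len(candidates):]
    return sorted(remaining + [sum(candidates)]), True

segments = [1] * 25  # 25 freshly refreshed 1MB segments
merged = True
while merged:
    segments, merged = merge_once(segments)
print(segments)
```

Lowering max_merge_at_once in this model makes each round cheaper but leaves more segments behind, which mirrors the write-heavy versus read-heavy tradeoff described above.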
Force Merge
Running a force merge reduces segment count dramatically, at the cost of heavy I/O:
POST /my-index/_forcemerge?max_num_segments=1
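This collapses each shard's segments into one. It can speed up searches on indices that no longer receive writes and reduces file handle usage. Force merge only indices you have finished writing to: merging an index that still receives writes can produce very large segments that later merges handle poorly. On large indices, run it during maintenance windows; it is I/O and CPU intensive, and the request does not return until the merge finishes. Searches are still served while it runs, though they may be slower.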
When to Use / When Not to Use
When to Use Elasticsearch
- Log and event analysis at scale, especially with the ELK stack
- Full-text search with complex relevance tuning and fuzzy matching
- Real-time analytics on time-series data with aggregations
- Distributed search where horizontal scalability is a requirement
- Autocomplete and type-ahead features via completion suggester
When Not to Use Elasticsearch
- Primary data store requiring ACID transactions (use a proper database)
- Simple key-value lookups where a document store suffices
- Heavy join operations across multiple entity types
- Systems requiring strong consistency (Elasticsearch is eventually consistent by default)
- Small datasets where a single PostgreSQL tsvector or SQLite FTS5 would suffice
Trade-off Analysis
Elasticsearch is not the only game in town. Depending on your use case, an alternative may serve you better.
Elasticsearch vs Apache Solr
| Factor | Elasticsearch | Apache Solr |
|---|---|---|
| Ecosystem | ELK Stack (Kibana, etc.) | Solr + SolrCloud, Banana (older) |
| Operational model | Born for distributed | Distributed via SolrCloud |
| Schema | Dynamic mappings | Strict schema (SolrCell for parsing) |
| Query DSL | JSON-based, developer-friendly | XML-based (older), JSON supported |
| Scaling | Horizontal by default | Horizontal with auto-replication |
| Security | Built in; core features free, advanced features paid | Authentication and authorization plugins, configurable |
| Learning curve | Lower | Higher (deeper Lucene knowledge needed) |
| Use case fit | Logs, metrics, search | Enterprise search, faceted catalogs |
Solr has a deeper history in enterprise search and handles complex faceted search (like e-commerce product catalogs) beautifully. Elasticsearch wins on operational simplicity, the Kibana ecosystem, and native horizontal scaling.
Elasticsearch vs OpenSearch
| Factor | Elasticsearch | OpenSearch |
|---|---|---|
| Licensing | SSPL / Elastic License 2.0 (since 2021), AGPLv3 option (since 2024) | Apache 2.0 |
| Governance | Elastic N.V. | Linux Foundation (community-driven) |
| Feature parity | Faster innovation cycles | Elasticsearch features, ported over |
| Plugins | Commercial XPack | OpenSearch plugins |
| Security | Built in; core features free, advanced features paid | OpenSearch Security (free) |
| Use case fit | Elastic Stack and its commercial features | If you need Apache-licensed search |
If licensing is a blocker, OpenSearch is the most mature alternative. It was forked from Elasticsearch 7.10.2 and remains largely API-compatible with that version, though the two projects have diverged since.
Elasticsearch vs Meilisearch
| Factor | Elasticsearch | Meilisearch |
|---|---|---|
| Complexity | Full cluster management | Single binary, runs anywhere |
| Relevance tuning | Deep BM25 + scoring | Basic relevance, typo tolerance |
| Indexing speed | Very fast at scale | Extremely fast, even at small scale |
| Memory footprint | Several GB (JVM heap) | < 100MB |
| Distributed | Yes (native sharding) | Limited (single replica only) |
| Use case fit | Billions of docs, complex queries | Small-medium datasets, typo-tolerant search, fast prototype |
| Configuration | Expert-level tuning | Works out of the box |
Meilisearch is the right choice when you want good out-of-the-box relevance without the operational overhead of a cluster. It excels at typo-tolerant search, autocomplete, and small to medium datasets.
Elasticsearch vs Typesense
| Factor | Elasticsearch | Typesense |
|---|---|---|
| License | SSPL / Elastic License 2.0 / AGPLv3 | GPL 3 (self-hosted); managed cloud offering available |
| Memory | JVM heap-heavy | ~100MB-1GB |
| Query model | Full Query DSL | Simple, typo tolerance built-in |
| Distributed | Native sharding | Raft-based clustering |
| Faceting | Rich aggregations | Limited (planned) |
| Use case fit | Complex enterprise search | Typo-tolerant search, autocomplete, ranking |
| Dev experience | Steep learning curve | Quick start, friendly API |
Typesense is optimized for developer experience and typo-tolerant search. If you need sub-100ms search with minimal DevOps and can live with simpler faceting, Typesense wins on simplicity.
Decision Matrix
| Your need | Best choice |
|---|---|
| Billions of logs + Kibana dashboards | Elasticsearch |
| Enterprise faceted catalog search | Apache Solr |
| Permissively licensed (Apache 2.0) search engine | OpenSearch |
| Fast prototype, small team, typo tolerance | Meilisearch |
| Autocomplete, typo-tolerant, low-ops | Typesense |
| Deep relevance tuning, complex queries | Elasticsearch |
| ACID transactions with search | PostgreSQL + tsvector |
Production Failure Scenarios
| Failure | Impact | Mitigation |
|---|---|---|
| Primary shard allocation fails | Index unavailable for writes | Set index.number_of_replicas >= 1 and use automatic retry |
| Coordinating node OOM | Search queries fail across cluster | Limit search.max_buckets, add circuit breakers, increase heap |
| Split-brain during network partition | Duplicate data or conflicting primaries | Run three dedicated master-eligible nodes; on pre-7.0 clusters also set discovery.zen.minimum_master_nodes to a quorum; prefer single-zone clusters |
| Bulk indexing queue overflow | Documents rejected, indexing lag | Size queue with thread_pool.write.queue_size, implement backpressure |
| Hot/Warm node imbalance | Some nodes run out of disk or CPU | Use Index Lifecycle Management (ILM), allocate shards manually |
| Incorrect analyzer causing data loss | Documents return no results | Test analyzers with _analyze API before applying to production indices |
Common Pitfalls / Anti-Patterns
Over-Sharding
Creating too many shards wastes memory and increases cluster metadata overhead. Each shard maintains its own segment files, segment metadata, and caches. A rule of thumb: keep shard size between 10GB and 50GB. If you have 1TB of data, around twenty-five 40GB shards is better than five hundred 2GB shards.
Fix: Plan shard count based on expected data volume. Use index templates with ILM to auto-delete old indices.
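Planning the shard count from expected volume is simple arithmetic. The sketch below assumes a 30GB target inside the 10GB-50GB band from the rule of thumb; plan_primary_shards is a hypothetical helper, and replicas are not counted.

```python
import math

def plan_primary_shards(total_data_gb, target_shard_gb=30, max_shard_gb=50):
    """Pick a primary shard count that keeps each shard near the target size."""
    if total_data_gb <= 0:
        raise ValueError("total_data_gb must be positive")
    shards = max(1, round(total_data_gb / target_shard_gb))
    if total_data_gb / shards > max_shard_gb:
        shards = math.ceil(total_data_gb / max_shard_gb)  # never exceed the upper bound
    return shards, total_data_gb / shards

for data_gb in (5, 200, 1024):
    shards, size = plan_primary_shards(data_gb)
    print(f"{data_gb:>5}GB -> {shards} primary shard(s) of about {size:.1f}GB")
```

Small indices end up with a single shard below the 10GB floor, which is fine: the floor is a warning against splitting small data, not a reason to avoid small indices.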
Using Query Context for Filters
Clauses in must context compute a relevance score for every matching document, which is wasted work when you only need filtering. Move static filters to filter context.
Before (slow):
{
"query": {
"bool": {
"must": [
{ "match": { "content": "search term" } },
{ "term": { "status": "published" } }
]
}
}
}
After (faster):
{
"query": {
"bool": {
"must": [{ "match": { "content": "search term" } }],
"filter": [{ "term": { "status": "published" } }]
}
}
}
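The rewrite from the first query to the second can be automated for simple cases. This Python sketch assumes a top-level bool query and treats term, terms, range, and exists clauses as non-scoring; move_filters is a hypothetical helper. Moving a clause removes its contribution to the score, so apply it only where that contribution is not wanted.

```python
import copy

NON_SCORING = {"term", "terms", "range", "exists"}  # clause types that rarely need a score

def move_filters(query):
    """Return a copy of a bool query with non-scoring `must` clauses moved to `filter`."""
    rewritten = copy.deepcopy(query)
    bool_body = rewritten["query"]["bool"]
    scoring = []
    filters = list(bool_body.get("filter", []))
    for clause in bool_body.get("must", []):
        clause_type = next(iter(clause))  # e.g. "match", "term", "range"
        if clause_type in NON_SCORING:
            filters.append(clause)
        else:
            scoring.append(clause)
    if scoring:
        bool_body["must"] = scoring
    else:
        bool_body.pop("must", None)
    if filters:
        bool_body["filter"] = filters
    return rewritten

before = {"query": {"bool": {"must": [
    {"match": {"content": "search term"}},
    {"term": {"status": "published"}},
]}}}
print(move_filters(before))
```

The input dictionary is left untouched, so the helper is safe to use on shared query templates.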
Ignoring Refresh Interval
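New documents are not searchable until the next refresh (default 1 second). That delay is fine for bulk indexing, but if you need near-real-time visibility, understand the tradeoff: frequent refreshes create many small segments and cost indexing throughput. Setting refresh_interval: -1 disables automatic refresh and speeds up bulk ingestion; restore the previous value once the load finishes so new documents become searchable again.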
Fix: Adjust refresh_interval based on write vs. read latency requirements.
Not Using Aliases for Zero-Downtime Reindexing
Reindexing directly into an existing index causes downtime and potential data inconsistency.
Fix: Use index aliases with swap operations:
POST /_aliases
{
"actions": [
{ "remove": { "index": "my-index-v1", "alias": "my-index" } },
{ "add": { "index": "my-index-v2", "alias": "my-index" } }
]
}
Quick Recap
Key Bullets
- Elasticsearch stores data in inverted indexes, enabling millisecond full-text queries
- Shard count is fixed at index creation — plan wisely (10GB-50GB per shard target)
- Use filter context for non-scoring queries to leverage caching
- Replicas provide fault tolerance and read scaling; they do not help write throughput
- The Query DSL separates leaf queries (match, term) from compound queries (bool)
- Index aliases enable zero-downtime reindexing and blue-green deployments
- BM25 is the default similarity algorithm; it handles term frequency saturation
Copy/Paste Checklist
# Check cluster health
GET /_cluster/health
# View shard allocation
GET /_cat/shards?v
# Monitor search latency
GET /_nodes/stats/indices/search?filter_path=**.*.query_total&timeout=30s
# Force merge to reduce segments
POST /my-index/_forcemerge?max_num_segments=1
# Update replica count
PUT /my-index/_settings
{
"number_of_replicas": 2
}
# Check indexing failures
GET /_nodes/stats/indices/indexing?filter_path=**.index_failed
# Set ILM policy
PUT /_ilm/policy/my-policy
{
"policy": {
"phases": {
"hot": {
"actions": {
"rollover": { "max_age": "7d" }
}
}
}
}
}
Observability Checklist
Metrics to Monitor
{
"cluster_metrics": {
"cluster_health": "green/yellow/red status",
"number_of_pending_tasks": "< 100 typically",
"task_duration_avg": "< 1s for search, < 5s for bulk"
},
"node_metrics": {
"heap_used_percent": "< 85% sustained",
"cpu_usage_percent": "< 70% sustained",
"disk_io_read/write": "baseline + anomaly detection",
"open_file_handles": "< 80% of ulimit"
},
"index_metrics": {
"indexing_rate": "documents/second",
"search_latency_p99": "< 500ms for interactive, < 2s for batch",
"refresh_latency": "< 1s per segment",
"segments_count": "< 50 per shard (force merge if higher)"
}
}
Key Logs to Capture
- Error logs: logs/elasticsearch.log for shard failures, OOM events, circuit breaker trips
- Deprecation logs: deprecated API usage, settings scheduled for removal
- Slow search logs: configure index.search.slowlog.threshold to capture queries exceeding latency thresholds
- Indexing slow logs: index.indexing.slowlog.threshold for bulk insert issues
# Enable slow logs via API (persistent across restarts)
PUT /my-index/_settings
{
"index.search.slowlog.threshold.query.warn": "10s",
"index.search.slowlog.threshold.fetch.warn": "1s",
"index.indexing.slowlog.threshold.index.warn": "10s"
}
Alerts to Configure
| Alert | Condition | Severity |
|---|---|---|
| Cluster health | status == red | Critical |
| Node heap usage | heap_used_percent > 90% for > 5 min | Warning |
| Search latency | search_latency_p99 > 2s | Warning |
| Unassigned shards | unassigned_shards > 0 for > 5 min | Critical |
| Bulk queue rejection | bulk_queue_rejections > 0 | Warning |
| Disk watermark | disk.watermark.low exceeded | Warning |
Security Checklist
- Enable XPack Security (or OpenSearch Security if using OpenSearch distro)
- Use role-based access control (RBAC) — define roles with least-privilege principles
- Encrypt node-to-node communication with TLS certificates
- Restrict JMX/REST endpoints to internal networks; do not expose publicly
- Validate input to prevent injection via query strings and aggregations
- Audit access logs — log all write operations to sensitive indices
- Rotate credentials regularly; use a secrets manager for API keys
- Configure field-level security if certain fields must be hidden from some roles
- Enable audit logging to track unauthorized access attempts
# Example: Create a read-only role for an index
POST /_security/role/read_only_blogs
{
"indices": [
{
"names": ["blogs-*"],
"privileges": ["read"]
}
]
}
Interview Questions
An inverted index maps terms (words) to documents rather than documents to terms. When you search for a word, the database looks up the word in the index and immediately retrieves matching documents — no full table scan. A row-based index (like a B-tree in PostgreSQL) maps rows to disk locations for fast primary-key lookups. Inverted indexes enable full-text search capabilities that row-based indexes cannot efficiently support without scanning every row.
Documents are routed to shards using a hash of the routing value, which defaults to the document ID: shard = hash(_id) % num_primary_shards. This is plain modulo hashing rather than consistent hashing: the formula is deterministic, so the same document ID always routes to the same shard, but only as long as the number of primary shards does not change. When a search request arrives, the coordinating node broadcasts to all relevant shards, collects results, merges them, and returns the response.
must — documents must match; contributes to relevance score (AND logic). should — documents should match; contributes to score if they do (OR logic). filter — documents must match but without scoring; takes advantage of filter cache (faster). must_not — documents must not match; excluded from results (NOT logic). Filters bypass scoring entirely, making them faster than queries in must context.
BM25 (Okapi BM25) is Elasticsearch's default similarity algorithm. It improves on TF-IDF by introducing term frequency saturation — a term appearing 10 times does not score 10x a term appearing once. It also normalizes by field length: a term in a 10-word field scores higher than the same term in a 500-word field. BM25's saturation function prevents common words from dominating scores and handles long documents more fairly than naive TF-IDF.
Primary shards receive every write first and then forward it to their replicas. Replica shards are copies of primary shards; they serve read requests for horizontal read scaling and provide fault tolerance: if a primary fails, a replica is promoted automatically. Replicas never contain more data than their primary and are never allocated on the same node as their primary. Replicas do not improve write throughput, because every replica must also index each document.
The analyzer pipeline has three stages: (1) Character filters — strip HTML tags, normalize Unicode, apply character mappings; (2) Tokenizer — splits text into individual tokens; (3) Token filters — lowercase tokens, remove stop words, apply synonyms, perform stemming. The output of all three stages is what gets stored in the inverted index. Choosing the right analyzer is critical — using the wrong one can make documents unfindable.
The translog is a write-ahead log. With the default durability setting, every indexing operation is written and fsynced to the translog before being acknowledged. On a crash, un-flushed documents are recovered by replaying the translog. This is how Elasticsearch provides durability for acknowledged writes even though it is not an ACID database. The translog is truncated after a successful flush commits segments to disk. You can tune translog.durability and translog.sync_interval based on how much acknowledged data you can afford to lose in a crash.
In filter context, a clause determines whether a document is included or excluded but does not affect the relevance score. Elasticsearch caches the resulting bitset for reuse — subsequent identical filters are near-instant. In query context, clauses affect the relevance score; every matching document is scored. Always use filter context for non-scoring criteria like date ranges, status fields, or numeric filters to leverage caching and skip scoring.
Lucene writes new documents into small immutable segments. As segments accumulate, search performance degrades. A merge policy (e.g., TieredMergePolicy) periodically coalesces small segments into larger ones. Key parameters include max_merge_at_once (segments merged per round), segments_per_tier (segments allowed per tier before a merge), and max_merged_segment (cap on merged segment size). Tuning these matters for write-heavy vs. read-heavy workloads: merging too aggressively hurts indexing throughput.
The sweet spot is a heap just under 32GB per node, and no more than about half of the machine's RAM. Above roughly 32GB the JVM can no longer use compressed object pointers, so object references double in size and a slightly larger heap can hold less data than a 31GB one. A much larger heap also takes memory away from the OS page cache, which Lucene relies on for file system caching. A typical production node: 64GB machine with 31GB heap and the remaining RAM left to the OS and its page cache. Always set -Xms equal to -Xmx to prevent heap resizing.
Index aliases enable zero-downtime reindexing. During a reindex from my-index-v1 to my-index-v2, you atomically swap the alias my-index from the old index to the new one. Applications continue querying my-index without knowing which version is active. Aliases also support blue-green deployments, traffic routing to specific shards via routing rules, and querying across multiple indices with a single name.
Split-brain occurs during network partitions when nodes cannot communicate but both sides elect a master and keep accepting writes. Elasticsearch prevents this by requiring a quorum of master-eligible nodes to elect a master: (master_eligible_nodes / 2) + 1. Before 7.0, you had to set discovery.zen.minimum_master_nodes to that value yourself so that a quorum could not exist on both sides of a partition. In modern Elasticsearch (7+), that setting is gone and the cluster manages the voting quorum itself, requiring a majority for master election.
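The majority requirement is easy to verify with a few lines of Python. This sketch only checks the arithmetic of majority voting among master-eligible nodes; quorum and can_elect_master are hypothetical names, not Elasticsearch APIs.

```python
def quorum(master_eligible_nodes):
    """Smallest number of votes that forms a strict majority."""
    return master_eligible_nodes // 2 + 1

def can_elect_master(partition_size, master_eligible_nodes):
    """A partition can elect a master only if it holds a majority of eligible nodes."""
    return partition_size >= quorum(master_eligible_nodes)

# With 3 master-eligible nodes split 2 / 1, only the larger side can elect a master.
print(quorum(3), can_elect_master(2, 3), can_elect_master(1, 3))

# No split of any cluster size lets both sides reach a majority at once.
for nodes in range(1, 10):
    for side_a in range(nodes + 1):
        side_b = nodes - side_a
        assert not (can_elect_master(side_a, nodes) and can_elect_master(side_b, nodes))
```

An even node count gains nothing here: four nodes need three votes, so a 2 / 2 split leaves the cluster with no master at all, which is why odd counts such as three are the usual choice.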
Text fields are analyzed — they break text into tokens (e.g., "Hello World" becomes ["hello", "world"]). You cannot sort or aggregate on analyzed text fields directly because the tokenized values lose the original order. Use keyword fields (or a multi-field with a keyword sub-field) for sorting, faceting, and aggregations. For sortable full-text, use a multi-field mapping with both text (for search) and keyword (for sorting). Example: "name": { "type": "text", "fields": { "keyword": { "type": "keyword" } } }.
term queries look up an exact value in the inverted index — no analysis is applied. Use it for keyword fields, IDs, and exact matches. match queries analyze the query string and then look up each resulting token — this is what you want for full-text search on text fields. query_string parses a query syntax string (like Lucene's) supporting AND, OR, NOT, phrase queries, wildcards, and field-specific searches in a single string. query_string is powerful but risky — it can throw parse errors on malformed input and has security implications if user-supplied.
Index templates define settings and mappings that are automatically applied when new indices are created matching a pattern. They are essential for time-based data (like logs) where you create a new index per day or month. A template can set the number of shards, replicas, refresh intervals, field mappings, and ILM policies. Use composable templates (7.8+) over legacy index templates for better control over priority and module merging. Templates are applied at index creation time — they never update existing indices.
ILM automates index management across its lifetime. Typical phases: Hot — active indexing and searching, high resources; Warm — read-only, reduced shards or force-merged; Cold — no new writes, searchable but with slower storage; Delete — remove old indices after retention period. ILM policies are attached to indices via index templates. Actions within phases include rollover (start new index when size/age threshold is hit), shrink (reduce primary shards), force merge, freeze, and delete.
Refresh makes in-memory indexed data searchable. It creates a new Lucene segment from the index buffer. Default interval is 1 second — this is why Elasticsearch is near-real-time, not truly real-time. Flush persists buffered data to disk as a new Lucene segment and clears the translog. It is about durability, not searchability. Force merge reduces the number of Lucene segments by merging many small ones into fewer large ones. It improves search performance on read-heavy indices but is I/O intensive and should be scheduled during maintenance windows.
A rollover index creates a new write index when the current one reaches a size, age, or doc-count threshold. Instead of appending to a single ever-growing index (which creates giant shards), you roll over to a fresh index. This is the core of ILM's hot phase — the rollover alias always points to the active write index while older indices transition through warm, cold, and delete phases. Example: PUT /my-index-000001 with alias my-index as write index; when it hits max_age: 7d, ILM creates my-index-000002 and switches the alias.
Nested documents are stored as separate Lucene documents with a shared doc ID, preserving the relationship between objects. Without nested mapping, objects are flattened — a query matching two conditions might match different child objects in the same parent. Nested queries treat the array as independent documents and correctly enforce must-match-on-same-child logic. Downside: nested queries are significantly slower because they require a join-like step. If you frequently query by fields inside nested objects and exact object-level matching matters, use nested. Otherwise, consider denormalization (flattening) for better performance.
Elasticsearch has multiple circuit breakers that abort requests that would consume too much heap memory, preventing OutOfMemoryError: the parent circuit breaker covers total usage across all breakers (defaults to 95% of heap when real memory tracking is enabled); the fielddata circuit breaker covers the estimated size of loading fielddata into memory (default 40%); the request circuit breaker covers per-request structures such as aggregation buckets (default 60%); the in-flight requests circuit breaker covers the total size of incoming requests being processed (default 100%); the accounting circuit breaker covers memory that is held after a request completes, such as Lucene segment memory (default 100%). When a breaker trips, Elasticsearch returns a 429 status with a circuit_breaking_exception. Tuning these requires care: setting them too high defeats the purpose, and too low causes spurious rejections.
Further Reading
- Elasticsearch: The Definitive Guide — official comprehensive guide
- Lucene Documentation — understanding segments, merge policies, and analyzers
- Elasticsearch Query DSL docs — full leaf and compound query reference
- Index Lifecycle Management — hot-warm-cold tier management
- Search Tuning Guide — official performance tuning
- Understanding BM25 — relevance scoring deep dive
- Monitoring with Metricbeat — pre-built Elasticsearch monitoring dashboards
The inverted index is what makes Elasticsearch fast. Queries that would scan a relational database for minutes return in milliseconds. Sharding and replicas give you horizontal scalability and fault tolerance.