Glossary

TL;DR
Index/Document/Field: Corresponds to Table/Row/Column in RDB
Shard/Replica: Basic units of data distribution and replication
Analyzer/Tokenizer: Breaks text into searchable tokens
Query/Filter Context: Search modes distinguished by whether a relevance Score is calculated
Terms are sorted alphabetically, and each links to related concept documents

Quick reference for Elasticsearch core terms. For detailed explanations, see the Concepts section.

A-E#

Aggregation#

Feature for grouping search results and calculating statistics. Similar to SQL’s GROUP BY. Three types: Bucket/Metric/Pipeline. → Aggregations | Query DSL
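
As a rough illustration of the GROUP BY analogy, the sketch below shows a Bucket aggregation with a nested Metric aggregation as a request body, followed by a pure-Python function that mimics the same grouping locally. The field names `category` and `price`, the aggregation names, and the helper `local_group_avg` are hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Request body sketch: a terms (Bucket) aggregation with a nested avg (Metric).
# "category" and "price" are hypothetical field names.
agg_request = {
    "size": 0,  # return only aggregation results, no search hits
    "aggs": {
        "by_category": {
            "terms": {"field": "category"},
            "aggs": {"avg_price": {"avg": {"field": "price"}}},
        }
    },
}


def local_group_avg(docs, bucket_field, metric_field):
    """Mimic terms + avg locally, like GROUP BY with AVG in SQL."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[bucket_field]].append(doc[metric_field])
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


docs = [
    {"category": "book", "price": 10.0},
    {"category": "book", "price": 20.0},
    {"category": "pen", "price": 2.0},
]
result = local_group_avg(docs, "category", "price")
```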

Alias#

An alternative name for an Index. Useful for zero-downtime index switching and multi-index search. Used with ILM. → Indexing Strategy
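
A minimal sketch of zero-downtime switching: the helper below builds a request body for the `_aliases` endpoint that removes the Alias from the old Index and adds it to the new one in a single request. The function name `swap_alias_body` and the index names are hypothetical.

```python
def swap_alias_body(alias, old_index, new_index):
    """Build an _aliases request body that moves an alias in one request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names: clients keep querying "products" throughout the switch.
body = swap_alias_body("products", "products_v1", "products_v2")
```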

Analyzer#

Component that breaks text into Terms. Processes in order: Character Filter → Tokenizer → Token Filter. Use Nori analyzer for Korean. → Data Modeling
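
The three-stage order can be sketched in plain Python. This is a toy imitation of the pipeline, not how Elasticsearch implements its analyzers; the tag-stripping rule, the regex tokenizer, and the stop-word list are assumptions made for illustration.

```python
import re


def char_filter(text):
    # Character Filter: clean the raw text (here: strip HTML-like tags).
    return re.sub(r"<[^>]+>", " ", text)


def tokenizer(text):
    # Tokenizer: split the cleaned text into tokens on non-word characters.
    return re.findall(r"\w+", text)


def token_filter(tokens):
    # Token Filter: lowercase each token and drop a few stop words.
    stop_words = {"the", "a", "an"}
    lowered = [token.lower() for token in tokens]
    return [token for token in lowered if token not in stop_words]


def analyze(text):
    # Stages run in the fixed order: Character Filter -> Tokenizer -> Token Filter.
    return token_filter(tokenizer(char_filter(text)))


terms = analyze("<b>The Quick</b> brown fox")
```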

BM25 (Best Matching 25)#

Elasticsearch’s default Score calculation algorithm. Based on TF and IDF, with Field-length normalization. Scores can be adjusted with Boosting. → Search Relevance
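
For intuition, one common formulation of the per-Term BM25 score is sketched below. The parameter defaults `k1=1.2` and `b=0.75` are the commonly cited ones; the exact constants and normalization used inside a given Elasticsearch version may differ, so treat this as an approximation. The function name is hypothetical.

```python
import math


def bm25_term_score(freq, doc_len, avg_doc_len, doc_count, docs_with_term,
                    k1=1.2, b=0.75):
    """Approximate BM25 contribution of one term to one document's score."""
    # IDF: rarer terms (fewer documents containing them) weigh more.
    idf = math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))
    # Length normalization: longer-than-average documents are penalized.
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    # TF saturates: each additional occurrence adds less than the previous one.
    tf = freq / (freq + norm)
    return idf * tf


rare = bm25_term_score(freq=1, doc_len=100, avg_doc_len=100,
                       doc_count=1000, docs_with_term=10)
common = bm25_term_score(freq=1, doc_len=100, avg_doc_len=100,
                         doc_count=1000, docs_with_term=500)
```

With otherwise identical inputs, `rare` comes out higher than `common`, which is the IDF effect described above.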

Boosting#

Technique of adding weight to the Score of specific fields or conditions. → Search Relevance

Bulk API#

API for indexing multiple Documents in a single request. Essential for indexing throughput. Pair it with a longer Refresh interval during large loads. → Indexing Strategy
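
The Bulk API takes newline-delimited JSON: an action line followed by the Document source, with a trailing newline at the end of the body. The sketch below only builds such a body as a string; the helper name `build_bulk_body`, the index name, and the sample documents are hypothetical, and sending the request is left out.

```python
import json


def build_bulk_body(index_name, docs):
    """Build an NDJSON bulk body from (doc_id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        # Action line: tells the API what to do with the next line.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the Document itself.
        lines.append(json.dumps(source))
    # The body must end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body("products", [("1", {"name": "pen"}), ("2", {"name": "book"})])
```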

Cluster#

A group of Elasticsearch servers consisting of one or more Nodes. State managed by Master Node. → Core Components | Cluster Management

Coordinating Node#

Node that receives search requests, distributes them to Data Nodes, and merges the results. All nodes perform this role by default. → Cluster Management

Data Node#

Node that stores actual data and performs search/Aggregation. Shards are assigned to it. → Cluster Management

Document#

JSON data unit stored in Elasticsearch. Similar to a Row in RDB. Stored within an Index. → Core Components

DSL (Domain Specific Language)#

JSON-based language for writing Elasticsearch Queries. Provides various queries like Bool, Match, Term. → Query DSL

F-M#

Field#

Individual data item within a Document. Similar to a Column in RDB. Type defined by Mapping. → Data Modeling

Filter Context#

Matches conditions as yes/no without calculating a Score. Frequently used filters can be cached, which makes them fast. Used with Query Context in Bool queries. → Query DSL
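
A small sketch of how the two contexts sit side by side in a Bool query: clauses under `must` run in Query Context and contribute to the Score, while clauses under `filter` run in Filter Context and only include or exclude Documents. The field names and values here are hypothetical.

```python
search_body = {
    "query": {
        "bool": {
            # Query Context: scored full-text match.
            "must": [{"match": {"title": "wireless keyboard"}}],
            # Filter Context: yes/no conditions, no Score contribution.
            "filter": [
                {"term": {"status": "in_stock"}},
                {"range": {"price": {"lte": 100}}},
            ],
        }
    }
}

scored_clauses = search_body["query"]["bool"]["must"]
unscored_clauses = search_body["query"]["bool"]["filter"]
```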

Flush#

Operation that permanently writes in-memory data to disk, after which the Translog can be cleared. Distinct from Refresh, which only makes data searchable. → Indexing Strategy

IDF (Inverse Document Frequency)#

Indicator of how rare a word is across all Documents. Component of BM25. → Search Relevance

ILM (Index Lifecycle Management)#

Automatic management of the Index lifecycle from creation to deletion. Phases: Hot → Warm → Cold → Frozen → Delete. → Indexing Strategy

Index#

Collection of Documents. Similar to a Table in RDB. Distributed storage via Shards. → Core Components

Inverted Index#

Data structure mapping Terms → Document locations. Core of fast search. → Core Components
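
A toy version of the structure, assuming whitespace tokenization and lowercase Terms (real Analyzers do more, and real postings also record positions and frequencies). The function names are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, term):
    # Lookup is a single dictionary access instead of a scan over all documents.
    return sorted(index.get(term.lower(), set()))


docs = {1: "quick brown fox", 2: "lazy brown dog", 3: "quick red fox"}
index = build_inverted_index(docs)
```

Here `search(index, "brown")` finds Documents 1 and 2 without reading Document 3 at all, which is why lookups stay fast as the number of Documents grows.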

kNN (k-Nearest Neighbors)#

Algorithm that finds the k Documents whose vectors are closest to a query vector. The basis of Vector Search. → Vector Search
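
The idea can be sketched as an exact, brute-force search using cosine similarity. This is for illustration only: production engines typically rely on approximate index structures rather than comparing against every vector. The function names and sample vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn(query, vectors, k):
    """Return the IDs of the k vectors most similar to the query."""
    ranked = sorted(
        vectors.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]


vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
nearest = knn([1.0, 0.0], vectors, k=2)
```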

Mapping#

Defines how Documents and Fields are stored/indexed. Similar to Schema in RDB. Dynamic/Explicit methods. → Data Modeling

Master Node#

Node that manages Cluster state and handles Index creation/deletion. Recommended to separate from Data Node. → Cluster Management

N-R#

Node#

Single Elasticsearch server that forms a Cluster. Roles include Master, Data, Coordinating. → Core Components | Cluster Management

Nori#

Official Elasticsearch Korean morphological Analyzer. Provides nori_tokenizer and the nori_part_of_speech token filter. Used in Korean autocomplete and initial consonant search setups. → Korean Search Optimization

Primary Shard#

Shard that holds the original data. The count is fixed when the Index is created; changing it requires Reindex or the Split/Shrink APIs. Source of Replica Shards. → Core Components
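
The reason the count is fixed can be seen in a simplified routing sketch: a Document is assigned to a Shard roughly by hashing its routing value (by default the Document ID) modulo the number of Primary Shards. The code below uses `zlib.crc32` purely as a stand-in hash, which is an assumption for illustration and not the hash Elasticsearch uses.

```python
import zlib


def route_to_shard(routing_value, num_primary_shards):
    # Stand-in hash for illustration only; the real hash function differs.
    return zlib.crc32(routing_value.encode("utf-8")) % num_primary_shards


doc_ids = [f"doc-{i}" for i in range(100)]
before = {doc_id: route_to_shard(doc_id, 5) for doc_id in doc_ids}
after = {doc_id: route_to_shard(doc_id, 6) for doc_id in doc_ids}

# Documents whose target shard changes when the shard count goes from 5 to 6.
moved = sum(1 for doc_id in doc_ids if before[doc_id] != after[doc_id])
```

Because changing the divisor sends many Documents to a different Shard, existing data would no longer be found where routing expects it, so a new count means rewriting the data.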

Query Context#

Calculates a relevance Score expressing how well a Document matches the search terms. Used with Filter Context in Bool queries. → Query DSL

Refresh#

Operation that makes memory buffer data searchable. Runs every 1 second by default. Consider lengthening or disabling the interval during large Bulk API loads. Distinct from Flush. → Indexing Strategy | Performance Tuning

Reindex#

Copies an existing Index into a new Index, optionally transforming Documents along the way. Used for Mapping changes and data migration. → Indexing Strategy

Replica Shard#

Copy of a Primary Shard. Improves read throughput and enables failover. Always placed on a different Node from its Primary Shard. → High Availability

S-Z#

Score#

Number indicating relevance between search term and Document. Calculated by BM25 algorithm. Adjustable with Boosting. → Search Relevance

Segment#

Immutable file unit that makes up a Shard. New Segments are created on Refresh and consolidated by Merge. → Performance Tuning

Shard#

Horizontal partition of an Index. Unit of distributed storage and parallel processing. Divided into Primary and Replica. → Core Components

Snapshot#

Backup of Index state at a specific point. Stored in remote storage (S3, GCS, etc.). Automated with SLM. → High Availability

TF (Term Frequency)#

Frequency of Term appearing in a Document. Component of BM25. → Search Relevance

Term#

Individual token generated after Analyzer processing. Stored in Inverted Index. → Data Modeling

Tokenizer#

Component of Analyzer that breaks text into tokens. Standard, Whitespace, Nori, etc. → Data Modeling

Translog#

Write-Ahead Log that prevents data loss. Replayed during recovery to restore operations not yet persisted by a Flush. → High Availability

Vector Search#

Search over embedding vectors that matches by meaning rather than exact Terms. Uses the kNN algorithm. Used for semantic search and similar-product recommendations. → Vector Search

Abbreviations#

Abbr	Full Name	Meaning	Reference
BM25	Best Matching 25	Default scoring algorithm	Search Relevance
CCR	Cross-Cluster Replication	Real-time cross-cluster replication	High Availability
DSL	Domain Specific Language	Query language	Query DSL
IDF	Inverse Document Frequency	Word rarity indicator	Search Relevance
ILM	Index Lifecycle Management	Index lifecycle management	Indexing Strategy
kNN	k-Nearest Neighbors	k-nearest neighbor search	Vector Search
SLM	Snapshot Lifecycle Management	Snapshot lifecycle management	High Availability
TF	Term Frequency	Word frequency indicator	Search Relevance

Next Steps#

Concepts - Elasticsearch core concepts
Quick Start - Quick start guide
References - Official docs, blogs
FAQ - Frequently asked questions