Version Information

This guide is written based on the following versions:

  • Elasticsearch: 8.11.x
  • Kibana: 8.11.x
  • Spring Boot: 3.2.x
  • Spring Data Elasticsearch: 5.2.x
  • Java: 17+

Some APIs or configurations may differ in other versions. In particular, Elasticsearch 7.x and 8.x differ significantly in security settings (8.x enables security by default) and in client APIs.

What is Elasticsearch?

Elasticsearch is a distributed search and analytics engine. It enables fast search across large volumes of data and near-real-time analysis.

Why is Elasticsearch Needed?

What problems arise when searching with LIKE '%keyword%' in an RDB?

| RDB Search Limitations | After Elasticsearch Adoption |
|---|---|
| Slow due to full table scan | Millisecond search via Inverted Index |
| No morphological analysis | Searching “Samsung Electronics” matches both “Samsung” and “Electronics” |
| No typo tolerance | Fuzzy search finds “Samsng” too |
| No relevance sorting | Accurate results ranked by relevance score |
| Single server limitations | Horizontal scaling via sharding, handles billions of documents |

Elasticsearch addresses these problems while providing near-real-time search, complex aggregations, and high availability.
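To make the difference between a LIKE scan and an inverted index concrete, here is a minimal sketch in plain Java. It is a toy model, not how Elasticsearch is implemented internally: the class name `InvertedIndexSketch`, its methods, and the sample documents are all made up for illustration, and the "analyzer" simply lowercases and splits on non-alphanumeric characters.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy comparison of a LIKE-style scan and an inverted-index lookup.
// All names are hypothetical; this is not Elasticsearch's real implementation.
class InvertedIndexSketch {

    // term -> sorted ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<String> documents = new ArrayList<>();

    // Lowercase the text and split it on anything that is not a letter or digit.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Store the document and register each of its terms in the postings map.
    int add(String text) {
        int id = documents.size();
        documents.add(text);
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(id);
        }
        return id;
    }

    // LIKE '%keyword%' equivalent: every document is inspected on every query.
    List<Integer> scan(String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < documents.size(); id++) {
            if (documents.get(id).toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(id);
            }
        }
        return hits;
    }

    // Inverted-index lookup: one map access per query term, results merged.
    List<Integer> search(String query) {
        Set<Integer> hits = new TreeSet<>();
        for (String term : analyze(query)) {
            hits.addAll(postings.getOrDefault(term, Set.of()));
        }
        return new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add("Samsung Galaxy phone");
        index.add("Consumer electronics sale");
        index.add("Apple laptop");

        System.out.println(index.scan("Samsung Electronics"));   // []
        System.out.println(index.search("Samsung Electronics")); // [0, 1]
    }
}
```

The scan finds nothing because no document contains the exact substring, while the term-based lookup matches documents containing either “samsung” or “electronics”. Real analyzers, relevance scoring, and fuzzy matching are considerably more involved than this sketch.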

When Should You Use Elasticsearch?

Suitable cases:

  • When full-text search for products, posts, etc. is needed
  • When time-series data analysis for logs and metrics is needed
  • When real-time aggregations for dashboards are needed
  • When autocomplete, typo correction, or synonym handling is needed
  • When RDB search becomes slow due to large data volume

May be overkill:

  • When only simple CRUD is needed (RDB is sufficient)
  • When transaction integrity is critical (Elasticsearch is eventually consistent)
  • When data volume is small and search requirements are simple
  • When the team has no capacity to operate additional infrastructure

Elasticsearch Limitations and Considerations

Realistic drawbacks you should know before adopting Elasticsearch:

| Limitation | Description | Mitigation |
|---|---|---|
| Operational Complexity | Cluster management, shard rebalancing, JVM tuning required | Dedicated personnel or managed services |
| Cost | Memory-intensive, minimum 4GB+ per node recommended | Proper capacity planning for data scale |
| Data Consistency | Eventually Consistent, not real-time (default 1-second refresh) | Adjust refresh settings if real-time is critical |
| Schema Changes | Cannot change existing field types, requires reindexing | Initial Mapping design is crucial |
| No JOIN Support | Cannot JOIN between tables, denormalization required | Data model redesign, Application-side JOIN |
| No Transaction Support | No ACID transactions | Keep RDB as primary, ES for search only |
| Learning Curve | Query DSL, Mapping, Analyzers require learning | Consider team capabilities |
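
The Data Consistency row is easier to reason about with a small model. The sketch below is a toy simulation in plain Java, not Elasticsearch code: `RefreshSketch` and its methods are hypothetical names. It only illustrates the idea that a newly indexed document is not visible to searches until the next refresh, which by default happens about once per second (controlled by the `index.refresh_interval` index setting).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of refresh-based search visibility.
// Hypothetical names; this does not reflect Elasticsearch internals in detail.
class RefreshSketch {
    private final List<String> buffer = new ArrayList<>();     // indexed, not yet searchable
    private final List<String> searchable = new ArrayList<>(); // visible to searches

    // Accept a document; it is acknowledged but not yet visible to search.
    void index(String doc) {
        buffer.add(doc);
    }

    // Make everything indexed so far visible to search.
    void refresh() {
        searchable.addAll(buffer);
        buffer.clear();
    }

    boolean isSearchable(String doc) {
        return searchable.contains(doc);
    }

    public static void main(String[] args) {
        RefreshSketch index = new RefreshSketch();
        index.index("order-1");
        System.out.println(index.isSearchable("order-1")); // false: still in the buffer
        index.refresh();
        System.out.println(index.isSearchable("order-1")); // true: visible after refresh
    }
}
```

If a caller must see its own write immediately, Elasticsearch lets a write request wait for the next refresh (`refresh=wait_for`) at the cost of higher write latency.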

Practical Advice: The safest pattern is to keep the RDB as the primary store and use Elasticsearch as a “search-only secondary store”. That way, core service functionality remains available even if Elasticsearch fails.
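
One way to apply this advice is sketched below. It is a minimal illustration under stated assumptions, not a prescribed design: `SearchWithFallback` is a hypothetical class, and the two `Function` parameters merely stand in for an Elasticsearch query and a slower RDB query. Falling back to the RDB for search is itself an assumption; some services may prefer to return an empty result or an error instead.

```java
import java.util.List;
import java.util.function.Function;

// Sketch of "RDB as primary, Elasticsearch for search only" with a degraded fallback.
// Class and parameter names are hypothetical.
class SearchWithFallback {
    private final Function<String, List<String>> searchEngine; // stands in for an Elasticsearch query
    private final Function<String, List<String>> primaryStore; // stands in for a LIKE query on the RDB

    SearchWithFallback(Function<String, List<String>> searchEngine,
                       Function<String, List<String>> primaryStore) {
        this.searchEngine = searchEngine;
        this.primaryStore = primaryStore;
    }

    List<String> search(String keyword) {
        try {
            return searchEngine.apply(keyword);
        } catch (RuntimeException e) {
            // Search engine unavailable: degrade to the slower primary-store query.
            return primaryStore.apply(keyword);
        }
    }

    public static void main(String[] args) {
        List<String> products = List.of("Samsung TV", "LG TV");

        // Simulated RDB query: a simple substring match over the rows.
        Function<String, List<String>> rdb =
                kw -> products.stream().filter(p -> p.contains(kw)).toList();

        // Simulated Elasticsearch outage.
        Function<String, List<String>> brokenEs =
                kw -> { throw new IllegalStateException("cluster unreachable"); };

        SearchWithFallback service = new SearchWithFallback(brokenEs, rdb);
        System.out.println(service.search("Samsung")); // [Samsung TV]
    }
}
```

Because writes always go to the RDB first, an Elasticsearch outage only degrades search quality and speed; no data is lost, and the index can be rebuilt from the primary store.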

Alternative Technology Comparison

Elasticsearch is not the only option:

| Technology | Characteristics | Best For |
|---|---|---|
| Elasticsearch | Full-stack search/analytics, most features | Large-scale search, log analysis, complex aggregations |
| OpenSearch | Fork of ES 7.10, AWS managed available | AWS environments, licensing concerns |
| Apache Solr | Long history, proven stability | Traditional enterprise environments |
| Meilisearch | Simple setup, quick start | Small scale, prototypes, instant search |
| Typesense | Easy configuration, built-in typo tolerance | Small services, rapid implementation |
| PostgreSQL FTS | No separate system needed | Simple search, already using PG |

Selection Criteria: If the data volume is under 1 million documents and search requirements are simple, PostgreSQL FTS or Meilisearch may suffice. Elasticsearch is better suited to large-scale, complex requirements.

RDB vs Elasticsearch

| Concept | RDB | Elasticsearch |
|---|---|---|
| Storage Unit | Row | Document (JSON) |
| Schema | Table Schema | Mapping |
| Table | Table | Index |
| Column | Column | Field |
| Index | B-Tree Index | Inverted Index |
| Join | JOIN | Nested, Parent-Child (limited) |
| Transaction | ACID | Eventually Consistent |

Key Difference: RDB is optimized for accurate retrieval of normalized data, while Elasticsearch is optimized for fast search of denormalized data.
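
The normalized-versus-denormalized distinction can be sketched as follows. This is an illustrative example only: the `ProductRow` and `CategoryRow` records, the field names, and the sample values are all hypothetical. It shows two normalized rows being joined once, at indexing time, into a single self-contained document, so that no JOIN is needed at search time.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of denormalizing RDB rows into one search document.
// Record and field names are hypothetical.
class DenormalizeSketch {
    record CategoryRow(long id, String name) {}
    record ProductRow(long id, String name, long categoryId) {}

    // Resolve the foreign key at indexing time and embed the category name in the document.
    static Map<String, Object> toDocument(ProductRow product, Map<Long, CategoryRow> categories) {
        CategoryRow category = categories.get(product.categoryId());
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", product.id());
        doc.put("name", product.name());
        // A missing category is stored as null rather than failing the whole document.
        doc.put("category_name", category == null ? null : category.name());
        return doc;
    }

    public static void main(String[] args) {
        Map<Long, CategoryRow> categories = Map.of(10L, new CategoryRow(10L, "Phones"));
        ProductRow product = new ProductRow(1L, "Galaxy", 10L);

        System.out.println(toDocument(product, categories));
        // {id=1, name=Galaxy, category_name=Phones}
    }
}
```

The trade-off is that renaming a category now means updating every document that embeds it, which is one reason the RDB usually remains the source of truth.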

What This Guide Covers

Quick Start

Store and search data in Elasticsearch in 5 minutes. See it working before diving into concepts.

Concepts

These sections go beyond “use it this way” and explain why it works this way.

| Topic | What You’ll Learn |
|---|---|
| Core Components | Roles and relationships of Cluster, Node, Index, Document, Shard |
| Data Modeling | Mapping, Field Type, Analyzer design |
| Query DSL | Writing search queries with Match, Term, Bool |
| Search Relevance | Improving search quality with Score, BM25, Boosting |
| Aggregations | Data analysis with Bucket and Metric aggregations |
| Indexing Strategy | Bulk indexing, Refresh, ILM settings |
| Cluster Management | Node configuration, shard allocation, status monitoring |
| Performance Tuning | Query optimization, caching, JVM settings |
| High Availability | Replica, Snapshot, failure response |

Hands-on Examples

Executable example code based on Spring Boot.

Appendix

  • Glossary - Quick reference for Elasticsearch terms
  • FAQ - Frequently asked questions
  • References - Official docs and additional learning resources

Prerequisites

  • Required: REST API basics, JSON format understanding
  • Helpful: Java/Spring Boot experience, RDB usage experience

Suggested Learning Path

If you're new:          Quick Start → Core Components → Data Modeling → Basic Examples
Search implementation:  Query DSL → Search Relevance → Product Search System
Data analysis:          Aggregations → Indexing Strategy
Operations prep:        Cluster Management → Performance Tuning → High Availability

Each document can be read independently, but if you’re new, we recommend the order above.