Step-by-step guides for solving specific problems. Each document explains how to achieve a concrete goal.
Guide List
Troubleshooting OutOfMemoryError
Diagnose and resolve the most common memory shortage errors in Spark.
- Distinguishing Driver OOM vs Executor OOM
- Optimizing memory settings
- Adjusting partition sizes
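As a rough illustration of adjusting partition sizes, the sketch below estimates a partition count from the total data size. The helper name `suggest_partition_count` and the 128 MiB target are assumptions chosen for illustration, not values prescribed by these guides. The right target depends on executor memory and workload.

```python
# Assumption: ~128 MiB per partition is a common starting point, not a rule.
TARGET_PARTITION_BYTES = 128 * 1024 * 1024


def suggest_partition_count(total_bytes: int,
                            target_bytes: int = TARGET_PARTITION_BYTES) -> int:
    """Hypothetical helper: smallest partition count that keeps each
    partition at or under target_bytes, assuming evenly spread data."""
    if total_bytes < 0 or target_bytes <= 0:
        raise ValueError("total_bytes must be >= 0 and target_bytes > 0")
    # Integer ceiling division; always return at least one partition.
    return max(1, -(-total_bytes // target_bytes))


if __name__ == "__main__":
    ten_gib = 10 * 1024 ** 3
    print(suggest_partition_count(ten_gib))
```

The result could then be passed to `df.repartition(n)` or used as a value for `spark.sql.shuffle.partitions`. Skewed data will not split this evenly, so treat the number as a first guess.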
Fix performance degradation caused by data concentration in specific partitions.
- How to diagnose skew
- Salting techniques
- Enabling AQE skew join
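The idea behind salting can be sketched without a cluster. The pure-Python simulation below uses a CRC32-based stand-in for a hash partitioner (not Spark's actual partitioner) and hypothetical constants `NUM_PARTITIONS` and `SALT_BUCKETS` to show how appending a salt suffix spreads a single hot key across several partitions.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 8  # hypothetical shuffle partition count
SALT_BUCKETS = 8    # hypothetical number of salt values per hot key


def partition_of(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Deterministic stand-in for a hash partitioner (not Spark's real one)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def add_salt(key: str, row_index: int, buckets: int = SALT_BUCKETS) -> str:
    """Append a salt suffix so one hot key becomes `buckets` distinct keys."""
    return f"{key}_{row_index % buckets}"


def rows_per_partition(keys) -> Counter:
    """Count how many rows land in each simulated partition."""
    return Counter(partition_of(k) for k in keys)


# 1000 rows that all share the same hot key.
hot_keys = ["hot"] * 1000
unsalted = rows_per_partition(hot_keys)
salted = rows_per_partition(add_salt(k, i) for i, k in enumerate(hot_keys))

print("unsalted largest partition:", max(unsalted.values()))
print("salted largest partition:  ", max(salted.values()))
```

In a real join, the other side of the join has to be replicated once per salt value so that every salted key still finds its match. On Spark 3.x, setting `spark.sql.adaptive.enabled` and `spark.sql.adaptive.skewJoin.enabled` to `true` lets AQE split skewed partitions automatically, which may make manual salting unnecessary.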
Improve Spark job performance by reducing network I/O.
- Eliminating unnecessary shuffles
- Leveraging broadcast joins
- Optimizing partition count
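To make the broadcast-join idea concrete, here is a minimal sketch of the size check involved. The helper `choose_join_strategy` is hypothetical and simplifies Spark's planner considerably (it ignores join type, hints, and statistics accuracy). The 10 MB default mirrors the documented default of `spark.sql.autoBroadcastJoinThreshold`.

```python
# Documented default of spark.sql.autoBroadcastJoinThreshold (10 MB).
DEFAULT_BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024


def choose_join_strategy(left_bytes: int,
                         right_bytes: int,
                         threshold_bytes: int = DEFAULT_BROADCAST_THRESHOLD_BYTES) -> str:
    """Hypothetical helper: pick 'broadcast' when the smaller side fits
    under the threshold, otherwise fall back to a shuffle-based join."""
    # In Spark, a threshold of -1 disables automatic broadcasting.
    if threshold_bytes < 0:
        return "shuffle"
    smaller_side = min(left_bytes, right_bytes)
    return "broadcast" if smaller_side <= threshold_bytes else "shuffle"


if __name__ == "__main__":
    fact_table = 50 * 1024 ** 3      # 50 GiB
    dimension_table = 5 * 1024 ** 2  # 5 MiB
    print(choose_join_strategy(fact_table, dimension_table))
```

In PySpark, a broadcast can also be requested explicitly by wrapping the small DataFrame with `pyspark.sql.functions.broadcast` before joining, which avoids shuffling the large side.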
Identify performance bottlenecks and diagnose root causes from each tab of the Spark UI.
- Trace bottlenecks in Jobs → Stages → Tasks order
- Diagnose data skew, GC issues, and excessive shuffle
- UI access methods by environment (local/YARN/K8s)
How to Use These Guides
Each guide is structured as follows:
- Problem Definition: When you need this guide
- Prerequisites: What you need before starting
- Step-by-Step Solution: Including commands and code
- Verification: How to confirm the problem is resolved
If you get stuck during troubleshooting, refer to the FAQ.