Step-by-step guides for solving specific problems. Each document explains how to achieve a concrete goal.
Guide List
Troubleshooting OutOfMemoryError
Diagnose and resolve the most common memory shortage errors in Spark.
- Distinguishing Driver OOM vs Executor OOM
- Optimizing memory settings
- Adjusting partition sizes
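As a rough sketch of what the memory settings step can look like, the snippet below renders a few memory-related Spark properties as `spark-submit` arguments. The property names are standard Spark configuration keys; the values, the helper `to_submit_args`, and the script name `app.py` are hypothetical and chosen only for illustration.

```python
# Hypothetical values for illustration only; size these for your own cluster.
memory_conf = {
    "spark.driver.memory": "4g",            # heap for the driver process
    "spark.executor.memory": "8g",          # heap for each executor
    "spark.executor.memoryOverhead": "1g",  # off-heap overhead per executor
}


def to_submit_args(conf):
    """Render a dict of Spark properties as spark-submit --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


submit_args = to_submit_args(memory_conf)
# app.py is a placeholder for your own application script.
print(" ".join(["spark-submit", *submit_args, "app.py"]))
```

Raising the driver settings targets a Driver OOM, while the executor settings target an Executor OOM, so it helps to know which one you are facing before changing anything.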
Fix performance degradation caused by data concentration in specific partitions.
- How to diagnose skew
- Salting techniques
- Enabling AQE skew join
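The idea behind salting can be sketched in plain Python, without Spark: a random suffix is appended to each key on the skewed side so that one hot key is spread over several buckets, and the other side of the join is expanded to carry every possible suffix. The names `NUM_SALTS`, `salt_key`, and `explode_key` are hypothetical and exist only for this illustration.

```python
import random
from collections import Counter

NUM_SALTS = 4  # hypothetical number of salt buckets


def salt_key(key, num_salts, rng):
    """Skewed side: append a random salt so one hot key maps to several buckets."""
    return f"{key}_{rng.randrange(num_salts)}"


def explode_key(key, num_salts):
    """Other side: emit every salted variant so each salted row still finds a match."""
    return [f"{key}_{i}" for i in range(num_salts)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
skewed_keys = ["hot"] * 1000 + ["cold"] * 10

salted = [salt_key(k, NUM_SALTS, rng) for k in skewed_keys]
counts = Counter(salted)

# The 1000 "hot" rows are now split across up to NUM_SALTS buckets.
for bucket, n in sorted(counts.items()):
    print(bucket, n)
```

In a real job the same transformation would be applied to the join columns of both DataFrames; the trade-off is that the non-skewed side grows by a factor of `NUM_SALTS`.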
Improve Spark job performance by reducing network I/O.
- Eliminating unnecessary shuffles
- Leveraging broadcast joins
- Optimizing partition count
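One common starting point for choosing a partition count is to divide the data volume by a target partition size. The sketch below assumes a target of 128 MiB per partition, which is a rule of thumb rather than a value taken from these guides; `estimate_partitions` is a hypothetical helper.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3


def estimate_partitions(total_bytes, target_partition_bytes=128 * MIB):
    """Estimate a partition count from data size (the 128 MiB target is an assumption)."""
    return max(1, math.ceil(total_bytes / target_partition_bytes))


# Example: 50 GiB of shuffle data at 128 MiB per partition.
num_partitions = estimate_partitions(50 * GIB)
print(num_partitions)  # 400
```

The result could then be applied through `spark.sql.shuffle.partitions` or an explicit `repartition` call, and adjusted after checking task sizes in the Spark UI.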
How to Use These Guides
Each guide is structured as follows:
- Problem Definition: When you need this guide
- Prerequisites: What you need before starting
- Step-by-Step Solution: Including commands and code
- Verification: How to confirm the problem is resolved
If you get stuck during troubleshooting, refer to the FAQ.