<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Concepts on Advanced Beginner</title><link>https://advanced-beginner.github.io/en/docs/observability/concepts/</link><description>Recent content in Concepts on Advanced Beginner</description><generator>Hugo</generator><language>en-US</language><managingEditor>d8lzz1gpw@mozmail.com (kimbenji)</managingEditor><webMaster>d8lzz1gpw@mozmail.com (kimbenji)</webMaster><lastBuildDate>Mon, 23 Mar 2026 19:08:15 +0900</lastBuildDate><atom:link href="https://advanced-beginner.github.io/en/docs/observability/concepts/index.xml" rel="self" type="application/rss+xml"/><item><title>Three Pillars of Observability</title><link>https://advanced-beginner.github.io/en/docs/observability/concepts/three-pillars/</link><pubDate>Mon, 12 Jan 2026 00:00:00 +0000</pubDate><author>d8lzz1gpw@mozmail.com (kimbenji)</author><guid>https://advanced-beginner.github.io/en/docs/observability/concepts/three-pillars/</guid><description>&lt;blockquote class='book-hint '&gt;
&lt;p&gt;&lt;strong&gt;Target Audience&lt;/strong&gt;: Developers new to Observability concepts
&lt;strong&gt;Prerequisites&lt;/strong&gt;: Basic understanding of web application architecture
&lt;strong&gt;After Reading&lt;/strong&gt;: You&amp;rsquo;ll understand the role of each pillar and know when to use which&lt;/p&gt;
&lt;/blockquote&gt;&lt;h2 id="tldr"&gt;TL;DR&lt;a class="anchor" href="#tldr"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;blockquote class="book-hint info"&gt;&lt;p&gt;&lt;strong&gt;Key Summary:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Metrics&lt;/strong&gt;: &amp;ldquo;How much?&amp;rdquo; - Numerically measurable states (CPU 80%, response time 200ms)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Logs&lt;/strong&gt;: &amp;ldquo;What happened?&amp;rdquo; - Detailed records of individual events&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Traces&lt;/strong&gt;: &amp;ldquo;From where to where?&amp;rdquo; - Tracking the entire path of a request&lt;/li&gt;
&lt;li&gt;The three pillars are &lt;strong&gt;complementary&lt;/strong&gt; and most effective when used together&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;h2 id="why-are-all-three-pillars-necessary"&gt;Why Are All Three Pillars Necessary?&lt;a class="anchor" href="#why-are-all-three-pillars-necessary"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Analogy: A Doctor&amp;rsquo;s Diagnostic Process&lt;/strong&gt;&lt;/p&gt;</description></item><item><title>Metrics Fundamentals</title><link>https://advanced-beginner.github.io/en/docs/observability/concepts/metrics-fundamentals/</link><pubDate>Mon, 12 Jan 2026 00:00:00 +0000</pubDate><author>d8lzz1gpw@mozmail.com (kimbenji)</author><guid>https://advanced-beginner.github.io/en/docs/observability/concepts/metrics-fundamentals/</guid><description>&lt;blockquote class='book-hint '&gt;
&lt;p&gt;&lt;strong&gt;Target Audience&lt;/strong&gt;: Developers designing Prometheus metrics for the first time
&lt;strong&gt;Prerequisites&lt;/strong&gt;: &lt;a href="https://advanced-beginner.github.io/en/docs/observability/concepts/three-pillars/"&gt;Three Pillars of Observability&lt;/a&gt;
&lt;strong&gt;After Reading&lt;/strong&gt;: You&amp;rsquo;ll be able to select the appropriate metric type and implement it correctly&lt;/p&gt;
&lt;/blockquote&gt;&lt;h2 id="tldr"&gt;TL;DR&lt;a class="anchor" href="#tldr"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;blockquote class="book-hint info"&gt;&lt;p&gt;&lt;strong&gt;Key Summary:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Counter&lt;/strong&gt;: Cumulative value that only increases (request count, error count) → Calculate the per-second rate with &lt;code&gt;rate()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Gauge&lt;/strong&gt;: Current value that can go up or down (temperature, memory usage) → Use as-is or aggregate with &lt;code&gt;avg()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Histogram&lt;/strong&gt;: Distribution measurement (response time) → Calculate percentiles with &lt;code&gt;histogram_quantile()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Summary&lt;/strong&gt;: Percentiles calculated on the client, which cannot be aggregated across instances (rarely used)&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
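&lt;p&gt;To make the Counter and Histogram bullets concrete, the following is a minimal Python sketch of the arithmetic behind a rate and a bucket-based percentile. It is a simplification under stated assumptions, not the Prometheus implementation: the function names &lt;code&gt;per_second_rate&lt;/code&gt; and &lt;code&gt;bucket_quantile&lt;/code&gt; are hypothetical, the rate ignores the window extrapolation that &lt;code&gt;rate()&lt;/code&gt; performs, and the quantile assumes a value of q greater than 0 and at most 1, a non-empty histogram, and a lowest bucket that starts at zero.&lt;/p&gt;

```python
def per_second_rate(samples):
    """Average per-second increase of a counter over the sampled window.

    samples: list of (unix_seconds, cumulative_value) pairs, oldest first.
    Simplified sketch: no extrapolation to the window edges.
    """
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        delta = cur - prev
        # A drop means the counter was reset (for example a restart), so
        # the whole current value is counted as new increase.
        increase += delta if delta >= 0 else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


def bucket_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper_bound.
    Assumes q is greater than 0 and at most 1, and the histogram is not empty.
    """
    rank = q * buckets[-1][1]
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return buckets[-1][0]
```

&lt;p&gt;For example, with buckets &lt;code&gt;[(0.1, 50), (0.5, 90), (1.0, 100)]&lt;/code&gt; the 0.95 quantile falls halfway into the last bucket, so the estimate is 0.75 seconds. The result is only as precise as the bucket boundaries, which is why bucket layout matters when designing a Histogram.&lt;/p&gt;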

&lt;h2 id="why-are-metric-types-important"&gt;Why Are Metric Types Important?&lt;a class="anchor" href="#why-are-metric-types-important"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Metric types are not just a technical choice. They &lt;strong&gt;reflect the nature of the data&lt;/strong&gt;.&lt;/p&gt;</description></item><item><title>Prometheus Architecture</title><link>https://advanced-beginner.github.io/en/docs/observability/concepts/prometheus-architecture/</link><pubDate>Mon, 12 Jan 2026 00:00:00 +0000</pubDate><author>d8lzz1gpw@mozmail.com (kimbenji)</author><guid>https://advanced-beginner.github.io/en/docs/observability/concepts/prometheus-architecture/</guid><description>&lt;blockquote class='book-hint '&gt;
&lt;p&gt;&lt;strong&gt;Target Audience&lt;/strong&gt;: Developers who want to operate or deeply understand Prometheus
&lt;strong&gt;Prerequisites&lt;/strong&gt;: &lt;a href="https://advanced-beginner.github.io/en/docs/observability/concepts/metrics-fundamentals/"&gt;Metrics Fundamentals&lt;/a&gt;
&lt;strong&gt;After Reading&lt;/strong&gt;: You&amp;rsquo;ll understand Prometheus design philosophy and components, and be able to plan operational strategies&lt;/p&gt;
&lt;/blockquote&gt;&lt;h2 id="tldr"&gt;TL;DR&lt;a class="anchor" href="#tldr"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;blockquote class="book-hint info"&gt;&lt;p&gt;&lt;strong&gt;Key Summary:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Pull Model&lt;/strong&gt;: Prometheus scrapes metrics from targets over HTTP; targets do not push them&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Time Series DB&lt;/strong&gt;: Label-based multidimensional data model&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Service Discovery&lt;/strong&gt;: Auto-discover targets with Kubernetes, Consul, etc.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Single Server Design&lt;/strong&gt;: Optimized for a single server rather than horizontal scaling (scale out with Federation)&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
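&lt;p&gt;As an illustration of the pull model, here is a minimal Python sketch of what a target does: it keeps counters in memory and renders them in the Prometheus text exposition format whenever &lt;code&gt;/metrics&lt;/code&gt; is requested. The metric name, the label values, the &lt;code&gt;REQUESTS&lt;/code&gt; dictionary, and the port are assumptions made for the example; a real service would normally use an official client library rather than hand-rolled rendering.&lt;/p&gt;

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process state: maps (method, status) to a cumulative count.
REQUESTS = {("GET", "200"): 42, ("POST", "500"): 3}


def render_metrics():
    """Render the counters in the Prometheus text exposition format."""
    lines = [
        "# HELP http_requests_total Total HTTP requests handled.",
        "# TYPE http_requests_total counter",
    ]
    for (method, status), value in sorted(REQUESTS.items()):
        # Assumes label values contain no quotes or backslashes.
        labels = f'method="{method}",status="{status}"'
        lines.append(f"http_requests_total{{{labels}}} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Answers scrapes; the target never initiates a connection itself."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever answering scrapes; call this from your own entry point."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

&lt;p&gt;The target only exposes its current state. The scrape interval, the list of targets, and the timestamps all belong to the Prometheus server, which is the essence of the pull design.&lt;/p&gt;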

&lt;h2 id="prometheus-overall-structure"&gt;Prometheus Overall Structure&lt;a class="anchor" href="#prometheus-overall-structure"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;pre class="mermaid"&gt;graph TB
 subgraph &amp;#34;Data Collection&amp;#34;
 T1[&amp;#34;Target 1&amp;lt;br&amp;gt;/metrics&amp;#34;]
 T2[&amp;#34;Target 2&amp;lt;br&amp;gt;/metrics&amp;#34;]
 T3[&amp;#34;Target 3&amp;lt;br&amp;gt;/metrics&amp;#34;]
 PG[&amp;#34;Pushgateway&amp;lt;br&amp;gt;(for batch jobs)&amp;#34;]
 end

 subgraph &amp;#34;Prometheus Server&amp;#34;
 R[&amp;#34;Retrieval&amp;lt;br&amp;gt;(Scraper)&amp;#34;]
 TSDB[&amp;#34;TSDB&amp;lt;br&amp;gt;(Time Series DB)&amp;#34;]
 HTTP[&amp;#34;HTTP Server&amp;lt;br&amp;gt;(PromQL API)&amp;#34;]
 R --&amp;gt; TSDB
 TSDB --&amp;gt; HTTP
 end

 subgraph &amp;#34;Service Discovery&amp;#34;
 K8S[&amp;#34;Kubernetes&amp;#34;]
 CONSUL[&amp;#34;Consul&amp;#34;]
 FILE[&amp;#34;File SD&amp;#34;]
 end

 subgraph &amp;#34;Alerting&amp;#34;
 AM[&amp;#34;Alertmanager&amp;#34;]
 SLACK[&amp;#34;Slack&amp;#34;]
 PD[&amp;#34;PagerDuty&amp;#34;]
 end

 subgraph &amp;#34;Visualization&amp;#34;
 GF[&amp;#34;Grafana&amp;#34;]
 end

 T1 --&amp;gt; |&amp;#34;pull&amp;#34;| R
 T2 --&amp;gt; |&amp;#34;pull&amp;#34;| R
 T3 --&amp;gt; |&amp;#34;pull&amp;#34;| R
 PG --&amp;gt; |&amp;#34;pull&amp;#34;| R

 K8S --&amp;gt; |&amp;#34;target list&amp;#34;| R
 CONSUL --&amp;gt; |&amp;#34;target list&amp;#34;| R
 FILE --&amp;gt; |&amp;#34;target list&amp;#34;| R

 TSDB --&amp;gt; |&amp;#34;alerting rules&amp;#34;| AM
 AM --&amp;gt; SLACK
 AM --&amp;gt; PD

 HTTP --&amp;gt; |&amp;#34;PromQL&amp;#34;| GF&lt;/pre&gt;&lt;hr&gt;
&lt;h2 id="why-pull-model"&gt;Why Pull Model?&lt;a class="anchor" href="#why-pull-model"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="two-philosophies-of-monitoring"&gt;Two Philosophies of Monitoring&lt;a class="anchor" href="#two-philosophies-of-monitoring"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;There are broadly two philosophies for collecting metrics.&lt;/p&gt;</description></item><item><title>Log Aggregation</title><link>https://advanced-beginner.github.io/en/docs/observability/concepts/log-aggregation/</link><pubDate>Mon, 12 Jan 2026 00:00:00 +0000</pubDate><author>d8lzz1gpw@mozmail.com (kimbenji)</author><guid>https://advanced-beginner.github.io/en/docs/observability/concepts/log-aggregation/</guid><description>&lt;blockquote class='book-hint '&gt;
&lt;p&gt;&lt;strong&gt;Target Audience&lt;/strong&gt;: Developers and SREs designing log systems
&lt;strong&gt;Prerequisites&lt;/strong&gt;: &lt;a href="https://advanced-beginner.github.io/en/docs/observability/concepts/three-pillars/"&gt;Three Pillars of Observability&lt;/a&gt;
&lt;strong&gt;After Reading&lt;/strong&gt;: You&amp;rsquo;ll be able to select a log collection system and design effective logs&lt;/p&gt;
&lt;/blockquote&gt;&lt;h2 id="tldr"&gt;TL;DR&lt;a class="anchor" href="#tldr"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;blockquote class="book-hint info"&gt;&lt;p&gt;&lt;strong&gt;Key Summary:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Loki&lt;/strong&gt;: Label-based, lightweight, excellent Grafana integration&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ELK&lt;/strong&gt;: Powerful full-text search, suitable for large-scale analysis&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Structured Logs&lt;/strong&gt;: JSON format for easy field-by-field search&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Log Levels&lt;/strong&gt;: Long-term retention is recommended only for ERROR and above&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
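&lt;p&gt;The Structured Logs bullet can be sketched with the Python standard library alone. The formatter below emits one JSON object per log line so that a backend such as Loki or Elasticsearch can filter by field. The class name &lt;code&gt;JsonFormatter&lt;/code&gt; and the convention of passing extra fields under a &lt;code&gt;fields&lt;/code&gt; key are assumptions for this example, not part of any particular logging stack.&lt;/p&gt;

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format each log record as a single-line JSON object."""

    def format(self, record):
        entry = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: callers pass extra={"fields": {...}}
        # to attach searchable key/value pairs to the entry.
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry)


def build_logger(name, stream=None):
    """Return a logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

&lt;p&gt;A call such as &lt;code&gt;build_logger("orders").error("payment failed", extra={"fields": {"order_id": "o-1"}})&lt;/code&gt; then produces a line in which &lt;code&gt;level&lt;/code&gt; and &lt;code&gt;order_id&lt;/code&gt; are separate, queryable fields instead of substrings of free text.&lt;/p&gt;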

&lt;h2 id="why-is-log-aggregation-necessary"&gt;Why Is Log Aggregation Necessary?&lt;a class="anchor" href="#why-is-log-aggregation-necessary"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;In microservices environments, applications run across dozens or hundreds of containers. If each container generates its own log file, which server&amp;rsquo;s log should you look at when an incident occurs?&lt;/p&gt;</description></item><item><title>Distributed Tracing</title><link>https://advanced-beginner.github.io/en/docs/observability/concepts/distributed-tracing/</link><pubDate>Mon, 12 Jan 2026 00:00:00 +0000</pubDate><author>d8lzz1gpw@mozmail.com (kimbenji)</author><guid>https://advanced-beginner.github.io/en/docs/observability/concepts/distributed-tracing/</guid><description>&lt;blockquote class='book-hint '&gt;
&lt;p&gt;&lt;strong&gt;Target Audience&lt;/strong&gt;: Developers and SREs operating microservices
&lt;strong&gt;Prerequisites&lt;/strong&gt;: &lt;a href="https://advanced-beginner.github.io/en/docs/observability/concepts/three-pillars/"&gt;Three Pillars of Observability&lt;/a&gt;
&lt;strong&gt;After Reading&lt;/strong&gt;: You&amp;rsquo;ll understand distributed tracing and be able to analyze request flows between services&lt;/p&gt;
&lt;/blockquote&gt;&lt;h2 id="tldr"&gt;TL;DR&lt;a class="anchor" href="#tldr"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;blockquote class="book-hint info"&gt;&lt;p&gt;&lt;strong&gt;Key Summary:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Trace&lt;/strong&gt;: Entire path of one request (composed of multiple Spans)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Span&lt;/strong&gt;: Single unit of work (start/end time, metadata)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Context Propagation&lt;/strong&gt;: Passing Trace ID between services&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sampling&lt;/strong&gt;: Store only a portion of total traces (cost optimization)&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
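&lt;p&gt;The four terms above fit together as in the following minimal Python sketch. It is an illustration, not a tracing library: &lt;code&gt;Span&lt;/code&gt;, &lt;code&gt;start_span&lt;/code&gt;, &lt;code&gt;inject&lt;/code&gt;, &lt;code&gt;extract&lt;/code&gt;, and &lt;code&gt;should_sample&lt;/code&gt; are hypothetical names. The header layout follows the W3C Trace Context &lt;code&gt;traceparent&lt;/code&gt; format (version, trace ID, parent span ID, flags), and the modulo-based sampling rule is an assumed, simplified scheme.&lt;/p&gt;

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    trace_id: str             # shared by every span of one request
    span_id: str              # unique per unit of work
    parent_id: Optional[str]  # None for the root span
    name: str


def start_span(name, parent=None):
    """Start a root span, or a child span that joins the parent's trace."""
    trace_id = parent.trace_id if parent else secrets.token_hex(16)
    parent_id = parent.span_id if parent else None
    return Span(trace_id, secrets.token_hex(8), parent_id, name)


def inject(span, sampled):
    """Context propagation: encode the span as a traceparent header."""
    flags = "01" if sampled else "00"
    return {"traceparent": f"00-{span.trace_id}-{span.span_id}-{flags}"}


def extract(headers):
    """Rebuild the caller's span identity on the receiving service."""
    _version, trace_id, span_id, flags = headers["traceparent"].split("-")
    return Span(trace_id, span_id, None, "remote-parent"), flags == "01"


def should_sample(trace_id, percent):
    """Keep roughly `percent` of traces, decided once per trace ID."""
    return percent > int(trace_id, 16) % 100
```

&lt;p&gt;Service A calls &lt;code&gt;inject&lt;/code&gt; on its current span and sends the header with the outgoing request. Service B calls &lt;code&gt;extract&lt;/code&gt; and passes the result as &lt;code&gt;parent&lt;/code&gt; to &lt;code&gt;start_span&lt;/code&gt;, so both spans share one trace ID. Deciding sampling from the trace ID means every service reaches the same keep-or-drop decision for a given request.&lt;/p&gt;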

&lt;h2 id="why-is-distributed-tracing-necessary"&gt;Why Is Distributed Tracing Necessary?&lt;a class="anchor" href="#why-is-distributed-tracing-necessary"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;In microservices, a single request passes through multiple services. It&amp;rsquo;s hard to identify where delays occur.&lt;/p&gt;</description></item><item><title>OpenTelemetry</title><link>https://advanced-beginner.github.io/en/docs/observability/concepts/opentelemetry/</link><pubDate>Mon, 12 Jan 2026 00:00:00 +0000</pubDate><author>d8lzz1gpw@mozmail.com (kimbenji)</author><guid>https://advanced-beginner.github.io/en/docs/observability/concepts/opentelemetry/</guid><description>&lt;blockquote class='book-hint '&gt;
&lt;p&gt;&lt;strong&gt;Target Audience&lt;/strong&gt;: Developers and SREs standardizing observability systems
&lt;strong&gt;Prerequisites&lt;/strong&gt;: &lt;a href="https://advanced-beginner.github.io/en/docs/observability/concepts/three-pillars/"&gt;Three Pillars of Observability&lt;/a&gt;, &lt;a href="https://advanced-beginner.github.io/en/docs/observability/concepts/distributed-tracing/"&gt;Distributed Tracing&lt;/a&gt;
&lt;strong&gt;After Reading&lt;/strong&gt;: You&amp;rsquo;ll understand OpenTelemetry and be able to apply it to your projects&lt;/p&gt;
&lt;/blockquote&gt;&lt;h2 id="tldr"&gt;TL;DR&lt;a class="anchor" href="#tldr"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;blockquote class="book-hint info"&gt;&lt;p&gt;&lt;strong&gt;Key Summary:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;OpenTelemetry (OTel)&lt;/strong&gt;: &lt;strong&gt;Vendor-neutral standard&lt;/strong&gt; for Metrics, Logs, and Traces&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Components&lt;/strong&gt;: SDK (instrumentation) + Collector (collect/transform/export)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Advantages&lt;/strong&gt;: No vendor lock-in, instrument once to support multiple backends&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CNCF Project&lt;/strong&gt;: Second most active project after Kubernetes&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;h2 id="what-is-opentelemetry"&gt;What Is OpenTelemetry?&lt;a class="anchor" href="#what-is-opentelemetry"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;OpenTelemetry is a &lt;strong&gt;standard framework for generating, collecting, and exporting observability data&lt;/strong&gt;.&lt;/p&gt;</description></item><item><title>Dashboard Design</title><link>https://advanced-beginner.github.io/en/docs/observability/concepts/dashboard-design/</link><pubDate>Mon, 12 Jan 2026 00:00:00 +0000</pubDate><author>d8lzz1gpw@mozmail.com (kimbenji)</author><guid>https://advanced-beginner.github.io/en/docs/observability/concepts/dashboard-design/</guid><description>&lt;blockquote class='book-hint '&gt;
&lt;p&gt;&lt;strong&gt;Target Audience&lt;/strong&gt;: Developers and SREs designing Grafana dashboards
&lt;strong&gt;Prerequisites&lt;/strong&gt;: &lt;a href="https://advanced-beginner.github.io/en/docs/observability/concepts/golden-signals/"&gt;SRE Golden Signals&lt;/a&gt;
&lt;strong&gt;After Reading&lt;/strong&gt;: You&amp;rsquo;ll be able to design effective dashboards and quickly identify problems&lt;/p&gt;
&lt;/blockquote&gt;&lt;h2 id="tldr"&gt;TL;DR&lt;a class="anchor" href="#tldr"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;blockquote class="book-hint info"&gt;&lt;p&gt;&lt;strong&gt;Key Summary:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Hierarchical Structure&lt;/strong&gt;: Drill down in the order Overview → Service → Detail&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;5-Second Rule&lt;/strong&gt;: You must be able to tell whether there is a problem within 5 seconds&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Golden Signals First&lt;/strong&gt;: Latency, Traffic, Errors, Saturation&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Remove Unnecessary Info&lt;/strong&gt;: Exclude metrics that don&amp;rsquo;t lead to action&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;h2 id="dashboard-design-principles"&gt;Dashboard Design Principles&lt;a class="anchor" href="#dashboard-design-principles"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="1-5-second-rule"&gt;1. 5-Second Rule&lt;a class="anchor" href="#1-5-second-rule"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Within &lt;strong&gt;5 seconds&lt;/strong&gt; of viewing the dashboard, you should know:&lt;/p&gt;</description></item></channel></rss>