Grafana: Infrastructure Monitoring for Self-Hosted Environments

Learn how to deploy Grafana for self-hosted infrastructure monitoring. Covers container metrics, Cloudflare analytics, security dashboards, and production alerting configuration.

06 / 01 / 2026 · 12 min read

OBSERVABILITY

Why Observability Is Non-Negotiable for Self-Hosted Infrastructure

Running services without monitoring is operating blind. You may know your services are up, until they are not. Without metrics, the first sign of a problem is often an outage, a full disk, or a security incident that has been ongoing for days without your knowledge.

Grafana gives you a single pane of glass across your entire self-hosted stack. Container resource consumption, network traffic patterns, security events, and geographic visitor data all become visible in real time. Problems that would take hours to diagnose manually become immediately obvious on a well-designed dashboard.

For self-hosted infrastructure specifically, monitoring serves a second purpose beyond availability: security visibility. Anomalous traffic patterns, unexpected geographic origins, and sudden spikes in blocked requests are all early indicators of attacks or misconfigurations, but only if you have the data to see them. See the CrowdSec guide.

The Observability Stack

A complete monitoring setup combines three components: a metrics collector that gathers data from containers and services, a time-series database that stores and queries that data efficiently, and Grafana as the visualization and alerting layer. Each component is replaceable — Grafana works with dozens of data sources including Prometheus, InfluxDB, Loki, and external APIs.

Monitoring Data Flow

Metrics flow from containers and external APIs through collectors and into Grafana's datasource layer, where they are queried and visualized in real time.

01 Container Metrics → 02 Metrics Collector → 03 Time-Series DB → 04 Grafana Datasource → 05 Dashboard & Alerts

From container metrics and Cloudflare API to unified Grafana dashboard

Pull-Based Collection

Prometheus scrapes metrics from exporters on a configurable interval. Services expose metrics endpoints and the collector pulls from them; the services themselves never push.
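To make the pull model concrete, here is a minimal sketch of a service exposing a `/metrics` endpoint in the Prometheus text exposition format, using only the Python standard library. The metric names (`demo_disk_free_bytes`, `demo_disk_total_bytes`), the `serve` helper, and port 9200 are illustrative assumptions, not part of any existing exporter.

```python
"""Illustrative pull-based metrics endpoint (stdlib only, hypothetical names)."""
import shutil
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(path: str = "/") -> str:
    """Render disk usage for `path` in the Prometheus text exposition format."""
    usage = shutil.disk_usage(path)
    lines = [
        "# HELP demo_disk_free_bytes Free bytes on the monitored filesystem.",
        "# TYPE demo_disk_free_bytes gauge",
        f'demo_disk_free_bytes{{path="{path}"}} {usage.free}',
        "# HELP demo_disk_total_bytes Total bytes on the monitored filesystem.",
        "# TYPE demo_disk_total_bytes gauge",
        f'demo_disk_total_bytes{{path="{path}"}} {usage.total}',
    ]
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Answers GET /metrics; the collector decides when to call it."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 9200) -> None:
    """Block forever serving metrics; the port is an arbitrary choice."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint; a scrape target pointing at that port would then be polled on the collector's interval. In practice, ready-made exporters cover most host and container metrics, so a hand-written endpoint like this is mainly useful for application-specific values.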

Long-Term Retention

Time-series databases compress historical data efficiently. Months of metrics consume far less disk than raw logs while remaining queryable at any resolution.
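A rough capacity estimate helps when choosing a retention period. The sketch below applies the sizing rule from the Prometheus storage documentation (retention time × ingested samples per second × bytes per sample, with roughly 1 to 2 bytes per sample after compression). The series count in the usage note is an assumed example, not a measurement.

```python
def estimate_disk_bytes(active_series: int,
                        scrape_interval_s: float,
                        retention_days: float,
                        bytes_per_sample: float = 2.0) -> float:
    """Approximate on-disk size of a Prometheus TSDB.

    Each active series yields one sample per scrape, so the ingestion
    rate is active_series / scrape_interval_s samples per second.
    """
    samples_per_second = active_series / scrape_interval_s
    retention_seconds = retention_days * 24 * 60 * 60
    return retention_seconds * samples_per_second * bytes_per_sample


def to_gigabytes(size_bytes: float) -> float:
    """Convert bytes to decimal gigabytes for display."""
    return size_bytes / 1e9
```

For example, `to_gigabytes(estimate_disk_bytes(10_000, 15, 30))` gives about 3.46 GB for an assumed 10,000 active series scraped every 15 seconds and kept for 30 days. Real usage depends on series churn and label cardinality, so treat the result as an order-of-magnitude guide.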

External API Integration

Grafana's Infinity datasource queries external APIs — including Cloudflare's GraphQL Analytics API — and renders the results as native dashboard panels.

Alert Routing

Grafana evaluates alert rules on a schedule and routes notifications to configured channels. A single alert rule can trigger multiple notification channels simultaneously.
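The fan-out from one rule to several channels can be pictured with a toy model. This is a conceptual sketch only, not Grafana's implementation: in Grafana itself the routing is configured through contact points and notification policies. All names here (`AlertRule`, `evaluate`, the channel names, the `disk_free_ratio` metric) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class AlertRule:
    """A named condition plus the channels to notify when it fires."""
    name: str
    condition: Callable[[Dict[str, float]], bool]
    channels: List[str]


def evaluate(rules: List[AlertRule],
             metrics: Dict[str, float],
             notifiers: Dict[str, Callable[[str], None]]) -> List[Tuple[str, str]]:
    """Run one evaluation cycle and notify every channel of each firing rule."""
    sent = []
    for rule in rules:
        if not rule.condition(metrics):
            continue
        for channel in rule.channels:
            notifiers[channel](rule.name)  # one rule, many destinations
            sent.append((rule.name, channel))
    return sent


def disk_space_low(metrics: Dict[str, float]) -> bool:
    """Fire when less than 10% of the disk is free (threshold is an assumption)."""
    return metrics.get("disk_free_ratio", 1.0) < 0.10
```

A scheduler would call `evaluate` once per evaluation interval; a rule such as `AlertRule("DiskSpaceLow", disk_space_low, ["email", "webhook"])` would then reach both channels in the same cycle.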

Monitoring Stack Deployment

From a blank server to full infrastructure visibility — the complete deployment sequence.

01. Deploy Grafana as a Docker container with a persistent volume for dashboard and datasource storage.
02. Deploy a metrics collector (Prometheus or compatible) and configure scrape targets for your containers and host system.
03. Install container and host exporters to expose metrics endpoints that the collector can scrape.
04. Configure Grafana datasources pointing to your metrics collector and any external APIs you want to visualize.
05. Build dashboards. Start with container resource consumption, then add security event panels, then external traffic data from Cloudflare.
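After steps 02 and 03, it is worth confirming that every scrape target is actually being collected before building dashboards on top of it. The sketch below reads Prometheus's `/api/v1/targets` endpoint and lists targets whose health is not `up`. The base URL `http://localhost:9090` is an assumption about where the collector is reachable.

```python
import json
import urllib.request
from typing import Dict, List


def unhealthy_targets(payload: dict) -> List[Dict[str, str]]:
    """Return job, scrape URL, and last error for every target not reporting 'up'."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        {
            "job": target.get("labels", {}).get("job", "?"),
            "url": target.get("scrapeUrl", "?"),
            "error": target.get("lastError", ""),
        }
        for target in targets
        if target.get("health") != "up"
    ]


def fetch_targets(base_url: str = "http://localhost:9090", timeout: float = 5.0) -> dict:
    """Fetch the target list from a running Prometheus (assumed address)."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=timeout) as resp:
        return json.load(resp)
```

Running `unhealthy_targets(fetch_targets())` against a live collector should return an empty list once all exporters are reachable.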

Data Sources Supported: 100+
Dashboard Refresh Rate: 5s minimum
Alert Channels: Unlimited
Historical Retention: Configurable

DASHBOARDS

Building Meaningful Dashboards

Container Resource Dashboard

The foundation of infrastructure monitoring is container resource consumption. CPU usage, memory consumption, network I/O, and disk read/write rates for every container give you an immediate picture of your stack's health. Sudden spikes in any metric are the first indicator of a problem — whether a service is under attack, has a memory leak, or is experiencing unexpected load.

Organize container panels by service category rather than alphabetically. Infrastructure services (Traefik, CrowdSec, DNS) in one row, application services in another. This layout makes anomalies immediately visible without scrolling through unrelated services.
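As a starting point for the panels described above, the sketch below collects one PromQL expression per metric and builds a query URL for the Prometheus HTTP API. It assumes cAdvisor is the container exporter and that its `name` label carries the Docker container name; `PANEL_QUERIES` and `query_url` are hypothetical helper names. The same expressions can be pasted into a Grafana panel's query editor.

```python
from urllib.parse import urlencode

# One expression per panel, aggregated per container.
# The name!="" matcher skips cgroups that are not named containers.
PANEL_QUERIES = {
    "cpu_cores": 'sum by (name) (rate(container_cpu_usage_seconds_total{name!=""}[5m]))',
    "memory_bytes": 'sum by (name) (container_memory_working_set_bytes{name!=""})',
    "network_rx_bytes_per_s": 'sum by (name) (rate(container_network_receive_bytes_total{name!=""}[5m]))',
    "network_tx_bytes_per_s": 'sum by (name) (rate(container_network_transmit_bytes_total{name!=""}[5m]))',
    "disk_read_bytes_per_s": 'sum by (name) (rate(container_fs_reads_bytes_total{name!=""}[5m]))',
    "disk_write_bytes_per_s": 'sum by (name) (rate(container_fs_writes_bytes_total{name!=""}[5m]))',
}


def query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"
```

For instance, `query_url("http://localhost:9090", PANEL_QUERIES["cpu_cores"])` yields a URL that can be opened with any HTTP client to preview the data before wiring it into a panel. Grouping by service category can then be done with one dashboard row per category and a `name=~"..."` matcher in each row's queries.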

Cloudflare Analytics Dashboard

Cloudflare exposes a GraphQL Analytics API that returns request volume, cache hit rates, threat scores, and geographic distribution data for your zone. The Infinity datasource in Grafana can query this API directly and render the results as time-series panels, world maps, and stat cards.

A geographic traffic map built from Cloudflare data gives you real-time visibility into where your visitors come from and where your attackers originate. Combined with CrowdSec ban data, this creates a complete picture of your security posture. See the CrowdSec guide.
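Before configuring the Infinity datasource, it can help to test the GraphQL query outside Grafana. The sketch below builds the HTTP request and derives a cache hit ratio from one result row. The dataset (`httpRequests1dGroups`), its field names, and the variable types in the query are assumptions that should be verified against Cloudflare's GraphQL schema; `build_request` and `cache_hit_ratio` are hypothetical helpers.

```python
import json
import urllib.request

CLOUDFLARE_GRAPHQL_URL = "https://api.cloudflare.com/client/v4/graphql"

# Assumed query shape: daily request totals for one zone.
QUERY = """
query ZoneTraffic($zoneTag: string, $since: Date, $until: Date) {
  viewer {
    zones(filter: {zoneTag: $zoneTag}) {
      httpRequests1dGroups(limit: 31, filter: {date_geq: $since, date_leq: $until}) {
        dimensions { date }
        sum { requests cachedRequests threats }
      }
    }
  }
}
"""


def build_request(zone_tag: str, since: str, until: str, api_token: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST request carrying the GraphQL query."""
    payload = {
        "query": QUERY,
        "variables": {"zoneTag": zone_tag, "since": since, "until": until},
    }
    return urllib.request.Request(
        CLOUDFLARE_GRAPHQL_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def cache_hit_ratio(group: dict) -> float:
    """Share of requests served from cache for one result row (0.0 if no traffic)."""
    total = group["sum"]["requests"]
    return group["sum"]["cachedRequests"] / total if total else 0.0
```

The request can be sent with `urllib.request.urlopen(build_request(...))`, reading the token from an environment variable rather than from source code. The same query body and bearer token are what the Infinity datasource would need to send.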

CSP Conflict Resolution

Grafana's default Content Security Policy conflicts with Traefik's global security headers middleware. The solution is a separate middleware variant for Grafana that omits the CSP header while keeping all other security headers intact. Attach this middleware to the Grafana router through its label and disable Grafana's own CSP enforcement in its configuration; the proxy-level headers take precedence. See the Traefik guide.

Securing Your Grafana Deployment

A Grafana instance contains sensitive infrastructure data and must be hardened before exposure — even through a Cloudflare Tunnel.

Layer 01 · Access

Authentication

• Disable anonymous access entirely
• Change default admin credentials immediately
• Consider OAuth or SSO for team environments
• Enforce two-factor authentication for admin accounts, typically through the OAuth or SSO provider

Layer 02 · Network

Exposure Control

• Route Grafana through Cloudflare Tunnel; never expose it directly
• Apply IP allowlist middleware for admin-only access if appropriate
• Separate public dashboards from the admin interface via different routes
• Do not publish Grafana's HTTP port on the host; only the reverse proxy should be able to reach it

Layer 03 · Headers

CSP Configuration

• Use dedicated Traefik middleware without CSP for the Grafana router
• Set GF_SECURITY_CONTENT_SECURITY_POLICY=false in Grafana config to prevent header conflicts
• All other security headers (HSTS, X-Frame-Options, Referrer-Policy) remain active
• Test header delivery after any Grafana version update
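Header delivery can be checked with a short script after each update. The sketch below flags a response that is missing any of the three headers named above or that still carries a Content-Security-Policy header. `audit_headers` and `fetch_headers` are hypothetical helpers, and the URL passed to `fetch_headers` is whatever address your Grafana route uses.

```python
import urllib.request
from typing import Dict, List

# Headers the Grafana route should still deliver, and the one it should not.
REQUIRED_HEADERS = ("strict-transport-security", "x-frame-options", "referrer-policy")
FORBIDDEN_HEADERS = ("content-security-policy",)


def audit_headers(headers: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the headers match the plan."""
    present = {name.lower() for name in headers}  # header names are case-insensitive
    problems = [f"missing: {name}" for name in REQUIRED_HEADERS if name not in present]
    problems += [f"unexpected: {name}" for name in FORBIDDEN_HEADERS if name in present]
    return problems


def fetch_headers(url: str, timeout: float = 5.0) -> Dict[str, str]:
    """Fetch response headers from a live URL (raises on HTTP error statuses)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return dict(resp.headers.items())
```

Running `audit_headers(fetch_headers(url))` against the public Grafana URL should return an empty list when the dedicated middleware is applied correctly.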

Layer 04 · Data

Datasource Security

• Store API tokens and credentials in Grafana's encrypted secret store
• Never hardcode credentials in datasource configuration files
• Rotate Cloudflare API tokens periodically
• Restrict API token permissions to read-only analytics scope

Monitoring Stack Resource Requirements

Grafana RAM (idle): ~150 MB
Grafana RAM (active): ~300 MB
Prometheus RAM: ~200–500 MB
Metrics retention (30d): ~2–5 GB
CPU impact: <2% sustained
Scrape interval: 15s recommended
Dashboard load time: <2s on local network

Grafana Production Checklist

Verify each item before considering your monitoring stack production-ready.

Grafana running with persistent volume — dashboards and datasources survive container restarts

Default admin credentials changed immediately after first login

Anonymous access disabled in Grafana configuration

Metrics collector scraping all target containers and host system

Cloudflare Analytics datasource configured with read-only API token

Dedicated CSP-free middleware applied to Grafana Traefik router

GF_SECURITY_CONTENT_SECURITY_POLICY=false set in Grafana environment variables

Container resource dashboard covering CPU, memory, network I/O for all services

Grafana accessible through Cloudflare Tunnel only — no direct port exposure

At least one alert rule configured for critical conditions (disk space, container down)
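Some checklist items can be verified mechanically. As a minimal sketch, the function below inspects a Grafana container's environment variables for three of the settings above. It only sees values set through the environment, so a password changed in the UI or settings placed in `grafana.ini` are outside its view; `audit_grafana_env` is a hypothetical helper.

```python
from typing import Dict, List


def audit_grafana_env(env: Dict[str, str]) -> List[str]:
    """Return findings for environment-based Grafana settings; empty means none found."""
    findings = []
    # Anonymous access must stay off.
    if env.get("GF_AUTH_ANONYMOUS_ENABLED", "false").strip().lower() == "true":
        findings.append("anonymous access is enabled")
    # Only an explicitly configured default password can be detected here.
    if env.get("GF_SECURITY_ADMIN_PASSWORD") == "admin":
        findings.append("admin password is set to the default value")
    # The guide asks for CSP to be explicitly disabled on the Grafana side.
    if env.get("GF_SECURITY_CONTENT_SECURITY_POLICY", "").strip().lower() != "false":
        findings.append("GF_SECURITY_CONTENT_SECURITY_POLICY is not explicitly false")
    return findings
```

Inside the container, `audit_grafana_env(dict(os.environ))` (after `import os`) would report on the running configuration. The remaining items, such as tunnel-only exposure and alert rules, still need a manual check.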
