Intermediate | 12 min read | June 1, 2026 | Infrastructure

Grafana: Infrastructure Monitoring for Self-Hosted Environments

Learn how to deploy Grafana for self-hosted infrastructure monitoring. Covers container metrics, Cloudflare analytics, security dashboards, and production alerting configuration.


OBSERVABILITY

Why Observability Is Non-Negotiable for Self-Hosted Infrastructure

Running services without monitoring is operating blind. You may know your services are up — until they are not. Without metrics, the first sign of a problem is often an outage, a full disk, or a security incident that has been ongoing for days without your knowledge.

Grafana gives you a single pane of glass across your entire self-hosted stack. Container resource consumption, network traffic patterns, security events, and geographic visitor data all become visible in real time. Problems that would take hours to diagnose manually become immediately obvious on a well-designed dashboard.

For self-hosted infrastructure specifically, monitoring serves a second purpose beyond availability: security visibility. Anomalous traffic patterns, unexpected geographic origins, and sudden spikes in blocked requests are all early indicators of attacks or misconfigurations, but only if you have the data to see them. (See the CrowdSec guide.)

The Observability Stack

A complete monitoring setup combines three components: a metrics collector that gathers data from containers and services, a time-series database that stores and queries that data efficiently, and Grafana as the visualization and alerting layer. Each component is replaceable — Grafana works with dozens of data sources including Prometheus, InfluxDB, Loki, and external APIs.

Monitoring Data Flow

Metrics flow from containers and external APIs through collectors and into Grafana's datasource layer, where they are queried and visualized in real time.

Container Metrics → Metrics Collector → Time-Series DB → Grafana Datasource → Dashboard & Alerts

From container metrics and Cloudflare API to unified Grafana dashboard

Pull-Based Collection

Prometheus scrapes metrics from exporters on a configurable interval. Services only expose a metrics endpoint; the collector pulls from it, and the services never push.
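To make the pull model concrete, here is a minimal sketch of a service exposing a scrape endpoint using only the Python standard library. The metric names, the port 9101, and the `serve` helper are hypothetical; a real deployment would more likely use an existing exporter or a Prometheus client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(samples: dict[str, float]) -> str:
    """Render gauge samples in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(samples.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    # Hypothetical values; a real exporter would read them from the service.
    samples = {"demo_queue_depth": 3.0, "demo_workers_busy": 1.0}

    def do_GET(self):
        # The service only answers requests; the collector decides when to scrape.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(self.samples).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 9101) -> None:
    """Block forever, answering scrapes on /metrics (port is an assumption)."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint; the collector would then be pointed at `http://<host>:9101/metrics` as a scrape target.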

Long-Term Retention

Time-series databases compress historical data efficiently. Months of metrics typically consume far less disk than the equivalent raw logs while remaining queryable over the full retention window.

External API Integration

Grafana's Infinity datasource queries external APIs — including Cloudflare's GraphQL Analytics API — and renders the results as native dashboard panels.

Alert Routing

Grafana evaluates alert rules on a schedule and routes notifications to configured channels. A single alert rule can trigger multiple notification channels simultaneously.
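The evaluate-then-route behaviour can be pictured with a small conceptual model. This is not Grafana's implementation or API, only a sketch under the assumption that a rule must breach its threshold for a number of consecutive evaluations before it fires, and that every attached channel is notified once when it does. The class and field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AlertRule:
    name: str
    threshold: float
    pending_evaluations: int  # consecutive breaches required before firing
    channels: list[Callable[[str], None]] = field(default_factory=list)
    _breaches: int = 0
    firing: bool = False

    def evaluate(self, value: float) -> bool:
        """Run one scheduled evaluation; return True if notifications were sent."""
        if value <= self.threshold:
            # Condition cleared: reset the pending counter and the firing state.
            self._breaches = 0
            self.firing = False
            return False
        self._breaches += 1
        if self._breaches >= self.pending_evaluations and not self.firing:
            self.firing = True
            message = f"[FIRING] {self.name}: value {value} > {self.threshold}"
            # Fan out the same alert to every configured channel.
            for notify in self.channels:
                notify(message)
            return True
        return False
```

In this model a channel is just a callable, so an email sender and a webhook poster can sit in the same list and both receive the same message.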

Monitoring Stack Deployment

From a blank server to full infrastructure visibility — the complete deployment sequence.

01. Deploy Grafana as a Docker container with a persistent volume for dashboard and datasource storage.

02. Deploy a metrics collector (Prometheus or compatible) and configure scrape targets for your containers and host system.

03. Install container and host exporters to expose metrics endpoints that the collector can scrape.

04. Configure Grafana datasources pointing to your metrics collector and any external APIs you want to visualize.

05. Build dashboards. Start with container resource consumption, then add security event panels, then external traffic data from Cloudflare.
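Step 04 can be done in the UI, but it can also be scripted against Grafana's HTTP API. The sketch below only builds the request for `POST /api/datasources`; it does not send it. The hostnames `grafana:3000` and `prometheus:9090` and the token value are assumptions standing in for your own deployment.

```python
import json
import urllib.request


def build_datasource_request(
    grafana_url: str, token: str, prometheus_url: str
) -> urllib.request.Request:
    """Build (but do not send) a request that registers a Prometheus datasource."""
    payload = {
        "name": "Prometheus",
        "type": "prometheus",
        "access": "proxy",  # Grafana's backend queries the collector, not the browser
        "url": prometheus_url,
        "isDefault": True,
    }
    return urllib.request.Request(
        url=f"{grafana_url.rstrip('/')}/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumes a service account token with permission to manage datasources.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would perform the call, for example `urlopen(build_datasource_request("http://grafana:3000", token, "http://prometheus:9090"))`.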

Data Sources Supported: 100+
Dashboard Refresh Rate: 5s minimum
Alert Channels: Unlimited
Historical Retention: Configurable

DASHBOARDS

Building Meaningful Dashboards

Container Resource Dashboard

The foundation of infrastructure monitoring is container resource consumption. CPU usage, memory consumption, network I/O, and disk read/write rates for every container give you an immediate picture of your stack's health. Sudden spikes in any metric are the first indicator of a problem — whether a service is under attack, has a memory leak, or is experiencing unexpected load.

Organize container panels by service category rather than alphabetically. Place infrastructure services (Traefik, CrowdSec, DNS) in one row and application services in another. This layout makes anomalies immediately visible without scrolling through unrelated services.
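The category-per-row layout can be generated rather than clicked together. The following sketch builds a minimal dashboard JSON model with one row per category and two time-series panels under each. It assumes the container exporter is cAdvisor (the metric names and the `name` label come from it), and the container names passed in are hypothetical.

```python
import json

# PromQL templates for cAdvisor-style metrics; %s receives a regex of container names.
QUERIES = {
    "CPU (cores)": 'sum by (name) (rate(container_cpu_usage_seconds_total{name=~"%s"}[5m]))',
    "Memory (bytes)": 'sum by (name) (container_memory_working_set_bytes{name=~"%s"})',
}


def build_dashboard(categories: dict[str, list[str]]) -> dict:
    """Return a dashboard model with one row per service category."""
    panels, y, panel_id = [], 0, 1
    for category, containers in categories.items():
        names = "|".join(containers)
        # A row panel groups everything that belongs to this category.
        panels.append({"id": panel_id, "type": "row", "title": category,
                       "gridPos": {"h": 1, "w": 24, "x": 0, "y": y}})
        panel_id += 1
        y += 1
        for col, (title, template) in enumerate(QUERIES.items()):
            panels.append({
                "id": panel_id,
                "type": "timeseries",
                "title": f"{category}: {title}",
                "gridPos": {"h": 8, "w": 12, "x": col * 12, "y": y},
                "targets": [{"refId": "A", "expr": template % names}],
            })
            panel_id += 1
        y += 8
    return {"title": "Container Resources", "refresh": "30s", "panels": panels}


def to_json(categories: dict[str, list[str]]) -> str:
    """Serialize the model for import through the Grafana UI or API."""
    return json.dumps(build_dashboard(categories), indent=2)
```

For example, `to_json({"Infrastructure": ["traefik", "crowdsec"], "Applications": ["app"]})` yields a model with two rows, each followed by a CPU and a memory panel.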

Cloudflare Analytics Dashboard

Cloudflare exposes a GraphQL Analytics API that returns request volume, cache hit rates, threat scores, and geographic distribution data for your zone. The Infinity datasource in Grafana can query this API directly and render the results as time-series panels, world maps, and stat cards.
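As an illustration of what such a query might look like, the sketch below builds a request body for the GraphQL endpoint and derives a daily cache hit ratio from the response. The dataset and field names (`httpRequests1dGroups`, `cachedRequests`, `threats`) and the variable types are written from Cloudflare's published examples as best I recall them and should be treated as assumptions to verify against the current schema for your plan. The same query text can be pasted into an Infinity datasource panel.

```python
import json

CLOUDFLARE_GRAPHQL_URL = "https://api.cloudflare.com/client/v4/graphql"

# Assumed schema: verify dataset and field names before relying on this.
QUERY = """
query ZoneTraffic($zoneTag: string, $since: Date, $until: Date) {
  viewer {
    zones(filter: {zoneTag: $zoneTag}) {
      httpRequests1dGroups(limit: 31, filter: {date_geq: $since, date_leq: $until}) {
        dimensions { date }
        sum { requests cachedRequests threats }
      }
    }
  }
}
"""


def build_payload(zone_tag: str, since: str, until: str) -> str:
    """Return the JSON body to POST to the GraphQL endpoint."""
    variables = {"zoneTag": zone_tag, "since": since, "until": until}
    return json.dumps({"query": QUERY, "variables": variables})


def cache_hit_ratios(response: dict) -> dict[str, float]:
    """Map each date to cachedRequests / requests (0.0 when there was no traffic)."""
    groups = response["data"]["viewer"]["zones"][0]["httpRequests1dGroups"]
    ratios = {}
    for group in groups:
        total = group["sum"]["requests"]
        cached = group["sum"]["cachedRequests"]
        ratios[group["dimensions"]["date"]] = cached / total if total else 0.0
    return ratios
```

The request itself would carry an `Authorization: Bearer <token>` header with a read-only analytics token.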

A geographic traffic map built from Cloudflare data gives you real-time visibility into where your visitors come from and where your attackers originate. Combined with CrowdSec ban data, this creates a complete picture of your security posture. (See the CrowdSec guide.)

CSP Conflict Resolution

Grafana's default Content Security Policy conflicts with Traefik's global security headers middleware. The solution is a separate middleware variant for Grafana that removes the CSP header while keeping all other security headers intact. Apply this middleware to the Grafana router through its label, and disable Grafana's own CSP enforcement in its configuration so that the proxy-level headers are the only security headers in play. (See the Traefik guide.)
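One way to keep the two middleware variants in sync is to derive the Grafana variant from the global header set instead of maintaining it by hand. The sketch below generates Traefik Docker labels for a headers middleware with the CSP option dropped. The middleware name `secure-headers-grafana`, the router name `grafana`, and the header values are all assumptions for illustration.

```python
# Hypothetical global header set, keyed by Traefik headers-middleware option names.
GLOBAL_HEADERS = {
    "stsSeconds": "31536000",
    "frameDeny": "true",
    "contentTypeNosniff": "true",
    "referrerPolicy": "strict-origin-when-cross-origin",
    "contentSecurityPolicy": "default-src 'self'",
}


def header_middleware_labels(
    name: str, headers: dict[str, str], drop: tuple[str, ...] = ()
) -> dict[str, str]:
    """Build Docker labels for a Traefik headers middleware, omitting dropped options."""
    prefix = f"traefik.http.middlewares.{name}.headers."
    return {prefix + key: value for key, value in headers.items() if key not in drop}


def grafana_labels() -> dict[str, str]:
    """Labels for the Grafana container: same headers as global, minus the CSP."""
    labels = header_middleware_labels(
        "secure-headers-grafana", GLOBAL_HEADERS, drop=("contentSecurityPolicy",)
    )
    # Attach the CSP-free variant to the Grafana router only.
    labels["traefik.http.routers.grafana.middlewares"] = "secure-headers-grafana@docker"
    return labels
```

The resulting dictionary can be rendered into the `labels:` section of the Grafana service definition; every other router keeps the full global middleware.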

Securing Your Grafana Deployment

A Grafana instance contains sensitive infrastructure data and must be hardened before exposure — even through a Cloudflare Tunnel.

Layer 01 | Access | Authentication
Layer 02 | Network | Exposure Control
Layer 03 | Headers | CSP Configuration
Layer 04 | Data | Datasource Security

Each layer passes to the next.

Access: Authentication
  • Disable anonymous access entirely
  • Change default admin credentials immediately
  • Consider OAuth or SSO for team environments
  • Enforce two-factor authentication for admin accounts, typically through the OAuth or SSO provider

Network: Exposure Control
  • Route Grafana through Cloudflare Tunnel; never expose it directly
  • Apply IP allowlist middleware for admin-only access if appropriate
  • Separate public dashboards from the admin interface via different routes
  • Do not publish Grafana's HTTP port on the host; only the reverse proxy should be able to reach it

Headers: CSP Configuration
  • Use dedicated Traefik middleware without CSP for the Grafana router
  • Set GF_SECURITY_CONTENT_SECURITY_POLICY=false in Grafana config to prevent header conflicts
  • All other security headers (HSTS, X-Frame-Options, Referrer-Policy) remain active
  • Test header delivery after any Grafana version update

Data: Datasource Security
  • Store API tokens and credentials in Grafana's encrypted secret store
  • Never hardcode credentials in datasource configuration files
  • Rotate Cloudflare API tokens periodically
  • Restrict API token permissions to read-only analytics scope
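The advice to test header delivery after each Grafana update is easy to automate. The sketch below checks a set of response headers against the expectations in this section: HSTS, X-Frame-Options and Referrer-Policy present, and no Content-Security-Policy. The `fetch_headers` helper and any URL passed to it are assumptions; point it at a page that answers without authentication, such as the login page.

```python
import urllib.request

REQUIRED = ("strict-transport-security", "x-frame-options", "referrer-policy")


def audit_headers(headers: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the headers look as intended."""
    present = {name.lower() for name in headers}
    problems = [f"missing {name}" for name in REQUIRED if name not in present]
    if "content-security-policy" in present:
        # Either Grafana or the proxy is still emitting a CSP for this route.
        problems.append("unexpected content-security-policy")
    return problems


def fetch_headers(url: str) -> dict[str, str]:
    """Issue a HEAD request and return the response headers."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return dict(response.headers.items())
```

A scheduled job could run `audit_headers(fetch_headers(url))` against the public Grafana hostname and alert on any non-empty result.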

Monitoring Stack Resource Requirements

Grafana RAM (idle): ~150 MB
Grafana RAM (active): ~300 MB
Prometheus RAM: ~200–500 MB
Metrics retention (30d): ~2–5 GB
CPU impact: <2% sustained
Scrape interval: 15s recommended
Dashboard load time: <2s on local network
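The retention figure follows from the sizing rule in the Prometheus storage documentation: disk ≈ retention time × ingested samples per second × bytes per sample, with roughly 1 to 2 bytes per sample after compression. The sketch below applies that rule; the series count of 10,000 is a hypothetical example, not a measurement from this stack.

```python
def estimate_disk_bytes(
    active_series: int,
    scrape_interval_s: float,
    retention_days: float,
    bytes_per_sample: float = 2.0,  # upper end of the documented 1-2 byte range
) -> float:
    """Estimate on-disk size of the time-series database for a retention window."""
    samples_per_second = active_series / scrape_interval_s
    retention_seconds = retention_days * 86_400
    return samples_per_second * retention_seconds * bytes_per_sample


def to_gb(size_bytes: float) -> float:
    """Convert bytes to decimal gigabytes."""
    return size_bytes / 1e9
```

With 10,000 active series, a 15s scrape interval and 30 days of retention, `to_gb(estimate_disk_bytes(10_000, 15, 30))` comes to about 3.5 GB, which sits inside the 2–5 GB range quoted above.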

Grafana Production Checklist

Verify each item before considering your monitoring stack production-ready.

Grafana running with persistent volume — dashboards and datasources survive container restarts
Default admin credentials changed immediately after first login
Anonymous access disabled in Grafana configuration
Metrics collector scraping all target containers and host system
Cloudflare Analytics datasource configured with read-only API token
Dedicated CSP-free middleware applied to Grafana Traefik router
GF_SECURITY_CONTENT_SECURITY_POLICY=false set in Grafana environment variables
Container resource dashboard covering CPU, memory, network I/O for all services
Grafana accessible through Cloudflare Tunnel only — no direct port exposure
At least one alert rule configured for critical conditions (disk space, container down)
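The scrape-coverage and container-down items can be spot-checked from a script by asking Prometheus for the built-in `up` series, which is 1 for a healthy scrape target and 0 for a failing one. This sketch uses the standard instant-query endpoint `/api/v1/query`; the Prometheus URL and the job and instance names in the usage example are assumptions.

```python
import json
import urllib.parse
import urllib.request


def down_targets(query_response: dict) -> list[str]:
    """Return 'job/instance' for every target whose latest `up` sample is 0."""
    down = []
    for series in query_response["data"]["result"]:
        _timestamp, value = series["value"]  # the value arrives as a string
        if float(value) == 0.0:
            labels = series["metric"]
            down.append(f'{labels.get("job", "?")}/{labels.get("instance", "?")}')
    return sorted(down)


def query_up(prometheus_url: str) -> dict:
    """Run the instant query `up` against a Prometheus-compatible API."""
    query = urllib.parse.urlencode({"query": "up"})
    url = f"{prometheus_url.rstrip('/')}/api/v1/query?{query}"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)
```

For example, `down_targets(query_up("http://prometheus:9090"))` returns an empty list when every configured target is being scraped successfully. A target that is missing from the scrape configuration will not appear at all, so the target list itself still needs a manual review.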