A production-ready Helm chart for deploying a complete Zabbix monitoring stack on OpenShift with advanced security features including mutual TLS (mTLS) authentication between all components.
- Overview
- Architecture
- Components
- Technologies & Tools
- Project Structure
- Prerequisites
- Installation
- Configuration
- Version Compatibility
- Troubleshooting
- Development
This project implements a complete Zabbix monitoring infrastructure on OpenShift/Kubernetes using Helm charts. The deployment includes:
- Zabbix Server with PostgreSQL backend
- Zabbix Web Interface (Nginx + PHP)
- Zabbix Proxy for distributed monitoring
- Three types of agents deployed as DaemonSets on every node:
  - Agent2 (Active) - Connects directly to Zabbix Server
  - Agent2-Proxy (Active) - Connects to Zabbix Server via Proxy
  - AgentD (Passive) - Monitored on demand via Proxy
- Automated bootstrap for registering hosts and proxy
- Full mTLS encryption for all inter-component communication
✅ Multi-version support - Works with Zabbix 6.0.x, 7.0.x, and 7.4.x
✅ Automated certificate management - Self-signed CA and certificates generated automatically
✅ Zero-touch deployment - Hosts and proxy auto-registered via bootstrap job
✅ OpenShift security compliant - SCCs, RBAC, non-root containers
✅ Production-ready - Persistent storage, health checks, proper resource limits
✅ Flexible monitoring - Multiple agent types for different use cases
```
┌─────────────────────────────────────────────────────────────┐
│                      OpenShift Cluster                      │
│                                                             │
│  ┌────────────┐        ┌───────────────┐                    │
│  │ Zabbix Web │◄───────│ Zabbix Server │◄────────────┐      │
│  │  (Nginx)   │        │ (PostgreSQL)  │             │      │
│  └─────▲──────┘        └───────▲───────┘             │      │
│        │                       │ mTLS                │ mTLS │
│  [Route/Ingress]       ┌───────┴───────┐             │      │
│                        │     Proxy     │             │      │
│                        │   (Active)    │             │      │
│                        └──┬─────────┬──┘             │      │
│                      mTLS │         │ mTLS           │      │
│  ┌────────────────────────┴─────────┴────────────────┴───┐  │
│  │                   Kubernetes Nodes                    │  │
│  │                                                       │  │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐ │  │
│  │  │    Agent2    │  │ Agent2-Proxy │  │ AgentD-Proxy │ │  │
│  │  │   (Active)   │  │   (Active)   │  │  (Passive)   │ │  │
│  │  │  Direct to   │  │  Via Proxy   │  │  Via Proxy   │ │  │
│  │  │ Server (mTLS)│  │              │  │              │ │  │
│  │  └──────────────┘  └──────────────┘  └──────────────┘ │  │
│  │         (Deployed as DaemonSets on each node)         │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
```
Zabbix Server:
- Listens on port `10051` for connections
- Receives active checks from Agent2 instances
- Receives proxy data from Zabbix Proxy
- All connections secured with mTLS (certificate authentication)
- Connected to PostgreSQL database for data storage
Zabbix Proxy:
- Operates in active mode (connects to Server)
- Listens on port `10051` for agent connections
- Forwards collected data to Zabbix Server
- Communicates with Server using mTLS:
  - `TLS_CONNECT=1` (certificate) - Proxy → Server
  - `TLS_ACCEPT=4` (certificate) - Server → Proxy
- Uses SQLite for local buffer storage
Agent2 (direct):
- Active agent - Initiates connection to Server
- Deployed as DaemonSet (one pod per node)
- Hostname: `<node-name>-agent2`
- Sends metrics directly to Zabbix Server
- mTLS Configuration:
  - `TLS_CONNECT=4` (certificate)
  - `TLS_ACCEPT=4` (certificate)
Agent2-Proxy:
- Active agent - Initiates connection to Proxy
- Deployed as DaemonSet (one pod per node)
- Hostname: `<node-name>-agent2-proxy`
- Sends metrics to Proxy, which forwards to Server
- mTLS Configuration:
  - `TLS_CONNECT=4` (certificate)
  - `TLS_ACCEPT=4` (certificate)
- Use case: Distributed monitoring, network segmentation
AgentD-Proxy:
- Passive agent - Waits for Proxy to poll it
- Deployed as DaemonSet (one pod per node)
- Hostname: `<node-name>-agentd-proxy`
- Proxy connects to agent on port `10050` to collect metrics
- mTLS Configuration:
  - `TLS_CONNECT=4` (certificate)
  - `TLS_ACCEPT=4` (certificate)
- Use case: Legacy monitoring, on-demand metrics collection
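To sanity-check any of these agents at runtime, the effective TLS settings can be read from the container environment. A minimal sketch - the `ZBX_TLS*` variables are the standard configuration knobs of the official Zabbix images, but the label selector below is an assumption; match it to the labels this chart actually sets:

```bash
# Read the TLS-related environment of a running Agent2 pod.
# The label selector is illustrative - adjust it to this chart's labels.
POD=$(oc get pods -n zabbix-monitoring -l app=zabbix-agent2 \
  -o jsonpath='{.items[0].metadata.name}')
oc exec -n zabbix-monitoring "$POD" -- env | grep '^ZBX_TLS'
```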
All components authenticate using X.509 certificates:
```
            ┌────────────┐
            │  CA (Root) │
            │   ca.crt   │
            └──────┬─────┘
                   │
    ┌─────────┬────┴────┬─────────┐
    │         │         │         │
┌───▼───┐ ┌───▼───┐ ┌───▼───┐ ┌───▼───┐
│Server │ │ Proxy │ │ Agent │ │  Web  │
│ Cert  │ │ Cert  │ │ Cert  │ │  UI   │
└───────┘ └───────┘ └───────┘ └───────┘
```
Certificate Generation Process:
- Pre-install Hook - `certgen-job` runs before deployment
- CA Creation - Self-signed root CA certificate generated
- Component Certs - Individual certificates for server, proxy, agent
- Secret Storage - Certificates stored in Kubernetes Secret
- Volume Mounts - Certificates mounted into all component pods
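Conceptually, the job's script reduces to a self-signed CA plus one CA-signed keypair per component. A minimal sketch of that flow (assumed; the real script lives in `certgen-job.yaml`, and the file names and subjects here are illustrative):

```bash
RELEASE=zabbix-7-4   # Helm release name

# 1. Self-signed root CA
openssl req -x509 -newkey rsa:4096 -nodes -days 3650 \
  -keyout ca.key -out ca.crt -subj "/CN=zabbix-ca"

# 2. One CA-signed keypair per component
for comp in server proxy agent web; do
  openssl req -newkey rsa:2048 -nodes \
    -keyout "${comp}.key" -out "${comp}.csr" -subj "/CN=zabbix-${comp}"
  openssl x509 -req -in "${comp}.csr" -days 3650 \
    -CA ca.crt -CAkey ca.key -CAcreateserial -out "${comp}.crt"
done

# 3. Store everything in the Secret the pods mount
# (the chart names it <release-name>-tls)
oc create secret generic "${RELEASE}-tls" \
  --from-file=ca.crt \
  --from-file=server.crt --from-file=server.key \
  --from-file=proxy.crt --from-file=proxy.key \
  --from-file=agent.crt --from-file=agent.key
```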
TLS Configuration Matrix:
| Component | TLS_CONNECT | TLS_ACCEPT | Direction |
|---|---|---|---|
| Server | N/A | 4 (cert) | Receives connections |
| Proxy | 1 (cert) | 4 (cert) | Bidirectional |
| Agent2 | 4 (cert) | 4 (cert) | Initiates to Server |
| Agent2-Proxy | 4 (cert) | 4 (cert) | Initiates to Proxy |
| AgentD-Proxy | 4 (cert) | 4 (cert) | Receives from Proxy |
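To spot-check the matrix end to end, a TLS handshake can be attempted against the server's trapper port from a pod that has the certificates mounted - assuming `openssl` is available in the image and the agent keypair file names match (only `ca.crt` and the `/etc/zabbix/tls/` mount path are confirmed elsewhere in this README):

```bash
# Attempt a mutual-TLS handshake against the server's trapper port.
# Pod, service, and agent.* file names are illustrative.
oc exec -n zabbix-monitoring <agent2-pod> -- \
  openssl s_client -connect zabbix-7-4-server:10051 \
    -CAfile /etc/zabbix/tls/ca.crt \
    -cert /etc/zabbix/tls/agent.crt \
    -key /etc/zabbix/tls/agent.key </dev/null
```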
Zabbix Server:
- Deployment: 1 replica
- Image: `zabbix/zabbix-server-pgsql`
- Database: PostgreSQL (StatefulSet)
- Ports: 10051 (server)
- Storage: PostgreSQL persistent volume (5Gi)

Zabbix Web UI:
- Deployment: 1 replica
- Image: `zabbix/zabbix-web-nginx-pgsql`
- Ports: 8080 (HTTP)
- Access: OpenShift Route
- Authentication: Admin/zabbix (default)

Zabbix Proxy:
- Deployment: 1 replica
- Image: `zabbix/zabbix-proxy-sqlite3`
- Ports: 10051 (proxy)
- Mode: Active (connects to server)
- Storage: SQLite (ephemeral)

Agent2 (direct):
- Image: `zabbix/zabbix-agent2`
- Deployment: DaemonSet (all nodes)
- Naming: `<node-name>-agent2`
- Connection: Direct to Server (port 10051)
- Mode: Active

Agent2-Proxy:
- Image: `zabbix/zabbix-agent2`
- Deployment: DaemonSet (all nodes)
- Naming: `<node-name>-agent2-proxy`
- Connection: Via Proxy (port 10051)
- Mode: Active

AgentD-Proxy:
- Image: `zabbix/zabbix-agent`
- Deployment: DaemonSet (all nodes)
- Naming: `<node-name>-agentd-proxy`
- Connection: Via Proxy (port 10050)
- Mode: Passive
- Interface: Agent interface on port 10050

PostgreSQL:
- Image: `bitnami/postgresql`
- Deployment: StatefulSet
- Storage: 5Gi persistent volume
- Credentials: zabbix/zabbix (configurable)

Bootstrap Job:
- Image: `bitnami/kubectl`
- Execution: Post-install/upgrade hook
- Purpose: Auto-register proxy and hosts
- API Version: Adapts to Zabbix 6.x vs 7.x
- Tools: curl, jq for JSON-RPC API calls

Certgen Job:
- Image: `bitnami/kubectl`
- Execution: Pre-install/upgrade hook
- Purpose: Generate CA and component certificates
- Tools: openssl
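Both jobs are ordered with standard Helm hook annotations. An illustrative snippet for the certgen job (the exact weights and delete policies used in the chart's templates may differ):

```yaml
metadata:
  annotations:
    # Run before the release's manifests are applied
    "helm.sh/hook": pre-install,pre-upgrade
    # Lower weights run earlier within the same hook phase
    "helm.sh/hook-weight": "-5"
    # Remove any leftover job from a previous run before re-creating it
    "helm.sh/hook-delete-policy": before-hook-creation
```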
- OpenShift 4.x / Kubernetes 1.24+
- Helm 3.x for package management
- kubectl for manual operations
- Zabbix Server 6.0.x / 7.0.x / 7.4.x
- Zabbix Proxy (SQLite variant)
- Zabbix Agent2 (modern agent)
- Zabbix Agent (legacy agentd)
- Zabbix Web (Nginx + PHP-FPM)
- PostgreSQL 15+ (via Bitnami)
- SQLite (proxy local storage)
- Persistent Volumes (RWO)
- OpenSSL for certificate generation
- mTLS (mutual TLS) for all connections
- RBAC (ServiceAccount, RoleBinding)
- Security Context Constraints (OpenShift)
- Non-root containers (all components)
- Helm Hooks for job orchestration
- Zabbix API (JSON-RPC 2.0) for automation
- curl + jq for API scripting
- ClusterIP services (internal)
- OpenShift Routes (external access)
- Multus CNI (optional, for multi-network)
```
zabbix-helm-poc/
├── README.md                  # This file
├── BOOTSTRAP-DIFFERENCES.md   # API version differences (6.x vs 7.x)
├── multus-cm.yaml             # Optional: Multus CNI configuration
│
└── manifests/                 # Helm chart root
    ├── Chart.yaml             # Chart metadata
    ├── values.yaml            # Default configuration values
    │
    └── templates/             # Kubernetes manifests
        ├── _helpers.tpl                      # Helm template helpers
        │
        ├── certgen-job.yaml                  # Pre-install: Generate TLS certificates
        ├── zabbix-bootstrap-job.yaml         # Post-install: Register hosts (Zabbix 7.0+)
        ├── zabbix-bootstrap-job-legacy.yaml  # Post-install: Register hosts (Zabbix <7.0)
        │
        ├── sa-rolebinding.yaml               # RBAC: ServiceAccount + RoleBinding
        │
        ├── postgres-statefulset.yaml         # PostgreSQL database
        ├── postgres-service.yaml             # PostgreSQL service
        │
        ├── server-deployment.yaml            # Zabbix Server
        ├── server-service.yaml               # Server service (port 10051)
        │
        ├── proxy-deployment.yaml             # Zabbix Proxy
        ├── proxy-service.yaml                # Proxy service (port 10051)
        │
        ├── agent2-daemonset.yaml             # Agent2 (direct to server)
        ├── agent2-service.yaml               # Agent2 service
        │
        ├── agent2-daemonset-proxy.yaml       # Agent2 via proxy
        ├── agent2-service-proxy.yaml         # Agent2-proxy service
        │
        ├── agentd-daemonset-proxy.yaml       # AgentD (passive, via proxy)
        ├── agentd-service-proxy.yaml         # AgentD service
        │
        ├── web-nginx-pgsql.yaml              # Zabbix Web UI deployment
        ├── web-nginx-pgsql-service.yaml      # Web UI service
        └── route-web.yaml                    # OpenShift Route for web access
```
- `Chart.yaml` - Helm chart metadata (name, version, description)
- `values.yaml` - Default values for all configurable parameters
- `_helpers.tpl` - Reusable template functions (naming, labels, version parsing)
- `certgen-job.yaml` - Generates CA and all component certificates (runs first)
- `zabbix-bootstrap-job.yaml` - Modern bootstrap for Zabbix 7.0+ (Bearer auth)
- `zabbix-bootstrap-job-legacy.yaml` - Legacy bootstrap for Zabbix 6.x (auth in body)
- `sa-rolebinding.yaml` - Kubernetes RBAC permissions for jobs
- `postgres-statefulset.yaml` - Database with persistent storage
- `server-deployment.yaml` - Main Zabbix Server process
- `proxy-deployment.yaml` - Distributed proxy for agent collection
- `agent2-daemonset.yaml` - Active agent, direct to server
- `agent2-daemonset-proxy.yaml` - Active agent via proxy
- `agentd-daemonset-proxy.yaml` - Passive agent via proxy
- `web-nginx-pgsql.yaml` - Nginx + PHP-FPM web frontend
- `route-web.yaml` - OpenShift Route for external access
- OpenShift Local (CRC) 2.20+

  ```bash
  # Download from: https://developers.redhat.com/products/openshift-local/overview
  crc setup
  crc start --cpus 4 --memory 16384
  ```

- Helm 3.x

  ```bash
  # macOS
  brew install helm
  # Linux
  curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
  ```

- kubectl / oc CLI

  ```bash
  # macOS
  brew install openshift-cli
  # Log in to CRC
  eval $(crc oc-env)
  oc login -u developer https://api.crc.testing:6443
  ```
- OpenShift Cluster 4.10+
- Cluster Admin Access (for namespace creation, SCCs)
- Helm 3.x installed
- oc CLI configured with cluster credentials
- Storage Class for persistent volumes (RWO)
| Component | CPU Request | Memory Request | Storage |
|---|---|---|---|
| Server | 500m | 512Mi | - |
| Proxy | 250m | 256Mi | - |
| Agent2 (per node) | 100m | 128Mi | - |
| PostgreSQL | 500m | 512Mi | 5Gi |
| Web UI | 250m | 256Mi | - |
Minimum Cluster: 2 vCPU, 8GB RAM, 10GB storage
Recommended: 4 vCPU, 16GB RAM, 20GB storage
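Once the stack is running, live consumption can be checked against these figures (requires cluster metrics to be available):

```bash
# Live CPU/memory usage per pod (requires cluster metrics)
oc adm top pods -n zabbix-monitoring

# Capacity and current allocation on a node
oc describe node <node-name> | grep -A 8 'Allocated resources'
```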
```bash
# Start OpenShift Local
crc start --cpus 4 --memory 16384
# Configure oc environment
eval $(crc oc-env)
# Login as developer
oc login -u developer https://api.crc.testing:6443
```

```bash
# Create namespace for Zabbix
oc new-project zabbix-monitoring
# Or use existing namespace
oc project zabbix-monitoring
```

```bash
git clone https://github.com/marsunin/zabbix-helm-poc.git
cd zabbix-helm-poc
```

```bash
# Install Zabbix 7.4.x (latest)
helm install zabbix-7-4 manifests \
--create-namespace \
--set zabbixVersion=7.4.4-alpine \
--namespace zabbix-monitoring
# Or install Zabbix 6.0.x (legacy)
helm install zabbix-6-0 manifests \
--create-namespace \
--set zabbixVersion=6.0.42-alpine \
  --namespace zabbix-monitoring
```

```bash
# Get the route URL
oc get route -n zabbix-monitoring
# Open in browser
# Example: http://zabbix-web-zabbix-monitoring.apps-crc.testing
# Default credentials:
# Username: Admin
# Password: zabbix
```

```bash
# Check all pods are running
oc get pods -n zabbix-monitoring
# Expected output:
# NAME READY STATUS
# zabbix-7-4-agent2-xxxxx 1/1 Running
# zabbix-7-4-agent2-proxy-xxxxx 1/1 Running
# zabbix-7-4-agentd-proxy-xxxxx 1/1 Running
# zabbix-7-4-postgres-0 1/1 Running
# zabbix-7-4-proxy-xxxxx 1/1 Running
# zabbix-7-4-server-xxxxx 1/1 Running
# zabbix-7-4-web-xxxxx 1/1 Running
# Check hosts are registered
oc logs -n zabbix-monitoring job/zabbix-7-4-bootstrap
```

```bash
# Login with token or credentials
oc login --token=<your-token> --server=https://api.cluster.example.com:6443
# Or with username/password
oc login https://api.cluster.example.com:6443 -u admin
```

```bash
# Create namespace
oc new-project zabbix-production
# Set resource quotas (optional)
cat <<EOF | oc apply -f -
apiVersion: v1
kind: ResourceQuota
metadata:
  name: zabbix-quota
  namespace: zabbix-production
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    persistentvolumeclaims: "5"
EOF
```

Create a custom `values-production.yaml`:

```yaml
# values-production.yaml
zabbixVersion: "7.4.4-alpine"

replicaCount: 2                # High availability for web UI

postgresql:
  persistence:
    enabled: true
    size: 50Gi                 # Larger storage for production
    storageClass: "fast-ssd"   # Your storage class
  auth:
    postgresPassword: "<strong-password>"
    password: "<strong-password>"
    username: "zabbix"
    database: "zabbix"

# Production security
openshift:
  runAsNonRoot: true

# Node selection (optional)
nodeSelector:
  node-role.kubernetes.io/worker: ""

tolerations:
  - key: "monitoring"
    operator: "Equal"
    value: "zabbix"
    effect: "NoSchedule"
```

```bash
# Install with production values
helm install zabbix-prod manifests \
--create-namespace \
--namespace zabbix-production \
--values values-production.yaml \
--wait \
--timeout 10m
# Watch deployment
watch oc get pods -n zabbix-production
```

```bash
# Create edge-terminated route with TLS
oc create route edge zabbix-web-secure \
--service=zabbix-prod-web \
--port=8080 \
--hostname=zabbix.example.com \
  --namespace zabbix-production
```

```bash
# Check all components
oc get all -n zabbix-production
# Verify TLS certificates
oc get secret -n zabbix-production | grep tls
# Check persistent volumes
oc get pvc -n zabbix-production
# View bootstrap logs
oc logs -n zabbix-production job/zabbix-prod-bootstrap
# Test database connection
oc exec -it zabbix-prod-postgres-0 -n zabbix-production -- \
  psql -U zabbix -d zabbix -c "SELECT COUNT(*) FROM hosts;"
```

```yaml
# Zabbix version - determines API compatibility and bootstrap job
zabbixVersion: "7.4.4-alpine"   # or "6.0.42-alpine"
```

The chart automatically:
- Strips the `-alpine` suffix for version comparison
- Deploys the correct bootstrap job (modern vs legacy API)
- Uses matching container image tags
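A sketch of how that version check can be expressed in `_helpers.tpl` (the helper name `zabbix.isModern` is hypothetical; the chart's actual helpers may differ):

```yaml
{{/* Hypothetical helper: renders "true" when the release targets Zabbix >= 7.0 */}}
{{- define "zabbix.isModern" -}}
{{- $v := .Values.zabbixVersion | replace "-alpine" "" -}}
{{- semverCompare ">=7.0.0" $v -}}
{{- end -}}

{{/* Gating a template on it, e.g. at the top of zabbix-bootstrap-job.yaml: */}}
{{- if eq (include "zabbix.isModern" .) "true" }}
# ... modern bootstrap Job manifest ...
{{- end }}
```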
Image configuration:

```yaml
image:
  server:
    repository: zabbix/zabbix-server-pgsql
    pullPolicy: IfNotPresent
  proxy:
    repository: zabbix/zabbix-proxy-sqlite3
    pullPolicy: IfNotPresent
  agent2:
    repository: zabbix/zabbix-agent2
    pullPolicy: IfNotPresent
  agentd:
    repository: zabbix/zabbix-agent
    pullPolicy: IfNotPresent
  web:
    repository: zabbix/zabbix-web-nginx-pgsql
    pullPolicy: IfNotPresent
```

PostgreSQL configuration:

```yaml
postgresql:
  enabled: true
  image:
    repository: bitnami/postgresql
    tag: "latest"
    pullPolicy: IfNotPresent
  auth:
    postgresPassword: zabbix   # Change in production!
    username: zabbix
    password: zabbix           # Change in production!
    database: zabbix
  persistence:
    enabled: true
    size: 5Gi                  # Adjust for production
    storageClass: ""           # Use default or specify
  service:
    port: 5432
```

Service configuration:

```yaml
service:
  type: ClusterIP
  agent2Port: 10050   # Agent listening port
  agentdPort: 10050   # AgentD listening port
  serverPort: 10051   # Server listening port
  proxyPort: 10051    # Proxy listening port
  webPort: 8080       # Web UI port
```

TLS configuration:

```yaml
tls:
  enabled: true   # Enable mTLS for all components
  # Certificates are auto-generated by the certgen job
  # Stored in Secret: <release-name>-tls
```

OpenShift security context:

```yaml
openshift:
  runAsUser: null      # Let OpenShift assign UID
  fsGroup: null        # Let OpenShift assign GID
  runAsNonRoot: true   # Enforce non-root containers
```

Bootstrap configuration:

```yaml
bootstrap:
  # Node list for host registration
  # If not provided, uses all cluster nodes
  nodes: "crc,worker-1,worker-2"
  # API credentials (default in Zabbix)
  username: "Admin"
  password: "zabbix"
  # Proxy name (auto-generated if not set)
  proxyName: "zabbix-proxy"
```

Minimal install:

```bash
helm install zabbix manifests \
--create-namespace \
--set zabbixVersion=7.4.4-alpine \
--set postgresql.persistence.size=2Gi
```

With a specific storage class:

```bash
helm install zabbix manifests \
--create-namespace \
--set zabbixVersion=7.4.4-alpine \
--set postgresql.persistence.size=10Gi \
--set postgresql.persistence.storageClass=standard
```

Production install:

```bash
helm install zabbix manifests \
--create-namespace \
--set zabbixVersion=7.4.4-alpine \
--set replicaCount=2 \
--set postgresql.persistence.size=100Gi \
--set postgresql.persistence.storageClass=fast-ssd \
--set postgresql.auth.postgresPassword="$(openssl rand -base64 32)" \
--set postgresql.auth.password="$(openssl rand -base64 32)"
```

| Version | Bootstrap Job | API Auth | Status |
|---|---|---|---|
| 6.0.x | Legacy | `user` + body auth | ✅ Tested |
| 7.0.x | Modern | `username` + Bearer | |
| 7.4.x | Modern | `username` + Bearer | ✅ Tested |
The chart includes two bootstrap jobs that are conditionally deployed based on version:
Modern bootstrap (Zabbix 7.0+):
- Authentication: Bearer token in header
- Login parameter: `username`
- Proxy creation: `name` + `operating_mode`
- Host assignment: `monitored_by` + `proxyid`

Legacy bootstrap (Zabbix 6.x):
- Authentication: `auth` token in request body
- Login parameter: `user`
- Proxy creation: `host` + `status`
- Host assignment: `proxy_hostid`
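In curl terms, the difference between the two flows looks roughly like this (a hedged example: the web service URL is illustrative, the credentials are the chart defaults shown above, and `jq` is assumed to be available):

```bash
API=http://zabbix-7-4-web:8080/api_jsonrpc.php   # illustrative service URL

# --- Zabbix 7.0+: login with "username", send the token as a Bearer header ---
TOKEN=$(curl -s "$API" -H 'Content-Type: application/json' -d '{
  "jsonrpc": "2.0", "method": "user.login",
  "params": {"username": "Admin", "password": "zabbix"}, "id": 1}' | jq -r '.result')

curl -s "$API" \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"jsonrpc": "2.0", "method": "host.get", "params": {"output": ["host"]}, "id": 2}'

# --- Zabbix 6.x: login with "user", send the token in the body ("auth") ---
TOKEN=$(curl -s "$API" -H 'Content-Type: application/json' -d '{
  "jsonrpc": "2.0", "method": "user.login",
  "params": {"user": "Admin", "password": "zabbix"}, "id": 1}' | jq -r '.result')

curl -s "$API" \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc": "2.0", "method": "host.get", "params": {"output": ["host"]}, "auth": "'"$TOKEN"'", "id": 2}'
```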
See BOOTSTRAP-DIFFERENCES.md for a detailed API comparison.
```bash
# From Zabbix 6.0 to 7.4
helm upgrade zabbix-6-0 manifests \
--set zabbixVersion=7.4.4-alpine \
--namespace zabbix-monitoring
# Chart automatically uses the correct bootstrap job
```

Symptom: Pods stuck in `Pending` or `CrashLoopBackOff`

```bash
# Check pod status
oc get pods -n zabbix-monitoring
# Describe pod for events
oc describe pod <pod-name> -n zabbix-monitoring
# Check logs
oc logs <pod-name> -n zabbix-monitoring
```

Common causes:
- Insufficient resources (CPU/memory)
- PVC not bound (check `oc get pvc`)
- Image pull errors (check image names)
Symptom: Agents can't connect, "TLS handshake failed" errors

```bash
# Check if certificates were generated
oc get secret -n zabbix-monitoring | grep tls
# View certificate job logs
oc logs job/zabbix-7-4-certgen -n zabbix-monitoring
# Verify certificates are mounted
oc exec -it <pod-name> -n zabbix-monitoring -- ls -la /etc/zabbix/tls/
```

Solution: Delete the TLS secret and upgrade to regenerate certificates

```bash
oc delete secret zabbix-7-4-tls -n zabbix-monitoring
helm upgrade zabbix-7-4 manifests -n zabbix-monitoring
```

Symptom: Hosts/proxy not registered in Zabbix UI

```bash
# Check bootstrap job logs
oc logs job/zabbix-7-4-bootstrap -n zabbix-monitoring
# Common issues:
# - "Not authorized" → Wrong API version/credentials
# - "Connection refused" → Server not ready yet
# - "Invalid params" → API version mismatchSolution for version mismatch:
# Verify correct bootstrap job deployed
helm template manifests --set zabbixVersion=7.4.4-alpine | grep "kind: Job"
# Should see only one bootstrap job
```

Symptom: Server logs show "Cannot connect to database"

```bash
# Check PostgreSQL is running
oc get pods -n zabbix-monitoring | grep postgres
# Test database connection
oc exec -it zabbix-7-4-postgres-0 -n zabbix-monitoring -- \
psql -U zabbix -d zabbix -c "SELECT version();"
# Check server can resolve postgres service
oc exec -it <server-pod> -n zabbix-monitoring -- \
  nslookup zabbix-7-4-postgres
```

Symptom: Route exists but returns 502/503

```bash
# Check route
oc get route -n zabbix-monitoring
# Check web pod is running
oc get pods -n zabbix-monitoring | grep web
# Check web pod logs
oc logs <web-pod-name> -n zabbix-monitoring
# Test internal connectivity
oc exec -it <web-pod> -n zabbix-monitoring -- \
  curl -I localhost:8080
```

Enable verbose logging:

```bash
# Server debug logs
oc set env deployment/zabbix-7-4-server \
ZBX_DEBUGLEVEL=4 \
-n zabbix-monitoring
# View logs
oc logs -f deployment/zabbix-7-4-server -n zabbix-monitoring
```

```bash
# Complete uninstall
helm uninstall zabbix-7-4 -n zabbix-monitoring
# Delete PVCs (data will be lost!)
oc delete pvc -l app=zabbix-postgres -n zabbix-monitoring
# Delete TLS secret
oc delete secret zabbix-7-4-tls -n zabbix-monitoring
# Reinstall
helm install zabbix-7-4 manifests \
--create-namespace \
--set zabbixVersion=7.4.4-alpine \
  --namespace zabbix-monitoring
```

```bash
# Lint Helm chart
helm lint manifests
# Template rendering (without installing)
helm template test manifests --set zabbixVersion=7.4.4-alpine
# Dry-run installation
helm install zabbix-test manifests \
--create-namespace \
--dry-run \
--debug \
  --set zabbixVersion=7.4.4-alpine
```

```bash
# After editing templates, validate syntax
helm template manifests | oc apply --dry-run=client -f -
# Test with different versions
helm template manifests --set zabbixVersion=6.0.42-alpine | grep bootstrap
helm template manifests --set zabbixVersion=7.4.4-alpine | grep bootstrap
```

To add a fourth agent type (e.g., SNMP traps):
- Create a new DaemonSet: `templates/agent-snmp-daemonset.yaml` (see the sketch after this list)
- Create a service: `templates/agent-snmp-service.yaml`
- Add to `values.yaml`:

  ```yaml
  image:
    agentSnmp:
      repository: zabbix/zabbix-agent-snmp
      pullPolicy: IfNotPresent
  ```
- Update bootstrap job to register new hosts
- Test deployment
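For the first step, a skeleton DaemonSet following the pattern of the existing agent templates might look like this (illustrative only - the labels and ServiceAccount name are assumptions; mirror `agent2-daemonset.yaml` for the chart's real conventions):

```yaml
# templates/agent-snmp-daemonset.yaml (illustrative skeleton)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: {{ .Release.Name }}-agent-snmp
  labels:
    app: zabbix-agent-snmp
spec:
  selector:
    matchLabels:
      app: zabbix-agent-snmp
  template:
    metadata:
      labels:
        app: zabbix-agent-snmp
    spec:
      serviceAccountName: {{ .Release.Name }}-sa   # assumed SA name
      containers:
        - name: agent-snmp
          image: "{{ .Values.image.agentSnmp.repository }}:{{ .Values.zabbixVersion }}"
          imagePullPolicy: {{ .Values.image.agentSnmp.pullPolicy }}
          volumeMounts:
            - name: tls
              mountPath: /etc/zabbix/tls
              readOnly: true
      volumes:
        - name: tls
          secret:
            secretName: {{ .Release.Name }}-tls
```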
- Fork repository
- Create a feature branch: `git checkout -b feature/my-feature`
- Make changes and test thoroughly
- Commit with descriptive message
- Push and create pull request
This project is provided as-is for educational and proof-of-concept purposes.
For issues and questions:
- Check Troubleshooting section
- Review BOOTSTRAP-DIFFERENCES.md for API details
- Open an issue in the repository
- Zabbix LLC for the monitoring platform
- Red Hat OpenShift team
- Helm community
- Bitnami for container images
Last Updated: November 2025
Chart Version: 0.1.0
Tested On: OpenShift Local (CRC) 2.20, OpenShift 4.14