Prometheus¶
Connect HolmesGPT to Prometheus for metrics analysis and query generation. This integration enables detection of memory leaks, CPU throttling, queue backlogs, and performance issues.
Prerequisites¶
- A running and accessible Prometheus server
- Ensure HolmesGPT can connect to the Prometheus endpoint
Configuration¶
Add the following to ~/.holmes/config.yaml. Create the file if it doesn't exist:
toolsets:
prometheus/metrics:
enabled: true
config:
prometheus_url: http://prometheus-server:9090 # localhost:9090 if port-forwarded
# Optional:
#headers:
# Authorization: "Basic <base_64_encoded_string>"
After making changes to your configuration, run:
💡 Alternative: Set the PROMETHEUS_URL
environment variable instead of using the config file.
Validation¶
To test your connection, run:
Troubleshooting¶
Finding your Prometheus URL¶
There are several ways to find your Prometheus URL:
Option 1: Simple method (port-forwarding)
# Find Prometheus services
kubectl get svc -A | grep prometheus
# Port forward for testing
kubectl port-forward svc/<your-prometheus-service> 9090:9090 -n <namespace>
# Then access Prometheus at: http://localhost:9090
Option 2: Advanced method (get full cluster DNS URL)
If you want to find the full internal DNS URL for Prometheus, run:
kubectl get svc --all-namespaces -o jsonpath='{range .items[*]}{.metadata.name}{"."}{.metadata.namespace}{".svc.cluster.local:"}{.spec.ports[0].port}{"\n"}{end}' | grep prometheus | grep -Ev 'operat|alertmanager|node|coredns|kubelet|kube-scheduler|etcd|controller' | awk '{print "http://"$1}'
This will print all possible Prometheus service URLs in your cluster. Pick the one that matches your deployment.
Common Issues¶
- Connection refused: Check if the Prometheus URL is accessible from HolmesGPT.
- Authentication errors: Verify the headers configuration for secured Prometheus endpoints.
- No metrics returned: Ensure that Prometheus is scraping your targets.
Advanced Configuration¶
You can further customize the Prometheus toolset with the following options:
toolsets:
prometheus/metrics:
enabled: true
config:
prometheus_url: http://<prometheus-host>:9090
healthcheck: "-/healthy" # Path for health checking (default: -/healthy)
headers:
Authorization: "Basic <base_64_encoded_string>"
metrics_labels_time_window_hrs: 48 # Time window (hours) for fetching labels (default: 48)
metrics_labels_cache_duration_hrs: 12 # How long to cache labels (hours, default: 12)
fetch_labels_with_labels_api: false # Use labels API instead of series API (default: false)
fetch_metadata_with_series_api: false # Use series API for metadata (default: false)
tool_calls_return_data: true # If false, disables returning Prometheus data (default: true)
Config option explanations:
prometheus_url
: The base URL for Prometheus. Should include protocol and port.healthcheck
: Path used for health checking Prometheus or Mimir/Cortex endpoint. Defaults to-/healthy
for Prometheus, use/ready
for Grafana Mimir.headers
: Extra headers for all Prometheus HTTP requests (e.g., for authentication).metrics_labels_time_window_hrs
: Time window (in hours) for fetching labels. Set tonull
to fetch all labels.metrics_labels_cache_duration_hrs
: How long to cache labels (in hours). Set tonull
to disable caching.fetch_labels_with_labels_api
: Use the Prometheus labels API to fetch labels (can improve performance, but increases HTTP calls).fetch_metadata_with_series_api
: Use the series API for metadata (only set to true if the metadata API is disabled or not working).tool_calls_return_data
: Iffalse
, disables returning Prometheus data to HolmesGPT (useful if you hit token limits).
Coralogix Prometheus Configuration¶
To use a Coralogix PromQL endpoint with HolmesGPT:
- Go to Coralogix Documentation and choose the relevant PromQL endpoint for your region.
- In Coralogix, create an API key with permissions to query metrics (Data Flow → API Keys).
-
Create a Kubernetes secret for the API key and expose it as an environment variable in your Helm values:
-
Add the following under your toolsets in the Helm chart:
holmes: toolsets: prometheus/metrics: enabled: true config: healthcheck: "/api/v1/query?query=up" # This is important for Coralogix prometheus_url: "https://prom-api.eu2.coralogix.com" # Use your region's endpoint headers: token: "{{ env.CORALOGIX_API_KEY }}" metrics_labels_time_window_hrs: 72 metrics_labels_cache_duration_hrs: 12 fetch_labels_with_labels_api: true tool_calls_return_data: true fetch_metadata_with_series_api: true
Capabilities¶
Tool Name | Description |
---|---|
list_available_metrics | List all available Prometheus metrics |
execute_prometheus_instant_query | Execute an instant PromQL query |
execute_prometheus_range_query | Execute a range PromQL query for time series data |
get_current_time | Get current timestamp for time-based queries |