Ollama¶
Configure HolmesGPT to use local models with Ollama.
Warning
Ollama is supported but buggy. We recommend using other models if you can, until Ollama's tool-calling capabilities improve. In particular, Ollama often calls tools with missing or non-existent parameters.
Setup¶
- Download Ollama from ollama.com
- Install following the instructions for your operating system
- Start the Ollama service (see the command below)
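If Ollama is not already running as a background service, start it manually. The server listens on http://localhost:11434 by default, which the configuration below assumes:

```bash
# Start the Ollama server (listens on http://localhost:11434 by default)
ollama serve
```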
Download Models¶
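Pull the models you plan to use before running HolmesGPT. For example, to fetch the models referenced on this page:

```bash
# Pull the models used in the examples below
ollama pull llama3.1
ollama pull codellama:7b
ollama pull mistral:7b

# Confirm they are available locally
ollama list
```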
Configuration¶
Method 1: Ollama Native Format¶
Point OLLAMA_API_BASE at the local Ollama server and select a model with the ollama_chat/ prefix:
export OLLAMA_API_BASE="http://localhost:11434"
holmes ask "what pods are unhealthy in my cluster?" --model="ollama_chat/llama3.1"
Method 2: OpenAI-Compatible Format¶
Alternatively, use Ollama's OpenAI-compatible API with the openai/ prefix:
# Note the v1 at the end
export OPENAI_API_BASE="http://localhost:11434/v1"
# Holmes requires OPENAI_API_KEY to be set, but its value does not matter
export OPENAI_API_KEY=123
holmes ask "what pods are unhealthy in my cluster?" --model="openai/llama3.1"
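As a quick sanity check, you can list models through the OpenAI-compatible endpoint before running HolmesGPT (assuming the default host and port):

```bash
# Should return a JSON list of locally available models
curl http://localhost:11434/v1/models
```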
Model Usage¶
Any locally pulled model can be selected with the --model flag:
# Using different models
holmes ask "pod analysis" --model="ollama_chat/llama3.1:8b"
holmes ask "yaml debugging" --model="ollama_chat/codellama:7b"
holmes ask "quick check" --model="ollama_chat/mistral:7b"
Known Limitations¶
Current problems with Ollama:
- Missing parameters - Tools called without required arguments
- Invalid parameters - Non-existent parameter names
- Inconsistent behavior - Results may vary between runs
- Limited function calling - May not follow tool schemas correctly
Troubleshooting¶
Ollama Not Running
- Start the Ollama service: ollama serve
- Check if port 11434 is available (see the quick check below)
- Verify firewall settings
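A quick reachability check, assuming the default host and port:

```bash
# Replies with "Ollama is running" when the server is up
curl http://localhost:11434
```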
Model Not Found
- Pull the model: ollama pull model-name
- List available models: ollama list
- Check model name spelling
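For instance, to pull llama3.1 (used in the examples above) and verify it is available:

```bash
# Pull a missing model, then confirm it appears in the local list
ollama pull llama3.1
ollama list
```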
Memory Issues
- Choose a smaller model
- Close other applications
- Add more RAM or use GPU acceleration
Slow Performance
- Use smaller models (7B instead of 13B/70B)
- Enable GPU acceleration
- Increase CPU allocation: export OLLAMA_NUM_THREAD=8