Ollama enables easy deployment of large language models on your own infrastructure.
See the models available in the Ollama library.
juju deploy ollama --channel=beta
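After deploying, you can verify that the unit has settled before running any actions; `juju status` is the standard Juju command for this:

```shell
# Check that the ollama application and its unit are active
# before pulling a model or generating text.
juju status ollama
```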
juju run ollama/0 pull model="llama3.1"
Note: the pull action may take a long time. You can add the --wait parameter (e.g. --wait=5m) to avoid a timeout error if your model is large.
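For example, the pull action with an explicit wait might look like this (the 15-minute timeout is illustrative; pick a value suited to your model size and bandwidth):

```shell
# Pull the model and wait up to 15 minutes for the action to finish.
juju run ollama/0 pull model="llama3.1" --wait=15m
```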
juju run ollama/0 generate model="llama3.1" prompt="Why is the sky blue?"
Note: the generate action may take a long time. You can add the --wait parameter (e.g. --wait=5m) to avoid a timeout error if your hardware is slow.
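For example, the generate action with an explicit wait might look like this (the 5-minute timeout is illustrative):

```shell
# Generate a completion, allowing up to 5 minutes before timing out.
juju run ollama/0 generate model="llama3.1" prompt="Why is the sky blue?" --wait=5m
```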
curl http://<unit-ip-address>:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Why is the sky blue?"
}'
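By default, the Ollama API streams the response as a sequence of JSON objects. If you prefer a single JSON reply, the /api/generate endpoint accepts a `stream` field (see the Ollama API documentation):

```shell
# Request a single, non-streamed JSON response from the unit.
curl http://<unit-ip-address>:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```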
- See the Juju SDK documentation for more information about developing and improving charms.
- See the Ollama API documentation for all the interactions you can have with Ollama.