
Add ollama support #8

Merged · 3 commits · Nov 24, 2023
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -8,4 +8,5 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
### Added
- Add support for prompt templates with the `AITemplate` struct. Search for suitable templates with `aitemplates("query string")` and then simply use them with `aigenerate(AITemplate(:TemplateABC); variableX = "some value") -> AIMessage`, or dispatch on the template name as a `Symbol`, eg, `aigenerate(:TemplateABC; variableX = "some value") -> AIMessage`. Templates are saved as JSON files in the folder `templates/`. If you add new templates, you can reload them with `load_templates!()` (notice the exclamation mark to override the existing `TEMPLATE_STORE`). A short sketch of this workflow follows the list below.
- Add `aiextract` function to extract structured information from text quickly and easily. See `?aiextract` for more information.
- Add `aiscan` for image scanning (ie, image comprehension tasks). You can transcribe screenshots or reason over images as if they were text. Images can be provided either as a local file (`image_path`) or as a URL (`image_url`). See `?aiscan` for more information.
- Add support for [Ollama.ai](https://ollama.ai/)'s local models. Only `aigenerate` and `aiembed` functions are supported at the moment.
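
A minimal sketch of the template workflow described above (`:TemplateABC` and `variableX` are the entry's own placeholders, not a real template):

```julia
using PromptingTools

aitemplates("query string")  # search for suitable templates
# use a template explicitly...
msg = aigenerate(AITemplate(:TemplateABC); variableX = "some value")
# ...or dispatch on its name as a Symbol
msg = aigenerate(:TemplateABC; variableX = "some value")
load_templates!()  # reload after adding new JSON files to `templates/`
```
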
74 changes: 70 additions & 4 deletions README.md
@@ -18,6 +18,13 @@ Getting started with PromptingTools.jl is as easy as importing the package and u
Note: You will need to set your OpenAI API key as an environment variable before using PromptingTools.jl (see the [Creating OpenAI API Key](#creating-openai-api-key) section below).
For a quick start, simply set it via `ENV["OPENAI_API_KEY"] = "your-api-key"`

Install PromptingTools.jl:
```julia
using Pkg
Pkg.add(url="https://github.com/svilupp/PromptingTools.jl")
```

And we're ready to go!
```julia
using PromptingTools

@@ -68,6 +75,7 @@ For more practical examples, see the `examples/` folder and the [Advanced Examples
- [Classification](#classification)
- [Data Extraction](#data-extraction)
- [OCR and Image Comprehension](#ocr-and-image-comprehension)
- [Using Ollama models](#using-ollama-models)
- [More Examples](#more-examples)
- [Package Interface](#package-interface)
- [Frequently Asked Questions](#frequently-asked-questions)
@@ -79,6 +87,8 @@ For more practical examples, see the `examples/` folder and the [Advanced Examples
- [Configuring the Environment Variable for API Key](#configuring-the-environment-variable-for-api-key)
- [Understanding the API Keyword Arguments in `aigenerate` (`api_kwargs`)](#understanding-the-api-keyword-arguments-in-aigenerate-api_kwargs)
- [Instant Access from Anywhere](#instant-access-from-anywhere)
- [Open Source Alternatives](#open-source-alternatives)
- [Setup Guide for Ollama](#setup-guide-for-ollama)
- [Roadmap](#roadmap)

## Why PromptingTools.jl
@@ -352,6 +362,35 @@ using Markdown
msg.content |> Markdown.parse
```

## Using Ollama models

[Ollama.ai](https://ollama.ai/) is an amazingly simple tool that allows you to run several Large Language Models (LLMs) locally on your computer. It's especially suitable when you're working with sensitive data that should not be sent anywhere.

Let's assume you have installed Ollama, downloaded a model, and it's running in the background.

We can use it with the `aigenerate` function:

```julia
using PromptingTools
const PT = PromptingTools
schema = PT.OllamaManagedSchema() # notice the different schema!

msg = aigenerate(schema, "Say hi!"; model="openhermes2.5-mistral")
# [ Info: Tokens: 69 in 0.9 seconds
# AIMessage("Hello! How can I assist you today?")
```

And we can also use the `aiembed` function:

```julia
msg = aiembed(schema, "Embed me", copy; model="openhermes2.5-mistral")
msg.content # 4096-element JSON3.Array{Float64...

msg = aiembed(schema, ["Embed me", "Embed me"]; model="openhermes2.5-mistral")
msg.content # 4096×2 Matrix{Float64}:
```

If you're getting errors, check that Ollama is running - see the [Setup Guide for Ollama](#setup-guide-for-ollama) section below.

### More Examples

TBU...
@@ -419,9 +458,9 @@ Each new interface would be defined in a separate `llm_<interface>.jl` file.
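
For illustration, a minimal sketch of such an extension (`MySchema` and the method body are hypothetical; only the schema-dispatch pattern comes from the package):

```julia
using PromptingTools
const PT = PromptingTools

# Hypothetical new interface: subtype the abstract schema...
struct MySchema <: PT.AbstractPromptSchema end

# ...and add methods that dispatch on it
function PT.aigenerate(schema::MySchema, prompt::AbstractString; kwargs...)
    # call your backend here and wrap the reply, eg, in PT.AIMessage
end
```
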

OpenAI's models are at the forefront of AI research and provide robust, state-of-the-art capabilities for many tasks.

There will be situations when you do not want to or cannot use it (eg, privacy, cost, etc.). In that case, you can use local models (eg, Ollama) or other APIs (eg, Anthropic).

Note: Tutorial for how to set up and use Ollama + PromptingTools.jl is coming!
Note: To get started with [Ollama.ai](https://ollama.ai/), see the [Setup Guide for Ollama](#setup-guide-for-ollama) section below.

### Data Privacy and OpenAI

@@ -431,7 +470,10 @@ At the time of writing, OpenAI does NOT use the API calls for training their models
>
> OpenAI does not use data submitted to and generated by our API to train OpenAI models or improve OpenAI’s service offering. In order to support the continuous improvement of our models, you can fill out this form to opt-in to share your data with us. -- [How your data is used to improve our models](https://help.openai.com/en/articles/5722486-how-your-data-is-used-to-improve-model-performance)


Resources:
- [OpenAI's How we use your data](https://platform.openai.com/docs/models/how-we-use-your-data)
- [Data usage for consumer services FAQ](https://help.openai.com/en/articles/7039943-data-usage-for-consumer-services-faq)
- [How your data is used to improve our models](https://help.openai.com/en/articles/5722486-how-your-data-is-used-to-improve-model-performance)

@@ -520,16 +562,40 @@ const PT = PromptingTools # to access unexported functions and types

Now, you can just use `ai"Help me do X to achieve Y"` from any REPL session!
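
A quick sanity check (a sketch; the question is arbitrary):

```julia
msg = ai"What is the capital of France?"
msg.content  # the model's reply as a String
```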

### Open Source Alternatives

The ethos of PromptingTools.jl is to allow you to use whatever model you want, including Open Source LLMs. The most popular and easiest to set up is [Ollama.ai](https://ollama.ai/) - see below for more information.

### Setup Guide for Ollama

Ollama runs a background service hosting LLMs that you can access via a simple API. It's especially useful when you're working with sensitive data that should not be sent anywhere.

Installation is very easy: just download the latest version [here](https://ollama.ai/download).

Once you've installed it, just launch the app and you're ready to go!

To check if it's running, go to your browser and open `127.0.0.1:11434`. You should see the message "Ollama is running".
Alternatively, you can run `ollama serve` in your terminal; if the service is already running, it will tell you so.
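
You can also check from Julia (a sketch, assuming the HTTP.jl package is installed and Ollama listens on the default port):

```julia
using HTTP
resp = HTTP.get("http://127.0.0.1:11434")  # throws if the service is down
String(resp.body)  # "Ollama is running"
```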

There are many models available in the [Ollama Library](https://ollama.ai/library), including Llama2, CodeLlama, SQLCoder, and my personal favorite `openhermes2.5-mistral`.

Download new models with `ollama pull <model_name>` (eg, `ollama pull openhermes2.5-mistral`).

Show currently available models with `ollama list`.
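
If you prefer to stay in Julia, you can drive the same CLI via `run` (a sketch; assumes the `ollama` binary is on your PATH):

```julia
run(`ollama pull openhermes2.5-mistral`)  # download a model
run(`ollama list`)                        # show locally available models
```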

See [Ollama.ai](https://ollama.ai/) for more information.

## Roadmap

This is a list of features that I'd like to see in the future (in no particular order):
- Document more mini-tasks, add tutorials
- Integration of new OpenAI capabilities (eg, audio, assistants -> Imagine a function you send a Plot to and it will add code to add titles, labels, etc. and generate insights for your report!)
- Add Preferences.jl mechanism to set defaults and persist them across sessions
- More templates for common tasks (eg, fact-checking, sentiment analysis, extraction of entities/metadata, etc.)
- Ability to easily add new templates, save them, and share them with others
- Ability to easily trace and serialize the prompts & AI results for finetuning or evaluation in the future
- Add multi-turn conversations if you need to "reply" to the AI assistant


For more information, contributions, or questions, please visit the [PromptingTools.jl GitHub repository](https://github.com/svilupp/PromptingTools.jl).

60 changes: 60 additions & 0 deletions examples/working_with_ollama.jl
@@ -0,0 +1,60 @@
using PromptingTools
const PT = PromptingTools

# Notice the schema change! If you want this to be the new default, you need to change `PT.PROMPT_SCHEMA`
schema = PT.OllamaManagedSchema()
# You can choose models from https://ollama.ai/library - I prefer `openhermes2.5-mistral`
model = "openhermes2.5-mistral"

# # Text Generation with aigenerate

# ## Simple message
msg = aigenerate(schema, "Say hi!"; model)

# ## Standard string interpolation
a = 1
msg = aigenerate(schema, "What is `$a+$a`?"; model)

name = "John"
msg = aigenerate(schema, "Say hi to {{name}}."; name, model)

# ## Advanced Prompts
conversation = [
PT.SystemMessage("You're master Yoda from Star Wars trying to help the user become a Jedi."),
PT.UserMessage("I have feelings for my iPhone. What should I do?")]
msg = aigenerate(schema, conversation; model)

# # Embeddings with aiembed

# ## Simple embedding for one document
msg = aiembed(schema, "Embed me"; model)
msg.content

# One document; materialize the data into a Vector with `copy` (the `postprocess` function argument)
msg = aiembed(schema, "Embed me", copy; model)
msg.content

# ## Multiple documents embedding
# Multiple documents are embedded sequentially; you can get faster speeds with async calls (see below)
msg = aiembed(schema, ["Embed me", "Embed me"]; model)
msg.content

# You can use Threads.@spawn or asyncmap, whichever you prefer, to parallelize the model calls
docs = ["Embed me", "Embed me"]
tasks = asyncmap(docs) do doc
msg = aiembed(schema, doc; model)
end
embedding = mapreduce(x -> x.content, hcat, tasks)

# ## Using postprocessing function
# Add normalization as a postprocessing function to normalize embeddings on reception (for easy cosine similarity later)
using LinearAlgebra
schema = PT.OllamaManagedSchema()

msg = aiembed(schema,
["embed me", "and me too"],
LinearAlgebra.normalize;
model = "openhermes2.5-mistral")

# Cosine similarity is then a simple multiplication
msg.content' * msg.content[:, 1] # [1.0, 0.34]
1 change: 1 addition & 0 deletions src/PromptingTools.jl
@@ -50,6 +50,7 @@ include("extraction.jl")

## Individual interfaces
include("llm_openai.jl")
include("llm_ollama_managed.jl")

## Convenience utils
export @ai_str, @aai_str
11 changes: 10 additions & 1 deletion src/llm_interface.jl
@@ -58,6 +58,7 @@ It uses the following conversation structure:
struct ChatMLSchema <: AbstractChatMLSchema end

abstract type AbstractManagedSchema <: AbstractPromptSchema end
abstract type AbstractOllamaManagedSchema <: AbstractManagedSchema end

"""
Ollama by default manages different models and their associated prompt schemas when you pass `system_prompt` and `prompt` fields to the API.
@@ -67,7 +68,15 @@ Warning: It works only for 1 system message and 1 user message, so anything more
If you need to pass more messages / longer conversational history, you can define the model-specific schema directly and pass your Ollama requests with `raw=true`,
which disables any templating and schema management by Ollama.
"""
struct OllamaManagedSchema <: AbstractOllamaManagedSchema end

"Echoes the user's input back to them. Used for testing the implementation"
@kwdef mutable struct TestEchoOllamaManagedSchema <: AbstractOllamaManagedSchema
response::AbstractDict
status::Integer
model_id::String = ""
inputs::Any = nothing
end

## Dispatch into default schema
const PROMPT_SCHEMA = OpenAISchema()