
Add ollama support #8

Merged · 3 commits · Nov 24, 2023
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -8,4 +8,5 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
### Added
- Add support for prompt templates with the `AITemplate` struct. Search for suitable templates with `aitemplates("query string")` and then simply use them with `aigenerate(AITemplate(:TemplateABC); variableX = "some value") -> AIMessage`, or dispatch on the template name as a `Symbol`, eg, `aigenerate(:TemplateABC; variableX = "some value") -> AIMessage`. Templates are saved as JSON files in the folder `templates/`. If you add new templates, you can reload them with `load_templates!()` (notice the exclamation mark to override the existing `TEMPLATE_STORE`). A short sketch of this workflow follows the list below.
- Add `aiextract` function to extract structured information from text quickly and easily. See `?aiextract` for more information.
- Add `aiscan` for image scanning (ie, image comprehension tasks). You can transcribe screenshots or reason over images as if they were text. Images can be provided either as a local file (`image_path`) or as a URL (`image_url`). See `?aiscan` for more information.
- Add support for [Ollama.ai](https://ollama.ai/)'s local models. Only `aigenerate` and `aiembed` functions are supported at the moment.
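
A minimal sketch of the template workflow described above (`:TemplateABC` and `variableX` are the entry's own placeholders, not a real template):

```julia
using PromptingTools

aitemplates("query string")  # search for suitable templates
# use a template explicitly...
msg = aigenerate(AITemplate(:TemplateABC); variableX = "some value")
# ...or dispatch on its name as a Symbol
msg = aigenerate(:TemplateABC; variableX = "some value")
load_templates!()  # reload after adding new JSON files to `templates/`
```
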
74 changes: 70 additions & 4 deletions README.md
@@ -18,6 +18,13 @@ Getting started with PromptingTools.jl is as easy as importing the package and u
Note: You will need to set your OpenAI API key as an environment variable before using PromptingTools.jl (see the [Creating OpenAI API Key](#creating-openai-api-key) section below).
For a quick start, simply set it via `ENV["OPENAI_API_KEY"] = "your-api-key"`

Install PromptingTools.jl:
```julia
using Pkg
Pkg.add(url="https://github.com/svilupp/PromptingTools.jl")
```

And we're ready to go!
```julia
using PromptingTools

@@ -68,6 +75,7 @@ For more practical examples, see the `examples/` folder and the [Advanced Examples
- [Classification](#classification)
- [Data Extraction](#data-extraction)
- [OCR and Image Comprehension](#ocr-and-image-comprehension)
- [Using Ollama models](#using-ollama-models)
- [More Examples](#more-examples)
- [Package Interface](#package-interface)
- [Frequently Asked Questions](#frequently-asked-questions)
@@ -79,6 +87,8 @@ For more practical examples, see the `examples/` folder and the [Advanced Examples
- [Configuring the Environment Variable for API Key](#configuring-the-environment-variable-for-api-key)
- [Understanding the API Keyword Arguments in `aigenerate` (`api_kwargs`)](#understanding-the-api-keyword-arguments-in-aigenerate-api_kwargs)
- [Instant Access from Anywhere](#instant-access-from-anywhere)
- [Open Source Alternatives](#open-source-alternatives)
- [Setup Guide for Ollama](#setup-guide-for-ollama)
- [Roadmap](#roadmap)

## Why PromptingTools.jl
@@ -352,6 +362,35 @@ using Markdown
msg.content |> Markdown.parse
```

## Using Ollama models

[Ollama.ai](https://ollama.ai/) is an amazingly simple tool that allows you to run several Large Language Models (LLMs) locally on your computer. It's especially suitable when you're working with sensitive data that should not be sent anywhere.

Let's assume you have installed Ollama, downloaded a model, and it's running in the background.

We can use it with the `aigenerate` function:

```julia
using PromptingTools
const PT = PromptingTools
schema = PT.OllamaManagedSchema() # notice the different schema!

msg = aigenerate(schema, "Say hi!"; model="openhermes2.5-mistral")
# [ Info: Tokens: 69 in 0.9 seconds
# AIMessage("Hello! How can I assist you today?")
```

And we can also use the `aiembed` function:

```julia
msg = aiembed(schema, "Embed me", copy; model="openhermes2.5-mistral")
msg.content # 4096-element JSON3.Array{Float64...

msg = aiembed(schema, ["Embed me", "Embed me"]; model="openhermes2.5-mistral")
msg.content # 4096×2 Matrix{Float64}:
```

If you're getting errors, check that Ollama is running - see the [Setup Guide for Ollama](#setup-guide-for-ollama) section below.

### More Examples

TBU...
@@ -419,9 +458,9 @@ Each new interface would be defined in a separate `llm_<interface>.jl` file.
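
For illustration, a minimal sketch of such an extension (`MySchema` and the method body are hypothetical; only the schema-dispatch pattern comes from the package):

```julia
using PromptingTools
const PT = PromptingTools

# Hypothetical new interface: subtype the abstract schema...
struct MySchema <: PT.AbstractPromptSchema end

# ...and add methods that dispatch on it
function PT.aigenerate(schema::MySchema, prompt::AbstractString; kwargs...)
    # call your backend here and wrap the reply, eg, in PT.AIMessage
end
```
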

OpenAI's models are at the forefront of AI research and provide robust, state-of-the-art capabilities for many tasks.

There will be situations when you do not want to or cannot use it (eg, privacy, cost, etc.). In that case, you can use local models (eg, Ollama) or other APIs (eg, Anthropic).

Note: Tutorial for how to set up and use Ollama + PromptingTools.jl is coming!
Note: To get started with [Ollama.ai](https://ollama.ai/), see the [Setup Guide for Ollama](#setup-guide-for-ollama) section below.

### Data Privacy and OpenAI

@@ -431,7 +470,10 @@ At the time of writing, OpenAI does NOT use the API calls for training their models
>
> OpenAI does not use data submitted to and generated by our API to train OpenAI models or improve OpenAI’s service offering. In order to support the continuous improvement of our models, you can fill out this form to opt-in to share your data with us. -- [How your data is used to improve our models](https://help.openai.com/en/articles/5722486-how-your-data-is-used-to-improve-model-performance)


Resources:
- [OpenAI's How we use your data](https://platform.openai.com/docs/models/how-we-use-your-data)
- [Data usage for consumer services FAQ](https://help.openai.com/en/articles/7039943-data-usage-for-consumer-services-faq)
- [How your data is used to improve our models](https://help.openai.com/en/articles/5722486-how-your-data-is-used-to-improve-model-performance)

@@ -520,16 +562,40 @@ const PT = PromptingTools # to access unexported functions and types

Now, you can just use `ai"Help me do X to achieve Y"` from any REPL session!
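
A quick sanity check (a sketch; the question is arbitrary):

```julia
msg = ai"What is the capital of France?"
msg.content  # the model's reply as a String
```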

### Open Source Alternatives

The ethos of PromptingTools.jl is to allow you to use whatever model you want, including Open Source LLMs. The most popular and easiest to set up is [Ollama.ai](https://ollama.ai/) - see below for more information.

### Setup Guide for Ollama

Ollama runs a background service hosting LLMs that you can access via a simple API. It's especially useful when you're working with sensitive data that should not be sent anywhere.

Installation is very easy: just download the latest version [here](https://ollama.ai/download).

Once you've installed it, just launch the app and you're ready to go!

To check if it's running, go to your browser and open `127.0.0.1:11434`. You should see the message "Ollama is running".
Alternatively, you can run `ollama serve` in your terminal; if the service is already running, it will tell you so.
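
You can also check from Julia (a sketch, assuming the HTTP.jl package is installed and Ollama listens on the default port):

```julia
using HTTP
resp = HTTP.get("http://127.0.0.1:11434")  # throws if the service is down
String(resp.body)  # "Ollama is running"
```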

There are many models available in the [Ollama Library](https://ollama.ai/library), including Llama2, CodeLlama, SQLCoder, and my personal favorite `openhermes2.5-mistral`.

Download new models with `ollama pull <model_name>` (eg, `ollama pull openhermes2.5-mistral`).

Show currently available models with `ollama list`.
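
If you prefer to stay in Julia, you can drive the same CLI via `run` (a sketch; assumes the `ollama` binary is on your PATH):

```julia
run(`ollama pull openhermes2.5-mistral`)  # download a model
run(`ollama list`)                        # show locally available models
```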

See [Ollama.ai](https://ollama.ai/) for more information.

## Roadmap

This is a list of features that I'd like to see in the future (in no particular order):
- Document more mini-tasks, add tutorials
- Integration of new OpenAI capabilities (eg, audio, assistants -> Imagine a function you send a Plot to and it will add code to add titles, labels, etc. and generate insights for your report!)
- Add Preferences.jl mechanism to set defaults and persist them across sessions
- More templates for common tasks (eg, fact-checking, sentiment analysis, extraction of entities/metadata, etc.)
- Ability to easily add new templates, save them, and share them with others
- Ability to easily trace and serialize the prompts & AI results for finetuning or evaluation in the future
- Add multi-turn conversations if you need to "reply" to the AI assistant


For more information, contributions, or questions, please visit the [PromptingTools.jl GitHub repository](https://github.com/svilupp/PromptingTools.jl).

60 changes: 60 additions & 0 deletions examples/working_with_ollama.jl
@@ -0,0 +1,60 @@
using PromptingTools
const PT = PromptingTools

# Notice the schema change! If you want this to be the new default, you need to change `PT.PROMPT_SCHEMA`
schema = PT.OllamaManagedSchema()
# You can choose models from https://ollama.ai/library - I prefer `openhermes2.5-mistral`
model = "openhermes2.5-mistral"

# # Text Generation with aigenerate

# ## Simple message
msg = aigenerate(schema, "Say hi!"; model)

# ## Standard string interpolation
a = 1
msg = aigenerate(schema, "What is `$a+$a`?"; model)

name = "John"
msg = aigenerate(schema, "Say hi to {{name}}."; name, model)

# ## Advanced Prompts
conversation = [
PT.SystemMessage("You're master Yoda from Star Wars trying to help the user become a Jedi."),
PT.UserMessage("I have feelings for my iPhone. What should I do?")]
msg = aigenerate(schema, conversation; model)

# # Embeddings with aiembed

# ## Simple embedding for one document
msg = aiembed(schema, "Embed me"; model)
msg.content

# One document; materialize the data into a Vector with `copy` (the `postprocess` function argument)
msg = aiembed(schema, "Embed me", copy; model)
msg.content

# ## Multiple documents embedding
# Multiple documents are embedded sequentially; you can get faster speeds with async calls (see below)
msg = aiembed(schema, ["Embed me", "Embed me"]; model)
msg.content

# You can use Threads.@spawn or asyncmap, whichever you prefer, to parallelize the model calls
docs = ["Embed me", "Embed me"]
tasks = asyncmap(docs) do doc
msg = aiembed(schema, doc; model)
end
embedding = mapreduce(x -> x.content, hcat, tasks)

# ## Using postprocessing function
# Add normalization as a postprocessing function to normalize embeddings on reception (for easy cosine similarity later)
using LinearAlgebra
schema = PT.OllamaManagedSchema()

msg = aiembed(schema,
["embed me", "and me too"],
LinearAlgebra.normalize;
model = "openhermes2.5-mistral")

# Cosine similarity is then a simple multiplication
msg.content' * msg.content[:, 1] # [1.0, 0.34]
1 change: 1 addition & 0 deletions src/PromptingTools.jl
@@ -50,6 +50,7 @@ include("extraction.jl")

## Individual interfaces
include("llm_openai.jl")
include("llm_ollama_managed.jl")

## Convenience utils
export @ai_str, @aai_str
11 changes: 10 additions & 1 deletion src/llm_interface.jl
@@ -58,6 +58,7 @@ It uses the following conversation structure:
struct ChatMLSchema <: AbstractChatMLSchema end

abstract type AbstractManagedSchema <: AbstractPromptSchema end
abstract type AbstractOllamaManagedSchema <: AbstractManagedSchema end

"""
Ollama by default manages different models and their associated prompt schemas when you pass `system_prompt` and `prompt` fields to the API.
@@ -67,7 +68,15 @@ Warning: It works only for 1 system message and 1 user message, so anything more
If you need to pass more messages / longer conversational history, you can define the model-specific schema directly and pass your Ollama requests with `raw=true`,
which disables any templating and schema management by Ollama.
"""
struct OllamaManagedSchema <: AbstractOllamaManagedSchema end

"Echoes the user's input back to them. Used for testing the implementation"
@kwdef mutable struct TestEchoOllamaManagedSchema <: AbstractOllamaManagedSchema
response::AbstractDict
status::Integer
model_id::String = ""
inputs::Any = nothing
end

## Dispatch into default schema
const PROMPT_SCHEMA = OpenAISchema()