OpenRouter support, new OpenAI o1 models (#207)
svilupp authored Sep 15, 2024
1 parent 7038175 commit 9a75d2c
Showing 11 changed files with 293 additions and 20 deletions.
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -14,6 +14,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added
- Added support for OpenAI's JSON mode for `aiextract` (just provide kwarg `json_mode=true`). Reference [Structured Outputs](https://platform.openai.com/docs/guides/structured-outputs).
- Added support for OpenRouter's API (you must set ENV `OPENROUTER_API_KEY`) to provide access to more models like Cohere Command R+ and OpenAI's o1 series. Reference [OpenRouter](https://openrouter.ai/).
- Added new OpenRouter-hosted models to the model registry (prefixed with `or`): `oro1` (OpenAI's o1-preview), `oro1m` (OpenAI's o1-mini), `orcop` (Cohere's command-r-plus), `orco` (Cohere's command-r). The `or` prefix avoids conflicts with existing models and OpenAI's aliases; the convention is two letters per model plus one letter for any qualifier (eg, "p" for plus, "m" for mini) -> `orcop` (OpenRouter, COmmand-r, Plus).

### Updated
- Updated FAQ with instructions on how to access new OpenAI o1 models via OpenRouter.
- Updated FAQ with instructions on how to add custom APIs (with an example `examples/adding_custom_API.jl`).

### Fixed
- Fixed a bug in `aiclassify` for the OpenAI GPT-4o models, which use a different tokenizer. Unknown model IDs will throw an error.
40 changes: 40 additions & 0 deletions docs/src/frequently_asked_questions.md
@@ -119,6 +119,33 @@ Assuming the price per call was $0.0001, you'd pay 2 cents for the job and save
Resources:
- [OpenAI Pricing per 1000 tokens](https://openai.com/pricing)

## How to try new OpenAI models if I'm not a Tier 5 customer?

As of September 2024, you cannot access the new o1 models via the API unless you're a Tier 5 customer.

Fortunately, you can use OpenRouter to access these new models.

1) Get your API key from [OpenRouter](https://openrouter.ai/keys).
2) Add a small amount of [Credits](https://openrouter.ai/credits) to your account (eg, $5).
3) Set the key as an environment variable (or use local preferences): `ENV["OPENROUTER_API_KEY"] = "<your key>"`
4) Use the model aliases with the `or` prefix, eg, `oro1` for o1-preview or `oro1m` for o1-mini.

Example:
```julia
# Let's use o1-preview model hosted on OpenRouter ("or" prefix)
msg = aigenerate("What is the meaning of life?"; model="oro1")
```

Note: The o1 models have some quirks.
For example, the o1 series does NOT support `SystemMessage` yet, so OpenRouter applies some tricks (likely converting system messages into normal user messages).
To control this behavior yourself and match the native OpenAI API, pass the kwarg `no_system_message=true` to `aigenerate` to ensure OpenRouter does not apply any tricks.

Example:
```julia
# Let's use o1-mini and disable adding automatic system message
msg = aigenerate("What is the meaning of life?"; model="oro1m", no_system_message=true)
```

## Configuring the Environment Variable for API Key

This is a guide for OpenAI's API key, but it works for any other API key you might need (eg, `MISTRALAI_API_KEY` for the MistralAI API).
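
For example, a minimal sketch of setting the key for the current session only (the key value is a placeholder):

```julia
# Applies only to the current Julia session; see the guide below for persistent options
ENV["OPENAI_API_KEY"] = "<your key>"
```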
@@ -202,6 +229,19 @@ There are three ways you can customize your workflows (especially when you u
2) Register your model and its associated schema (`PT.register_model!(; name="123", schema=PT.OllamaSchema())`). You won't have to specify the schema anymore, only the model name. See [Working with Ollama](#working-with-ollama) for more information.
3) Override your default model (`PT.MODEL_CHAT`) and schema (`PT.PROMPT_SCHEMA`). This can be done persistently with Preferences, eg, `PT.set_preferences!("PROMPT_SCHEMA" => "OllamaSchema", "MODEL_CHAT" => "llama2")`. See the sketch below the list.
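
A minimal sketch of the latter two approaches (assuming a locally running Ollama server that serves a `llama2` model; the names are illustrative):

```julia
using PromptingTools
const PT = PromptingTools

# Approach 2: register the model with its schema once, then refer to it by name only
PT.register_model!(; name = "llama2", schema = PT.OllamaSchema())
msg = aigenerate("Hello!"; model = "llama2")

# Approach 3: change the defaults persistently via Preferences
PT.set_preferences!("PROMPT_SCHEMA" => "OllamaSchema", "MODEL_CHAT" => "llama2")
msg = aigenerate("Hello!")  # uses the new defaults
```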

## Using Custom API Providers like Azure or Databricks

Several providers are directly supported (eg, Databricks); check the available prompt schemas (eg, `subtypes(PT.AbstractOpenAISchema)`).

If you need a custom URL or a few keyword parameters, refer to the implementation of `DatabricksOpenAISchema`.
You effectively need to create your own prompt schema (`struct MySchema <: PT.AbstractOpenAISchema`) and override the OpenAI.jl behavior. The easiest way is to provide a custom method for `OpenAI.create_chat` and customize the `url`, `api_key`, and other `kwargs` fields.
You can follow the implementation of `create_chat` for `DatabricksOpenAISchema` in `src/llm_openAI.jl`.

Once your schema is ready, you can register the necessary models via `PT.register_model!(; name="myschema", schema=MySchema())`.
You can also add aliases for easier access (eg, `PT.MODEL_ALIASES["mymodel"] = "my-model-with-really-long-name"`).
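
Putting the pieces together, a minimal sketch (the schema, model name, and alias are illustrative; the `OpenAI.create_chat` override is elided and would follow the `DatabricksOpenAISchema` pattern described above):

```julia
using PromptingTools
const PT = PromptingTools

# Custom schema; add a matching `PT.OpenAI.create_chat` method for it as described above
struct MySchema <: PT.AbstractOpenAISchema end

# Register the exact model name your API expects, tied to the schema
PT.register_model!(; name = "my-model-with-really-long-name", schema = MySchema())

# Optional: a short alias for easier access
PT.MODEL_ALIASES["mymodel"] = "my-model-with-really-long-name"

msg = aigenerate("Hello!"; model = "mymodel")
```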

If you would like to use some heavily customized API, eg, your company's internal LLM proxy (to change headers, URL paths, etc.), refer to the example `examples/adding_custom_API.jl` in the repo.

## How to have Multi-turn Conversations?

Let's say you would like to respond back to a model's response. How to do it?
85 changes: 85 additions & 0 deletions examples/adding_custom_API.jl
@@ -0,0 +1,85 @@
# Example of custom API integration, eg, custom enterprise proxy with special headers
#
# This should NOT be necessary unless you have a private LLM / private proxy with specialized API structure and headers.
# For most new APIs, you should check out the FAQ on "Using Custom API Providers like Azure or Databricks"
# DatabricksOpenAISchema is a good example of how to do a simple API integration.
#
# For heavily customized APIs, follow the example below. Again, do this only if you have no other choice!!

# We will need to provide a custom "provider" and custom methods for `OpenAI.jl` to override how it builds the AUTH headers and URL.

using PromptingTools
const PT = PromptingTools
using HTTP
using JSON3

## OpenAI.jl work
# Define a custom provider for OpenAI to override the default behavior
abstract type MyCustomProvider <: PT.AbstractCustomProvider end

@kwdef struct MyModelProvider <: MyCustomProvider
    api_key::String = ""
    base_url::String = "https://api.example.com/v1239123/modelxyz/completions_that_are_not_standard"
    api_version::String = ""
end

# Tell OpenAI not to use "api" (=endpoints)
function PT.OpenAI.build_url(provider::MyCustomProvider, api::AbstractString = "")
    string(provider.base_url)
end

function PT.OpenAI.auth_header(
        provider::MyCustomProvider, api_key::AbstractString = provider.api_key)
    ## Note this DOES NOT have any Basic Auth! Assumes you use something custom
    ["Content-Type" => "application/json", "Extra-custom-authorization" => api_key]
end

## PromptingTools.jl work
# Define a custom schema
struct MyCustomSchema <: PT.AbstractOpenAISchema end

# Implement create_chat for the custom schema
function PT.OpenAI.create_chat(schema::MyCustomSchema,
        api_key::AbstractString,
        model::AbstractString,
        conversation;
        url::String = "",
        ## Add any required kwargs here, APIs may have different requirements
        max_tokens::Int = 2048,
        kwargs...)
    ## Depending on your needs, you can get api_key from an ENV variable!!
    ## Eg, api_key = get(ENV, "CUSTOM_API_KEY", "")
    provider = MyModelProvider(; api_key, base_url = url)

    ## The first arg will be ignored, doesn't matter what you put there
    PT.OpenAI.openai_request("ignore-me", provider;
        method = "POST",
        messages = conversation,
        streamcallback = nothing,
        max_tokens = max_tokens,
        model = model,
        kwargs...)
end

## Model registration
## Any alias you like (can be many)
PromptingTools.MODEL_ALIASES["myprecious"] = "custom-model-xyz"
## Register the exact model name to send to your API
PromptingTools.register_model!(;
    name = "custom-model-xyz",
    schema = MyCustomSchema())

## Example usage
api_key = "..." # use ENV to provide this automatically
url = "..." # use ENV to provide this or hardcode in your create_chat function!!
msg = aigenerate("Hello, how are you?"; model = "myprecious", api_kwargs = (; api_key, url))

## Custom usage - no need to register anything
function myai(msg::AbstractString)
    model = "custom-model-xyz"
    schema = MyCustomSchema()
    api_key = "..." # use ENV to provide this automatically
    url = "..." # use ENV to provide this or hardcode in your create_chat function!!
    aigenerate(schema, msg; model, api_kwargs = (; api_key, url))
end
msg = myai("Hello, how are you?")
15 changes: 12 additions & 3 deletions src/llm_anthropic.jl
@@ -9,6 +9,7 @@
        messages::Vector{<:AbstractMessage};
        aiprefill::Union{Nothing, AbstractString} = nothing,
        conversation::AbstractVector{<:AbstractMessage} = AbstractMessage[],
        no_system_message::Bool = false,
        tools::Vector{<:Dict{String, <:Any}} = Dict{String, Any}[],
        cache::Union{Nothing, Symbol} = nothing,
        kwargs...)
@@ -18,13 +19,15 @@ Builds a history of the conversation to provide the prompt to the API. All unspecified kwargs are passed as replacements such that `{{key}}=>value` in the template.
# Keyword Arguments
- `aiprefill`: A string to be used as a prefill for the AI response. This steers the AI response in a certain direction (and potentially saves output tokens).
- `conversation`: Past conversation to be included in the beginning of the prompt (for continued conversations).
- `no_system_message`: If `true`, do not include the default system message in the conversation history (and convert any provided system message to a user message).
- `tools`: A list of tools to be used in the conversation. Added to the end of the system prompt to enforce its use.
- `cache`: A symbol representing the caching strategy to be used. Currently only `nothing` (no caching), `:system`, `:tools`, `:last` and `:all` are supported.
"""
function render(schema::AbstractAnthropicSchema,
        messages::Vector{<:AbstractMessage};
        aiprefill::Union{Nothing, AbstractString} = nothing,
        conversation::AbstractVector{<:AbstractMessage} = AbstractMessage[],
        no_system_message::Bool = false,
        tools::Vector{<:Dict{String, <:Any}} = Dict{String, Any}[],
        cache::Union{Nothing, Symbol} = nothing,
        kwargs...)
@@ -35,7 +38,8 @@
    system = nothing

    ## First pass: keep the message types but make the replacements provided in `kwargs`
    messages_replaced = render(NoSchema(), messages; conversation, kwargs...)
    messages_replaced = render(
        NoSchema(), messages; conversation, no_system_message, kwargs...)

    ## Second pass: convert to the message-based schema
    conversation = Dict{String, Any}[]
@@ -73,7 +77,7 @@ function render(schema::AbstractAnthropicSchema,
    if is_valid_conversation && (cache == :last || cache == :all)
        conversation[end]["content"][end]["cache_control"] = Dict("type" => "ephemeral")
    end
    if !isnothing(system) && (cache == :system || cache == :all)
    if !no_system_message && !isnothing(system) && (cache == :system || cache == :all)
        ## Apply cache for system message
        system = [Dict("type" => "text", "text" => system,
            "cache_control" => Dict("type" => "ephemeral"))]
@@ -214,6 +218,7 @@ end
        return_all::Bool = false, dry_run::Bool = false,
        conversation::AbstractVector{<:AbstractMessage} = AbstractMessage[],
        streamcallback::Any = nothing,
        no_system_message::Bool = false,
        aiprefill::Union{Nothing, AbstractString} = nothing,
        http_kwargs::NamedTuple = NamedTuple(), api_kwargs::NamedTuple = NamedTuple(),
        cache::Union{Nothing, Symbol} = nothing,
@@ -232,6 +237,7 @@ Generate an AI response based on a given prompt using the Anthropic API.
- `conversation::AbstractVector{<:AbstractMessage}=[]`: Not allowed for this schema. Provided only for compatibility.
- `streamcallback::Any`: A callback function to handle streaming responses. Can be simply `stdout` or a `StreamCallback` object. See `?StreamCallback` for details.
  Note: We configure the `StreamCallback` (and necessary `api_kwargs`) for you, unless you specify the `flavor`. See `?configure_callback!` for details.
- `no_system_message::Bool=false`: If `true`, do not include the default system message in the conversation history (and convert any provided system message to a user message).
- `aiprefill::Union{Nothing, AbstractString}`: A string to be used as a prefill for the AI response. This steers the AI response in a certain direction (and potentially saves output tokens). It MUST NOT end with a trailing space. Useful for JSON formatting.
- `http_kwargs::NamedTuple`: Additional keyword arguments for the HTTP request. Defaults to empty `NamedTuple`.
- `api_kwargs::NamedTuple`: Additional keyword arguments for the Anthropic API. Defaults to an empty `NamedTuple`.
@@ -329,6 +335,7 @@ function aigenerate(
        return_all::Bool = false, dry_run::Bool = false,
        conversation::AbstractVector{<:AbstractMessage} = AbstractMessage[],
        streamcallback::Any = nothing,
        no_system_message::Bool = false,
        aiprefill::Union{Nothing, AbstractString} = nothing,
        http_kwargs::NamedTuple = NamedTuple(), api_kwargs::NamedTuple = NamedTuple(),
        cache::Union{Nothing, Symbol} = nothing,
@@ -339,7 +346,8 @@
    @assert (isnothing(aiprefill)||!isempty(strip(aiprefill))) "`aiprefill` must not be empty"
    ## Find the unique ID for the model alias provided
    model_id = get(MODEL_ALIASES, model, model)
    conv_rendered = render(prompt_schema, prompt; aiprefill, conversation, cache, kwargs...)
    conv_rendered = render(
        prompt_schema, prompt; no_system_message, aiprefill, conversation, cache, kwargs...)

    if !dry_run
        @info conv_rendered.conversation
@@ -383,6 +391,7 @@ function aigenerate(
        conversation,
        return_all,
        dry_run,
        no_system_message,
        kwargs...)
    return output
end
15 changes: 12 additions & 3 deletions src/llm_google.jl
@@ -10,21 +10,24 @@ end
    render(schema::AbstractGoogleSchema,
        messages::Vector{<:AbstractMessage};
        conversation::AbstractVector{<:AbstractMessage} = AbstractMessage[],
        no_system_message::Bool = false,
        kwargs...)
Builds a history of the conversation to provide the prompt to the API. All unspecified kwargs are passed as replacements such that `{{key}}=>value` in the template.
# Keyword Arguments
- `conversation`: An optional vector of `AbstractMessage` objects representing the conversation history. If not provided, it is initialized as an empty vector.
- `no_system_message::Bool=false`: If `true`, do not include the default system message in the conversation history (and convert any provided system message to a user message).
"""
function render(schema::AbstractGoogleSchema,
        messages::Vector{<:AbstractMessage};
        conversation::AbstractVector{<:AbstractMessage} = AbstractMessage[],
        no_system_message::Bool = false,
        kwargs...)
    ##
    ## First pass: keep the message types but make the replacements provided in `kwargs`
    messages_replaced = render(NoSchema(), messages; conversation, kwargs...)
    messages_replaced = render(
        NoSchema(), messages; conversation, no_system_message, kwargs...)

    ## Second pass: convert to the Google schema
    conversation = Dict{Symbol, Any}[]
@@ -78,6 +81,8 @@ end
        verbose::Bool = true,
        api_key::String = GOOGLE_API_KEY,
        model::String = "gemini-pro", return_all::Bool = false, dry_run::Bool = false,
        conversation::AbstractVector{<:AbstractMessage} = AbstractMessage[],
        no_system_message::Bool = false,
        http_kwargs::NamedTuple = (retry_non_idempotent = true,
            retries = 5,
            readtimeout = 120), api_kwargs::NamedTuple = NamedTuple(),
@@ -98,6 +103,7 @@ Note:
- `return_all::Bool=false`: If `true`, returns the entire conversation history, otherwise returns only the last message (the `AIMessage`).
- `dry_run::Bool=false`: If `true`, skips sending the messages to the model (for debugging, often used with `return_all=true`).
- `conversation`: An optional vector of `AbstractMessage` objects representing the conversation history. If not provided, it is initialized as an empty vector.
- `no_system_message::Bool=false`: If `true`, do not include the default system message in the conversation history (and convert any provided system message to a user message).
- `http_kwargs`: A named tuple of HTTP keyword arguments.
- `api_kwargs`: A named tuple of API keyword arguments.
- `kwargs`: Prompt variables to be used to fill the prompt/template
@@ -151,6 +157,7 @@ function aigenerate(prompt_schema::AbstractGoogleSchema, prompt::ALLOWED_PROMPT_
        api_key::String = GOOGLE_API_KEY,
        model::String = "gemini-pro", return_all::Bool = false, dry_run::Bool = false,
        conversation::AbstractVector{<:AbstractMessage} = AbstractMessage[],
        no_system_message::Bool = false,
        http_kwargs::NamedTuple = (retry_non_idempotent = true,
            retries = 5,
            readtimeout = 120), api_kwargs::NamedTuple = NamedTuple(),
@@ -166,7 +173,8 @@

    ## Find the unique ID for the model alias provided
    model_id = get(MODEL_ALIASES, model, model)
    conv_rendered = render(prompt_schema, prompt; conversation, kwargs...)
    conv_rendered = render(
        prompt_schema, prompt; conversation, no_system_message, kwargs...)

    if !dry_run
        time = @elapsed r = ggi_generate_content(prompt_schema, api_key,
@@ -195,6 +203,7 @@ function aigenerate(prompt_schema::AbstractGoogleSchema, prompt::ALLOWED_PROMPT_
        conversation,
        return_all,
        dry_run,
        no_system_message,
        kwargs...)

    return output
15 changes: 15 additions & 0 deletions src/llm_interface.jl
@@ -207,6 +207,21 @@ Requires one environment variable to be set:
"""
struct DeepSeekOpenAISchema <: AbstractOpenAISchema end

"""
    OpenRouterOpenAISchema

Schema to call the [OpenRouter](https://openrouter.ai/) API.

Links:
- [Get your API key](https://openrouter.ai/keys)
- [API Reference](https://openrouter.ai/docs)
- [Available models](https://openrouter.ai/models)

Requires one environment variable to be set:
- `OPENROUTER_API_KEY`: Your API key
"""
struct OpenRouterOpenAISchema <: AbstractOpenAISchema end

abstract type AbstractOllamaSchema <: AbstractPromptSchema end

"""