OpenRouter support, new OpenAI o1 models (#207)
svilupp authored Sep 15, 2024
1 parent 7038175 commit 9a75d2c
Showing 11 changed files with 293 additions and 20 deletions.
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -14,6 +14,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added
- Added support for OpenAI's JSON mode for `aiextract` (just provide kwarg `json_mode=true`). Reference [Structured Outputs](https://platform.openai.com/docs/guides/structured-outputs).
- Added support for OpenRouter's API (you must set ENV `OPENROUTER_API_KEY`) to provide access to more models like Cohere Command R+ and OpenAI's o1 series. Reference [OpenRouter](https://openrouter.ai/).
- Added new OpenRouter-hosted models to the model registry (prefixed with `or`): `oro1` (OpenAI's o1-preview), `oro1m` (OpenAI's o1-mini), `orcop` (Cohere's command-r-plus), `orco` (Cohere's command-r). The `or` prefix avoids conflicts with existing models and OpenAI's aliases; the convention is two letters per model plus one letter for any qualifier (eg, "p" for plus, "m" for mini) -> `orcop` (OpenRouter, COmmand-r, Plus).

### Updated
- Updated FAQ with instructions on how to access new OpenAI o1 models via OpenRouter.
- Updated FAQ with instructions on how to add custom APIs (with an example `examples/adding_custom_API.jl`).

### Fixed
- Fixed a bug in `aiclassify` for the OpenAI GPT-4o models, which use a different tokenizer. Unknown model IDs will throw an error.
40 changes: 40 additions & 0 deletions docs/src/frequently_asked_questions.md
@@ -119,6 +119,33 @@ Assuming the price per call was $0.0001, you'd pay 2 cents for the job and save
Resources:
- [OpenAI Pricing per 1000 tokens](https://openai.com/pricing)

## How to try new OpenAI models if I'm not a Tier 5 customer?

As of September 2024, you cannot access the new o1 models via the API unless you're a Tier 5 customer.

Fortunately, you can use OpenRouter to access these new models.

1) Get your API key from [OpenRouter](https://openrouter.ai/keys).
2) Add a small amount of [Credits](https://openrouter.ai/credits) to your account (eg, $5).
3) Set the key as an environment variable (or use local preferences): `ENV["OPENROUTER_API_KEY"] = "<your key>"`
4) Use the model aliases with the `or` prefix, eg, `oro1` for o1-preview or `oro1m` for o1-mini.

Example:
```julia
# Let's use o1-preview model hosted on OpenRouter ("or" prefix)
msg = aigenerate("What is the meaning of life?"; model="oro1")
```

Note: The o1 models have some quirks.
For example, the o1 series does NOT support `SystemMessage` yet, so OpenRouter applies some tricks (likely converting system messages into normal user messages).
To control this behavior yourself and match the native OpenAI API, pass the kwarg `no_system_message=true` to `aigenerate` to ensure OpenRouter does not apply any tricks.

Example:
```julia
# Let's use o1-mini and disable adding automatic system message
msg = aigenerate("What is the meaning of life?"; model="oro1m", no_system_message=true)
```

## Configuring the Environment Variable for API Key

This is a guide for OpenAI's API key, but it works for any other API key you might need (eg, `MISTRALAI_API_KEY` for the MistralAI API).
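
For example, a minimal sketch of setting the key for the current session only (the key value is a placeholder):

```julia
# Applies only to the current Julia session; see the guide below for persistent options
ENV["OPENAI_API_KEY"] = "<your key>"
```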
@@ -202,6 +229,19 @@ There are three ways you can customize your workflows (especially when you u
2) Register your model and its associated schema (`PT.register_model!(; name="123", schema=PT.OllamaSchema())`). You won't have to specify the schema anymore, only the model name. See [Working with Ollama](#working-with-ollama) for more information.
3) Override your default model (`PT.MODEL_CHAT`) and schema (`PT.PROMPT_SCHEMA`). This can be done persistently with Preferences, eg, `PT.set_preferences!("PROMPT_SCHEMA" => "OllamaSchema", "MODEL_CHAT" => "llama2")`. See the sketch below the list.
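
A minimal sketch of the latter two approaches (assuming a locally running Ollama server that serves a `llama2` model; the names are illustrative):

```julia
using PromptingTools
const PT = PromptingTools

# Approach 2: register the model with its schema once, then refer to it by name only
PT.register_model!(; name = "llama2", schema = PT.OllamaSchema())
msg = aigenerate("Hello!"; model = "llama2")

# Approach 3: change the defaults persistently via Preferences
PT.set_preferences!("PROMPT_SCHEMA" => "OllamaSchema", "MODEL_CHAT" => "llama2")
msg = aigenerate("Hello!")  # uses the new defaults
```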

## Using Custom API Providers like Azure or Databricks

Several providers are directly supported (eg, Databricks); check the available prompt schemas (eg, `subtypes(PT.AbstractOpenAISchema)`).

If you need a custom URL or a few keyword parameters, refer to the implementation of `DatabricksOpenAISchema`.
You effectively need to create your own prompt schema (`struct MySchema <: PT.AbstractOpenAISchema`) and override the OpenAI.jl behavior. The easiest way is to provide a custom method for `OpenAI.create_chat` and customize the `url`, `api_key`, and other `kwargs` fields.
You can follow the implementation of `create_chat` for `DatabricksOpenAISchema` in `src/llm_openAI.jl`.

Once your schema is ready, you can register the necessary models via `PT.register_model!(; name="myschema", schema=MySchema())`.
You can also add aliases for easier access (eg, `PT.MODEL_ALIASES["mymodel"] = "my-model-with-really-long-name"`).
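
Putting the pieces together, a minimal sketch (the schema, model name, and alias are illustrative; the `OpenAI.create_chat` override is elided and would follow the `DatabricksOpenAISchema` pattern described above):

```julia
using PromptingTools
const PT = PromptingTools

# Custom schema; add a matching `PT.OpenAI.create_chat` method for it as described above
struct MySchema <: PT.AbstractOpenAISchema end

# Register the exact model name your API expects, tied to the schema
PT.register_model!(; name = "my-model-with-really-long-name", schema = MySchema())

# Optional: a short alias for easier access
PT.MODEL_ALIASES["mymodel"] = "my-model-with-really-long-name"

msg = aigenerate("Hello!"; model = "mymodel")
```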

If you would like to use some heavily customized API, eg, your company's internal LLM proxy (to change headers, URL paths, etc.), refer to the example `examples/adding_custom_API.jl` in the repo.

## How to have Multi-turn Conversations?

Let's say you would like to respond back to a model's response. How to do it?
85 changes: 85 additions & 0 deletions examples/adding_custom_API.jl
@@ -0,0 +1,85 @@
# Example of custom API integration, eg, custom enterprise proxy with special headers
#
# This should NOT be necessary unless you have a private LLM / private proxy with specialized API structure and headers.
# For most new APIs, you should check out the FAQ on "Using Custom API Providers like Azure or Databricks"
# DatabricksOpenAISchema is a good example of how to do a simple API integration.
#
# For heavily customized APIs, follow the example below. Again, do this only if you have no other choice!!

# We will need to provide a custom "provider" and custom methods for `OpenAI.jl` to override how it builds the AUTH headers and URL.

using PromptingTools
const PT = PromptingTools
using HTTP
using JSON3

## OpenAI.jl work
# Define a custom provider for OpenAI to override the default behavior
abstract type MyCustomProvider <: PT.AbstractCustomProvider end

@kwdef struct MyModelProvider <: MyCustomProvider
    api_key::String = ""
    base_url::String = "https://api.example.com/v1239123/modelxyz/completions_that_are_not_standard"
    api_version::String = ""
end

# Tell OpenAI not to use "api" (=endpoints)
function PT.OpenAI.build_url(provider::MyCustomProvider, api::AbstractString = "")
    string(provider.base_url)
end

function PT.OpenAI.auth_header(
        provider::MyCustomProvider, api_key::AbstractString = provider.api_key)
    ## Note this DOES NOT have any Basic Auth! Assumes you use something custom
    ["Content-Type" => "application/json", "Extra-custom-authorization" => api_key]
end

## PromptingTools.jl work
# Define a custom schema
struct MyCustomSchema <: PT.AbstractOpenAISchema end

# Implement create_chat for the custom schema
function PT.OpenAI.create_chat(schema::MyCustomSchema,
        api_key::AbstractString,
        model::AbstractString,
        conversation;
        url::String = "",
        ## Add any required kwargs here, APIs may have different requirements
        max_tokens::Int = 2048,
        kwargs...)
    ## Depending on your needs, you can get api_key from an ENV variable!!
    ## Eg, api_key = get(ENV, "CUSTOM_API_KEY", "")
    provider = MyModelProvider(; api_key, base_url = url)

    ## The first arg will be ignored, doesn't matter what you put there
    PT.OpenAI.openai_request("ignore-me", provider;
        method = "POST",
        messages = conversation,
        streamcallback = nothing,
        max_tokens = max_tokens,
        model = model,
        kwargs...)
end

## Model registration
## Any alias you like (can be many)
PromptingTools.MODEL_ALIASES["myprecious"] = "custom-model-xyz"
## Register the exact model name to send to your API
PromptingTools.register_model!(;
    name = "custom-model-xyz",
    schema = MyCustomSchema())

## Example usage
api_key = "..." # use ENV to provide this automatically
url = "..." # use ENV to provide this or hardcode in your create_chat function!!
msg = aigenerate("Hello, how are you?"; model = "myprecious", api_kwargs = (; api_key, url))

## Custom usage - no need to register anything
function myai(msg::AbstractString)
    model = "custom-model-xyz"
    schema = MyCustomSchema()
    api_key = "..." # use ENV to provide this automatically
    url = "..." # use ENV to provide this or hardcode in your create_chat function!!
    aigenerate(schema, msg; model, api_kwargs = (; api_key, url))
end
msg = myai("Hello, how are you?")
15 changes: 12 additions & 3 deletions src/llm_anthropic.jl
@@ -9,6 +9,7 @@
        messages::Vector{<:AbstractMessage};
        aiprefill::Union{Nothing, AbstractString} = nothing,
        conversation::AbstractVector{<:AbstractMessage} = AbstractMessage[],
        no_system_message::Bool = false,
        tools::Vector{<:Dict{String, <:Any}} = Dict{String, Any}[],
        cache::Union{Nothing, Symbol} = nothing,
        kwargs...)
@@ -18,13 +19,15 @@ Builds a history of the conversation to provide the prompt to the API. All unspecified kwargs are passed as replacements such that `{{key}}=>value` in the template.
# Keyword Arguments
- `aiprefill`: A string to be used as a prefill for the AI response. This steers the AI response in a certain direction (and potentially saves output tokens).
- `conversation`: Past conversation to be included in the beginning of the prompt (for continued conversations).
- `no_system_message`: If `true`, do not include the default system message in the conversation history (and convert any provided system message to a user message).
- `tools`: A list of tools to be used in the conversation. Added to the end of the system prompt to enforce its use.
- `cache`: A symbol representing the caching strategy to be used. Currently only `nothing` (no caching), `:system`, `:tools`, `:last` and `:all` are supported.
"""
function render(schema::AbstractAnthropicSchema,
        messages::Vector{<:AbstractMessage};
        aiprefill::Union{Nothing, AbstractString} = nothing,
        conversation::AbstractVector{<:AbstractMessage} = AbstractMessage[],
        no_system_message::Bool = false,
        tools::Vector{<:Dict{String, <:Any}} = Dict{String, Any}[],
        cache::Union{Nothing, Symbol} = nothing,
        kwargs...)
@@ -35,7 +38,8 @@
    system = nothing

    ## First pass: keep the message types but make the replacements provided in `kwargs`
    messages_replaced = render(NoSchema(), messages; conversation, kwargs...)
    messages_replaced = render(
        NoSchema(), messages; conversation, no_system_message, kwargs...)

    ## Second pass: convert to the message-based schema
    conversation = Dict{String, Any}[]
@@ -73,7 +77,7 @@ function render(schema::AbstractAnthropicSchema,
    if is_valid_conversation && (cache == :last || cache == :all)
        conversation[end]["content"][end]["cache_control"] = Dict("type" => "ephemeral")
    end
    if !isnothing(system) && (cache == :system || cache == :all)
    if !no_system_message && !isnothing(system) && (cache == :system || cache == :all)
        ## Apply cache for system message
        system = [Dict("type" => "text", "text" => system,
            "cache_control" => Dict("type" => "ephemeral"))]
@@ -214,6 +218,7 @@ end
        return_all::Bool = false, dry_run::Bool = false,
        conversation::AbstractVector{<:AbstractMessage} = AbstractMessage[],
        streamcallback::Any = nothing,
        no_system_message::Bool = false,
        aiprefill::Union{Nothing, AbstractString} = nothing,
        http_kwargs::NamedTuple = NamedTuple(), api_kwargs::NamedTuple = NamedTuple(),
        cache::Union{Nothing, Symbol} = nothing,
@@ -232,6 +237,7 @@ Generate an AI response based on a given prompt using the Anthropic API.
- `conversation::AbstractVector{<:AbstractMessage}=[]`: Not allowed for this schema. Provided only for compatibility.
- `streamcallback::Any`: A callback function to handle streaming responses. Can be simply `stdout` or a `StreamCallback` object. See `?StreamCallback` for details.
  Note: We configure the `StreamCallback` (and necessary `api_kwargs`) for you, unless you specify the `flavor`. See `?configure_callback!` for details.
- `no_system_message::Bool=false`: If `true`, do not include the default system message in the conversation history (and convert any provided system message to a user message).
- `aiprefill::Union{Nothing, AbstractString}`: A string to be used as a prefill for the AI response. This steers the AI response in a certain direction (and potentially saves output tokens). It MUST NOT end with a trailing space. Useful for JSON formatting.
- `http_kwargs::NamedTuple`: Additional keyword arguments for the HTTP request. Defaults to empty `NamedTuple`.
- `api_kwargs::NamedTuple`: Additional keyword arguments for the Anthropic API. Defaults to an empty `NamedTuple`.
@@ -329,6 +335,7 @@ function aigenerate(
        return_all::Bool = false, dry_run::Bool = false,
        conversation::AbstractVector{<:AbstractMessage} = AbstractMessage[],
        streamcallback::Any = nothing,
        no_system_message::Bool = false,
        aiprefill::Union{Nothing, AbstractString} = nothing,
        http_kwargs::NamedTuple = NamedTuple(), api_kwargs::NamedTuple = NamedTuple(),
        cache::Union{Nothing, Symbol} = nothing,
@@ -339,7 +346,8 @@
    @assert (isnothing(aiprefill)||!isempty(strip(aiprefill))) "`aiprefill` must not be empty"
    ## Find the unique ID for the model alias provided
    model_id = get(MODEL_ALIASES, model, model)
    conv_rendered = render(prompt_schema, prompt; aiprefill, conversation, cache, kwargs...)
    conv_rendered = render(
        prompt_schema, prompt; no_system_message, aiprefill, conversation, cache, kwargs...)

    if !dry_run
        @info conv_rendered.conversation
@@ -383,6 +391,7 @@ function aigenerate(
        conversation,
        return_all,
        dry_run,
        no_system_message,
        kwargs...)
    return output
end
15 changes: 12 additions & 3 deletions src/llm_google.jl
@@ -10,21 +10,24 @@ end
    render(schema::AbstractGoogleSchema,
        messages::Vector{<:AbstractMessage};
        conversation::AbstractVector{<:AbstractMessage} = AbstractMessage[],
        no_system_message::Bool = false,
        kwargs...)
Builds a history of the conversation to provide the prompt to the API. All unspecified kwargs are passed as replacements such that `{{key}}=>value` in the template.
# Keyword Arguments
- `conversation`: An optional vector of `AbstractMessage` objects representing the conversation history. If not provided, it is initialized as an empty vector.
- `no_system_message::Bool=false`: If `true`, do not include the default system message in the conversation history (and convert any provided system message to a user message).
"""
function render(schema::AbstractGoogleSchema,
        messages::Vector{<:AbstractMessage};
        conversation::AbstractVector{<:AbstractMessage} = AbstractMessage[],
        no_system_message::Bool = false,
        kwargs...)
    ##
    ## First pass: keep the message types but make the replacements provided in `kwargs`
    messages_replaced = render(NoSchema(), messages; conversation, kwargs...)
    messages_replaced = render(
        NoSchema(), messages; conversation, no_system_message, kwargs...)

    ## Second pass: convert to the Google schema
    conversation = Dict{Symbol, Any}[]
@@ -78,6 +81,8 @@ end
        verbose::Bool = true,
        api_key::String = GOOGLE_API_KEY,
        model::String = "gemini-pro", return_all::Bool = false, dry_run::Bool = false,
        conversation::AbstractVector{<:AbstractMessage} = AbstractMessage[],
        no_system_message::Bool = false,
        http_kwargs::NamedTuple = (retry_non_idempotent = true,
            retries = 5,
            readtimeout = 120), api_kwargs::NamedTuple = NamedTuple(),
@@ -98,6 +103,7 @@ Note:
- `return_all::Bool=false`: If `true`, returns the entire conversation history, otherwise returns only the last message (the `AIMessage`).
- `dry_run::Bool=false`: If `true`, skips sending the messages to the model (for debugging, often used with `return_all=true`).
- `conversation`: An optional vector of `AbstractMessage` objects representing the conversation history. If not provided, it is initialized as an empty vector.
- `no_system_message::Bool=false`: If `true`, do not include the default system message in the conversation history (and convert any provided system message to a user message).
- `http_kwargs`: A named tuple of HTTP keyword arguments.
- `api_kwargs`: A named tuple of API keyword arguments.
- `kwargs`: Prompt variables to be used to fill the prompt/template
@@ -151,6 +157,7 @@ function aigenerate(prompt_schema::AbstractGoogleSchema, prompt::ALLOWED_PROMPT_
        api_key::String = GOOGLE_API_KEY,
        model::String = "gemini-pro", return_all::Bool = false, dry_run::Bool = false,
        conversation::AbstractVector{<:AbstractMessage} = AbstractMessage[],
        no_system_message::Bool = false,
        http_kwargs::NamedTuple = (retry_non_idempotent = true,
            retries = 5,
            readtimeout = 120), api_kwargs::NamedTuple = NamedTuple(),
@@ -166,7 +173,8 @@

    ## Find the unique ID for the model alias provided
    model_id = get(MODEL_ALIASES, model, model)
    conv_rendered = render(prompt_schema, prompt; conversation, kwargs...)
    conv_rendered = render(
        prompt_schema, prompt; conversation, no_system_message, kwargs...)

    if !dry_run
        time = @elapsed r = ggi_generate_content(prompt_schema, api_key,
@@ -195,6 +203,7 @@ function aigenerate(prompt_schema::AbstractGoogleSchema, prompt::ALLOWED_PROMPT_
        conversation,
        return_all,
        dry_run,
        no_system_message,
        kwargs...)

    return output
15 changes: 15 additions & 0 deletions src/llm_interface.jl
@@ -207,6 +207,21 @@ Requires one environment variable to be set:
"""
struct DeepSeekOpenAISchema <: AbstractOpenAISchema end

"""
    OpenRouterOpenAISchema

Schema to call the [OpenRouter](https://openrouter.ai/) API.

Links:
- [Get your API key](https://openrouter.ai/keys)
- [API Reference](https://openrouter.ai/docs)
- [Available models](https://openrouter.ai/models)

Requires one environment variable to be set:
- `OPENROUTER_API_KEY`: Your API key
"""
struct OpenRouterOpenAISchema <: AbstractOpenAISchema end

abstract type AbstractOllamaSchema <: AbstractPromptSchema end

"""