# Releases: svilupp/PromptingTools.jl
## PromptingTools v0.13.0
### Added
- Added initial support for Google Gemini models for `aigenerate` (requires environment variable `GOOGLE_API_KEY` and package GoogleGenAI.jl to be loaded). It must be loaded explicitly as it's not yet registered.
- Added a utility to compare any two string sequences (and other iterators): `length_longest_common_subsequence`. It can be used to fuzzy-match strings (eg, detecting context/sources in an AI-generated response or fuzzy-matching an AI response to preset categories). See the docstring for more information: `?length_longest_common_subsequence`.
- Rewrite of `aiclassify` to classify into an arbitrary list of categories (including with descriptions). It's a quick and easy option for "routing" and similar use cases, as it exploits the logit-bias trick and outputs only 1 token. Currently, only `OpenAISchema` is supported. See `?aiclassify` for more information.
- Initial support for multiple completions in one request for OpenAI-compatible API servers. Set via API kwarg `n=5` and it will request 5 completions in one request, saving the network communication time and paying the prompt tokens only once. It's useful for majority voting, diversity, or challenging agentic workflows.
- Added new fields to `AIMessage` and `DataMessage` types to simplify tracking in complex applications. Added fields:
  - `cost` - the cost of the query (summary per call, so count it only once if you requested multiple completions in one call)
  - `log_prob` - summary log probability of the generated sequence; set API kwarg `logprobs=true` to receive it
  - `run_id` - ID of the AI API call
  - `sample_id` - ID of the sample in the batch if you requested multiple completions, otherwise `sample_id==nothing` (they will have the same `run_id`)
  - `finish_reason` - the reason why the AI stopped generating the sequence (eg, "stop", "length") to provide more visibility for the user
- Support for Fireworks.ai and Together.ai providers for fast and easy access to open-source models. Requires environment variables `FIREWORKS_API_KEY` and `TOGETHER_API_KEY` to be set, respectively. See `?FireworksOpenAISchema` and `?TogetherOpenAISchema` for more information.
- Added an `extra` field to the `ChunkIndex` object for RAG workloads to allow additional flexibility with metadata for each document chunk (assumed to be a vector of the same length as the document chunks).
- Added `airetry` function to `PromptingTools.Experimental.AgentTools` to allow "guided" automatic retries of AI calls (eg, `AIGenerate`, the "lazy" counterpart of `aigenerate`) if a given condition fails. It's useful for robustness and reliability in agentic workflows. You can provide conditions as functions, and the same holds for feedback to the model. See a guessing-game example in `?airetry`.
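The new comparison utility is a classic longest-common-subsequence computation. A minimal dynamic-programming sketch of the same idea (the `lcs_length` function below is illustrative, not the package's actual implementation):

```julia
# Dynamic-programming LCS length over any two iterables (illustrative sketch).
function lcs_length(a, b)
    x, y = collect(a), collect(b)
    m, n = length(x), length(y)
    # dp[i+1, j+1] holds the LCS length of x[1:i] and y[1:j]
    dp = zeros(Int, m + 1, n + 1)
    for i in 1:m, j in 1:n
        dp[i+1, j+1] = x[i] == y[j] ? dp[i, j] + 1 : max(dp[i, j+1], dp[i+1, j])
    end
    return dp[m+1, n+1]
end

lcs_length("PromptingTools", "Prompts")  # shares the subsequence "Prompts" -> 7
```

A longer shared subsequence means a better fuzzy match, which is how the utility can route an AI response to the closest preset category.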
### Updated
- Updated names of endpoints and prices of Mistral.ai models as per the latest announcement and pricing. Eg, `mistral-small` -> `mistral-small-latest`. In addition, the latest Mistral model has been added: `mistral-large-latest` (aliased as `mistral-large` and `mistrall`; same for the others). `mistral-small-latest` and `mistral-large-latest` now support function calling, which means they will work with `aiextract` (you need to explicitly provide `tool_choice`; see the docs: `?aiextract`).
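A hedged usage sketch of function calling with the updated Mistral models. The `MyMeasurement` struct and the `tool_choice = "any"` value are assumptions for illustration; consult `?aiextract` for the authoritative kwargs:

```julia
using PromptingTools

# Hypothetical return type for structured extraction (illustrative only)
struct MyMeasurement
    height::Float64
    weight::Float64
end

# Mistral models need an explicit tool_choice; the value "any" is assumed here
msg = aiextract("James is 180cm tall and weighs 77kg.";
    return_type = MyMeasurement,
    model = "mistral-large-latest",
    api_kwargs = (; tool_choice = "any"))
```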
### Removed
- Removed package extension for GoogleGenAI.jl, as it's not yet registered. Users must load the code manually for now.
## PromptingTools v0.12.0
### Added
- Added more specific kwargs in `Experimental.RAGTools.airag` to give more control over each type of AI call (ie, `aiembed_kwargs`, `aigenerate_kwargs`, `aiextract_kwargs`)
- Moved up compat bounds for OpenAI.jl to 0.9
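A sketch of the finer-grained control (assumes an `index` built earlier with `build_index`, a `question` keyword as in the package examples, and illustrative model names):

```julia
using PromptingTools
using PromptingTools.Experimental.RAGTools

# `index` is assumed to have been built earlier with build_index(...)
result = airag(index;
    question = "What does the package do?",
    aiembed_kwargs = (; model = "text-embedding-3-small"),  # embedding step
    aigenerate_kwargs = (; model = "gpt4t"))                # answer generation
```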
### Fixed
- Fixed a bug where obtaining an API key from ENV would get precompiled as well, causing an error if the ENV variable was not set at the time of precompilation. Now, we save the result of `get(ENV,...)` into a separate variable to avoid it being compiled away.
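The underlying pattern, sketched with a hypothetical variable name: the ENV lookup must happen at call time (and must not be inlined), not at precompilation time:

```julia
# Reading ENV inside a @noinline helper prevents the lookup from being
# constant-folded into the precompiled code when the variable is unset.
# "MY_API_KEY" is a hypothetical name for illustration.
@noinline load_api_key() = get(ENV, "MY_API_KEY", "")

# The key is now fetched lazily, whenever the function is actually called.
```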
## PromptingTools v0.11.0
### Added
- Support for Databricks Foundation Models API. Requires two environment variables to be set: `DATABRICKS_API_KEY` and `DATABRICKS_HOST` (the part of the URL before `/serving-endpoints/`)
- Experimental support for API tools to enhance your LLM workflows: `Experimental.APITools.create_websearch` function, which can execute and summarize a web search (incl. filtering on specific domains). It requires `TAVILY_API_KEY` to be set in the environment. Get your own key from Tavily - the free tier enables c. 1000 searches/month, which should be more than enough to get started.
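A minimal usage sketch (requires `TAVILY_API_KEY`; the `include_domains` kwarg name is an assumption - check the docstring for the exact filtering options):

```julia
using PromptingTools
const PT = PromptingTools

# Plain web search, summarized by the tool
result = PT.Experimental.APITools.create_websearch("Latest Julia releases")

# Filtering on specific domains (kwarg name assumed; see the docstring)
result = PT.Experimental.APITools.create_websearch("Julia benchmarks";
    include_domains = ["julialang.org"])
```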
### Fixed
- Added an option to reduce the "batch size" for the embedding step in building the RAG index (`build_index`, `get_embeddings`). Set `embedding_kwargs = (; target_batch_size_length=10_000, ntasks=1)` if you're having limit issues with your provider.
- Better error message if RAGTools are only partially imported (requires `LinearAlgebra` and `SparseArrays` to load the extension).
## PromptingTools v0.10.0
### Added
- [BREAKING CHANGE] The default embedding model (`MODEL_EMBEDDING`) changes to "text-embedding-3-small" effective immediately (lower cost, higher performance). The default chat model (`MODEL_CHAT`) will be changed by OpenAI to 0125 (from 0613) by mid-February. If you have older embeddings or rely on the exact chat model version, please set the model explicitly in your code or in your preferences.
- New OpenAI models added to the model registry (see the release notes).
  - "gpt4t" refers to whichever is the latest GPT-4 Turbo model ("gpt-4-0125-preview" at the time of writing)
  - "gpt3t" refers to the latest GPT-3.5 Turbo model, version 0125, which is 25-50% cheaper and has updated knowledge (available from February 2024; you will get an error in the interim)
  - "gpt3" still refers to the general endpoint "gpt-3.5-turbo", which OpenAI will move to version 0125 by mid-February (ie, "gpt3t" will be the same as "gpt3" then; we have reflected the approximate cost in the model registry, but note that it will be incorrect in the transition period)
  - "emb3small" refers to the small version of the new embedding model (dim=1536), which is 5x cheaper than Ada and promises higher quality
  - "emb3large" refers to the large version of the new embedding model (dim=3072), which is only 30% more expensive than Ada
- Improved AgentTools: added more information and specific methods to `aicode_feedback` and `error_feedback` to pass more targeted feedback/tips to the AIAgent
- Improved detection of which lines were the source of error during `AICode` evaluation + forcing the error details to be printed in `AICode(...).stdout` for downstream analysis.
- Improved detection of Base/Main method overrides in `AICode` evaluation (it only warns about the fact), but you can use `detect_base_main_overrides(code)` for custom handling
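The aliases can be used anywhere a `model` kwarg is accepted; a quick sketch (requires `OPENAI_API_KEY`):

```julia
using PromptingTools

msg = aigenerate("Say hi!"; model = "gpt4t")  # latest GPT-4 Turbo
msg = aigenerate("Say hi!"; model = "gpt3t")  # GPT-3.5 Turbo, version 0125

emb = aiembed("Some text to embed"; model = "emb3small")  # dim=1536
```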
### Fixed
- Fixed typos in the documentation
- Fixed a bug where API keys set in ENV would not be picked up by the package (caused by inlining of the `get(ENV,...)` call during precompilation)
- Fixed string interpolation to be correctly escaped when evaluating `AICode`
### Commits
Merged pull requests:
- Fix tests on model costs (#58) (@svilupp)
- Re-apply format (#59) (@svilupp)
- Add devcontainer.json (#60) (@svilupp)
- Fix API key getter with @noinline (#61) (@svilupp)
- Improve code feedback (#62) (@svilupp)
- Improve error capture + error lines capture (#63) (@svilupp)
- Escape fix in code loading (#64) (@svilupp)
- Detect Base method overrides (#65) (@svilupp)
- Tag v0.10 (#66) (@svilupp)
Closed issues:
- ERROR: ArgumentError: api_key cannot be empty (#57)
## PromptingTools v0.9.0
### Added
- Split `Experimental.RAGTools.build_index` into smaller functions for easier sharing with other packages (`get_chunks`, `get_embeddings`, `get_metadata`)
- Added support for a Cohere-based RAG re-ranking strategy (and introduced the associated `COHERE_API_KEY` global variable and ENV variable)
## PromptingTools v0.8.1
### Fixed
- Fixed `split_by_length` to not mutate the `separators` argument (appeared in RAG use cases where we repeatedly apply splits to different documents)
## PromptingTools v0.8.0
### Added
- Initial support for Llama.jl and other local servers. Once your server is started, simply use `model="local"` to route your queries to the local server, eg, `ai"Say hi!"local`. An option to permanently set the `LOCAL_SERVER` (URL) has been added to preference management. See `?LocalServerOpenAISchema` for more information.
- Added a new template, `StorytellerExplainSHAP` (see the metadata)
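A quick sketch of routing to a local server (assumes the server is already running on the configured `LOCAL_SERVER` URL):

```julia
using PromptingTools

msg = aigenerate("Say hi!"; model = "local")
# or, equivalently, with the string-macro flavor:
ai"Say hi!"local
```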
### Fixed
- Repeated calls to Ollama models were failing due to a missing `prompt_eval_count` key in subsequent calls.
### Commits
Merged pull requests:
- Fix typos (#49) (@pitmonticone)
- Add LocalServerOpenAISchema to support Llama.jl (#50) (@svilupp)
- Fix ollama repeated calls (#52) (@svilupp)
- New template and version update (#53) (@svilupp)
Closed issues:
- Ollama: repeated request with same prompt fails (#51)
## PromptingTools v0.7.0
### Added
- Added a new Experimental sub-module, AgentTools, introducing `AICall` (incl. `AIGenerate`) and `AICodeFixer` structs. The `AICall` struct provides a "lazy" wrapper for `ai*` functions, enabling efficient and flexible AI interactions and building agentic workflows.
- Added the first AI agent: `AICodeFixer`, which iteratively analyzes and improves any code provided by an LLM by evaluating it in a sandbox. It allows a lot of customization (templated responses, feedback function, etc.). See `?AICodeFixer` for more information on usage and `?aicodefixer_feedback` for an example implementation of the feedback function.
- Added `@timeout` macro to allow limiting the execution time of a block of code in `AICode` via the `execution_timeout` kwarg (prevents infinite loops, etc.). See `?AICode` for more information.
- Added `preview(conversation)` utility that allows you to quickly preview the conversation in Markdown format in your REPL. Requires the `Markdown` package for the extension to be loaded.
- Added `ItemsExtract` convenience wrapper for `aiextract` when you want to extract one or more of a specific `return_type` (eg, `return_type = ItemsExtract{MyMeasurement}`)
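A hedged sketch of the `ItemsExtract` wrapper (the `MyMeasurement` struct is illustrative, echoing the example in the notes; requires an API key):

```julia
using PromptingTools

# Hypothetical extraction target (illustrative only)
struct MyMeasurement
    height::Float64
end

# Extract one or more instances in a single call via the wrapper
msg = aiextract("James is 180cm tall; Jane is 165cm tall.";
    return_type = ItemsExtract{MyMeasurement})
# the extracted vector is available on the returned DataMessage
```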
### Fixed
- Fixed `aiembed` to accept any `AbstractVector` of documents (eg, a view of a vector of documents)
## PromptingTools v0.6.0
### Added
- `@ai_str` macros now support multi-turn conversations. The `ai"something"` call will automatically remember the last conversation, so you can simply reply with `ai!"my-reply"`. If you send another message with `ai""`, you'll start a new conversation. Same for the asynchronous versions `aai""` and `aai!""`.
- Created a new default schema for Ollama models, `OllamaSchema` (replacing `OllamaManagedSchema`), which allows multi-turn conversations and conversations with images (eg, with Llava and Bakllava models). `OllamaManagedSchema` has been kept for compatibility and as an example of a schema where one provides the prompt as a string (not dictionaries like the OpenAI API).
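In practice, the multi-turn flow looks like this (each call hits the configured model, so an API key is required):

```julia
using PromptingTools

ai"What is the capital of France?"    # starts a new conversation
ai!"And how many people live there?"  # continues the previous conversation
ai"An unrelated question"             # plain ai"" starts a fresh conversation
```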
### Fixed
- Removed template `RAG/CreateQAFromContext` because it's a duplicate of `RAG/RAGCreateQAFromContext`
## PromptingTools v0.5.0
### Added
- Experimental sub-module RAGTools providing basic Retrieval-Augmented Generation functionality. See `?RAGTools` for more information. It's all nested inside `PromptingTools.Experimental.RAGTools` to signify that it might change in the future. Key functions are `build_index` and `airag`, but it also provides a suite to make evaluation easier (see `?build_qa_evals` and `?run_qa_evals`, or just see the example `examples/building_RAG.jl`)
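A minimal end-to-end sketch (the file paths are placeholders; see `examples/building_RAG.jl` for the full workflow):

```julia
using PromptingTools
using PromptingTools.Experimental.RAGTools

# Placeholder source files for the index
index = build_index(["docs/page1.txt", "docs/page2.txt"])
answer = airag(index; question = "What does the package do?")
```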
### Fixed
- Stricter code parsing in `AICode` to avoid false positives (code blocks must end with `` ```\n `` to catch comments inside text)
- Introduced an option `skip_invalid=true` for `AICode`, which allows you to include only the code blocks that parse successfully (useful when the code definition is good, but the subsequent examples are not), and an option `capture_stdout=false` to avoid capturing stdout if you want to evaluate `AICode` in parallel (the `Pipe()` that we use is NOT thread-safe)
- `OllamaManagedSchema` was passing an incorrect model name to the Ollama server, often serving the default llama2 model instead of the requested model. This is now fixed.
- Fixed a bug in `model` kwarg handling when leveraging `PT.MODEL_REGISTRY`
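A hedged sketch of the new `AICode` options (`msg` stands for an `AIMessage` returned by `aigenerate` and assumed to contain one or more Julia code blocks):

```julia
using PromptingTools
const PT = PromptingTools

# msg is an AIMessage assumed to contain Julia code blocks
cb = PT.AICode(msg;
    skip_invalid = true,     # keep only code blocks that parse successfully
    capture_stdout = false)  # needed for thread-safe parallel evaluation
```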
### Commits
Merged pull requests:
- fix AICode parser (#31) (@svilupp)
- Make stdout capture optional (#32) (@svilupp)
- Fallback parser to expect newlines (#33) (@svilupp)
- Fix model kwarg in Ollama (#34) (@svilupp)
- Enable ollama tests (#35) (@svilupp)
- Add RAG Tools (#36) (@svilupp)
- Update docs (#37) (@svilupp)
- Fix params kwarg in run_qa_evals (#38) (@svilupp)