Update Docs
svilupp authored Dec 23, 2023
2 parents b4502c6 + ff4e7fc commit b7fd28c
Showing 8 changed files with 87 additions and 4,099 deletions.
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -6,13 +6,20 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added

### Fixed

## [0.5.0]

### Added
- Experimental sub-module `RAGTools` providing basic Retrieval-Augmented Generation functionality. See `?RAGTools` for more information. It is nested inside `PromptingTools.Experimental.RAGTools` to signal that it might change in the future. Key functions are `build_index` and `airag`, but it also provides a suite to make evaluation easier (see `?build_qa_evals` and `?run_qa_evals`, or see the example in `examples/building_RAG.jl`)

### Fixed
- Stricter code parsing in `AICode` to avoid false positives (code blocks must end with "```\n" to catch comments inside text)
- Introduced an option `skip_invalid=true` for `AICode`, which allows you to include only code blocks that parse successfully (useful when the code definition is good but the subsequent examples are not), and an option `capture_stdout=false` to avoid capturing stdout when you want to evaluate `AICode` in parallel (the `Pipe()` we use is NOT thread-safe)
- `OllamaManagedSchema` was passing an incorrect model name to the Ollama server, often serving the default llama2 model instead of the requested model. This is now fixed.
- Fixed a bug in the handling of the `model` kwarg when leveraging `PT.MODEL_REGISTRY`

## [0.4.0]

2 changes: 1 addition & 1 deletion Project.toml
@@ -1,7 +1,7 @@
name = "PromptingTools"
uuid = "670122d1-24a8-4d70-bfce-740807c42192"
authors = ["J S @svilupp and contributors"]
version = "0.5.0-DEV"
version = "0.5.0"

[deps]
Base64 = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f"
1 change: 1 addition & 0 deletions docs/make.jl
@@ -28,6 +28,7 @@ makedocs(;
"Various examples" => "examples/readme_examples.md",
"Using AITemplates" => "examples/working_with_aitemplates.md",
"Local models with Ollama.ai" => "examples/working_with_ollama.md",
"Custom APIs (Mistral, Llama.cpp)" => "examples/working_with_custom_apis.md",
"Building RAG Application" => "examples/building_RAG.md",
],
"F.A.Q." => "frequently_asked_questions.md",
2 changes: 1 addition & 1 deletion docs/src/examples/building_RAG.md
@@ -217,7 +217,7 @@ We're done for today!
- Add filtering for semantic similarity (embedding distance) to make sure we don't pick up irrelevant chunks in the context
- Use multiple indices or a hybrid index (add a simple BM25 lookup from TextAnalysis.jl)
- Data processing is the most important step - properly parsed and split text can work wonders
- Add re-ranking of context (see `rerank` function, you can use Cohere ReRank API)`)
- Add re-ranking of context (see `rerank` function, you can use Cohere ReRank API)
- Improve the question embedding (eg, rephrase it, generate hypothetical answers and use them to find better context)

... and much more! See some ideas in [Anyscale RAG tutorial](https://www.anyscale.com/blog/a-comprehensive-guide-for-building-rag-based-llm-applications-part-1)
69 changes: 69 additions & 0 deletions docs/src/examples/working_with_custom_apis.md
@@ -0,0 +1,69 @@
# Custom APIs

PromptingTools allows you to use any OpenAI-compatible API (eg, MistralAI), including a locally hosted one like the server from `llama.cpp`.

````julia
using PromptingTools
const PT = PromptingTools
````

## Using MistralAI

Mistral models have long been dominating the open-source space. They are now available via their API, so you can use them with PromptingTools.jl!

```julia
msg = aigenerate("Say hi!"; model="mistral-tiny")
# [ Info: Tokens: 114 @ Cost: $0.0 in 0.9 seconds
# AIMessage("Hello there! I'm here to help answer any questions you might have, or assist you with tasks to the best of my abilities. How can I be of service to you today? If you have a specific question, feel free to ask and I'll do my best to provide accurate and helpful information. If you're looking for general assistance, I can help you find resources or information on a variety of topics. Let me know how I can help.")
```

It all just works, because we have registered the models in the `PromptingTools.MODEL_REGISTRY`! There are currently 4 models available: `mistral-tiny`, `mistral-small`, `mistral-medium`, `mistral-embed`.
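
For embeddings, the `mistral-embed` model can be used with `aiembed`. A minimal sketch (assuming `aiembed` accepts the same `model` keyword as `aigenerate`; the exact return type may differ):

```julia
# Hedged sketch: request an embedding from the MistralAI API
msg = aiembed("Say hi!"; model="mistral-embed")
# The embedding vector should be available in `msg.content`
```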

Under the hood, we use a dedicated schema `MistralOpenAISchema` that leverages most of the OpenAI-specific code base, so you can always provide that explicitly as the first argument:

```julia
const PT = PromptingTools
msg = aigenerate(PT.MistralOpenAISchema(), "Say Hi!"; model="mistral-tiny", api_key=ENV["MISTRALAI_API_KEY"])
```
As you can see, the API key can be loaded either from your environment variables (`ENV`) or via the Preferences.jl mechanism (see `?PREFERENCES` for more information).
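
If you prefer not to pass the key explicitly, here is a rough sketch of both options (assuming `PT.set_preferences!` accepts `"MISTRALAI_API_KEY"` as a key; see `?PREFERENCES` to confirm the exact names):

```julia
# Option 1: set an environment variable before (or at the start of) your session
ENV["MISTRALAI_API_KEY"] = "<your-api-key>"

# Option 2: persist the key across sessions via the Preferences.jl mechanism
PT.set_preferences!("MISTRALAI_API_KEY" => "<your-api-key>")
```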

## Using other OpenAI-compatible APIs

MistralAI is not the only provider that mimics the OpenAI API!
There are many other exciting providers, eg, [Perplexity.ai](https://docs.perplexity.ai/), [Fireworks.ai](https://app.fireworks.ai/).

As long as they are compatible with the OpenAI API (eg, sending `messages` with `role` and `content` keys), you can use them with PromptingTools.jl by using `schema = CustomOpenAISchema()`:

```julia
# Set your API key and the necessary base URL for the API
api_key = "..."
provider_url = "..." # provider API URL
prompt = "Say hi!"
msg = aigenerate(PT.CustomOpenAISchema(), prompt; model="<some-model>", api_key, api_kwargs=(; url=provider_url))
```

> [!TIP]
> If you register the model names with `PT.register_model!`, you won't have to keep providing the `schema` manually.

Note: At the moment, we only support `aigenerate` and `aiembed` functions.
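
For illustration, here is a minimal sketch of such a registration (the alias and provider URL below are hypothetical; check `?PT.register_model!` for the exact keyword arguments):

```julia
# Register an alias for an OpenAI-compatible provider once...
PT.register_model!(;
    name = "my-custom-model",          # hypothetical alias of your choosing
    schema = PT.CustomOpenAISchema(),  # reuse the OpenAI-compatible code path
    description = "Model served by an OpenAI-compatible provider")

# ...then refer to it by name; the provider URL (and key) are still passed at call time
msg = aigenerate("Say hi!"; model = "my-custom-model", api_key = "...",
    api_kwargs = (; url = "https://api.example-provider.com/v1"))
```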

## Using llama.cpp server

In line with the above, you can also use the [`llama.cpp` server](https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md).

It is a bit more technically demanding because you need to "compile" `llama.cpp` first, but it will always have the latest models and it is quite fast (eg, faster than Ollama, which uses llama.cpp under the hood but has some extra overhead).

Start your server in a command line (`-m` refers to the model file, `-c` is the context length, `-ngl` is the number of layers to offload to GPU):

```bash
./server -m models/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf -c 2048 -ngl 99
```

Then simply access it via PromptingTools:

```julia
msg = aigenerate(PT.CustomOpenAISchema(), "Count to 5 and say hi!"; api_kwargs=(; url="http://localhost:8080/v1"))
```

> [!TIP]
> If you register the model names with `PT.register_model!`, you won't have to keep providing the `schema` manually. It can be any `model` name, because the model is actually selected when you start the server in the terminal.