cortex.onnx

cortex.onnx is a high-efficiency C++ inference engine for edge computing focusing on Windows platform using DirectML for GPU acceleration.

It is a dynamic library that can be loaded by any server at runtime.

Repo Structure

.
├── base -> Engine interface
├── examples -> Server example to integrate engine
├── onnxruntime-genai -> Upstream onnxruntime-genai
├── src -> Engine implementation
├── third-party -> Dependencies of the cortex.onnx project

Build from source

This guide provides step-by-step instructions for building cortex.onnx from source on Windows systems.

Clone the Repository

First, you need to clone the cortex.onnx repository:

git clone --recurse https://github.com/janhq/cortex.onnx.git

If you don't have git, you can download the source code as a file archive from cortex.onnx GitHub.

Build library with server example

On Windows Install CMake and MsBuild

# Build dependencies
./build_cortex_onnx.bat

# Build engine
mkdir build
cd build
cmake ..
cmake --build . --config Release -j4

# Build server example (from root repository)
mkdir -p examples/server/build
cd examples/server/build
cmake ..
cmake --build . --config Release -j4

Quickstart

Step 1: Downloading a Model

Clone a model from https://huggingface.co/cortexhub, checkout to dml branch

Step 2: Start server

On Windows

cd examples/server/build/Release
mkdir -p engines\cortex.onnx\
cp ..\..\..\..\build\Release\engine.dll engines\cortex.onnx\
cp ..\..\..\..\onnxruntime-genai\build\Release\*.dll .\
server.exe

Step 3: Load model

curl http://localhost:3928/loadmodel \
  -H 'Content-Type: application/json' \
  -d '{
    "model_path": "./model/llama3",
    "model_alias": "llama3",
    "system_prompt": "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n",
    "user_prompt": "<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n",
    "ai_prompt": "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
  }'

Step 4: Making an Inference

curl http://localhost:3928/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Who won the world series in 2020?"
      }
    ],
    "model": "llama3"
  }'

Table of parameters

Parameter	Type	Description
`model_path`	String	The file path to the onnx model.

Name		Name	Last commit message	Last commit date
Latest commit History 28 Commits
.github		.github
base/cortex-common		base/cortex-common
examples/server		examples/server
onnxruntime-genai @ a7ca019		onnxruntime-genai @ a7ca019
src		src
third-party		third-party
.clang-format		.clang-format
.gitignore		.gitignore
.gitmodules		.gitmodules
CMakeLists.txt		CMakeLists.txt
Makefile		Makefile
README.md		README.md
build_cortex_onnx.bat		build_cortex_onnx.bat

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

cortex.onnx

Repo Structure

Build from source

Clone the Repository

Build library with server example

Quickstart

About

Releases 8

Packages

Contributors 4

Languages

janhq/cortex.onnx

Folders and files

Latest commit

History

Repository files navigation

cortex.onnx

Repo Structure

Build from source

Clone the Repository

Build library with server example

Quickstart

About

Resources

Stars

Watchers

Forks

Releases 8

Packages 0

Contributors 4

Languages

Packages