TurboPilot is a C++ program that uses the GGML project to parse and run language models.
To build TurboPilot you will need CMake, Boost (libboost), a C++ toolchain, and GNU Make.
On Ubuntu you can install these with:
sudo apt-get update
sudo apt-get install libboost-dev cmake build-essential
If you use Homebrew, you can install these dependencies by running:
brew install cmake boost
Make sure the ggml subproject is checked out by running git submodule init and then git submodule update.
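To confirm that the submodule really is checked out before configuring, one option is to inspect the output of git submodule status, which prefixes each line with - for an uninitialized submodule, + for a commit mismatch, and U for a merge conflict. The sketch below is illustrative only; submodule_state and check_submodules are hypothetical helper names, not part of TurboPilot.

```shell
# Classify one line of `git submodule status` output by its leading character.
# submodule_state is a hypothetical helper name, not part of TurboPilot.
submodule_state() {
  case "$1" in
    -*) echo "uninitialized" ;;   # submodule has not been initialized
    +*) echo "mismatch" ;;        # checked-out commit differs from the recorded one
    U*) echo "conflict" ;;        # submodule has merge conflicts
    *)  echo "ok" ;;              # leading space: checked out at the recorded commit
  esac
}

# Print the state of every submodule in the current repository.
check_submodules() {
  git submodule status | while IFS= read -r line; do
    printf '%s: %s\n' "$(submodule_state "$line")" "$line"
  done
}
```

Running check_submodules from the repository root should report ok for ggml once both submodule commands have completed. The two commands can also be combined as git submodule update --init.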
Configure cmake to build the project with the following:
mkdir build
cd build
cmake ..
If you are running on Linux, you can optionally compile a static build with cmake -D CMAKE_EXE_LINKER_FLAGS="-static" .. which should make your binary portable across different Linux distributions.
From here you can now build the components that make up TurboPilot by running:
make
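A plain make builds with a single job. If the build feels slow, a parallel build is a common option. The sketch below assumes getconf is available (it is on most Linux and macOS systems) and falls back to one job otherwise; cpu_count is a hypothetical helper name used only for illustration.

```shell
# Report the number of online CPU cores, falling back to 1 if it cannot be determined.
# cpu_count is a hypothetical helper name used only for this sketch.
cpu_count() {
  n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=""
  case "$n" in
    ''|*[!0-9]*) n=1 ;;   # empty or non-numeric answer: build with a single job
  esac
  echo "$n"
}

# Run from inside the build directory after cmake has configured the project:
#   make -j "$(cpu_count)"
```

Using one job per core is only a starting point; on machines with little memory a lower job count may be safer.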
BLAS libraries accelerate mathematical operations. You can use the OpenBLAS implementation with TurboPilot to make generation faster, particularly for longer prompts.
When you run cmake, you can additionally set -D GGML_OPENBLAS=On to enable BLAS support.
For example: cmake .. -D GGML_OPENBLAS=On
cuBLAS is the BLAS library provided by NVIDIA that runs linear algebra code on your GPU. This can speed up the application significantly, especially when working with long prompts.
You will need nvcc and the libcublas-dev dependencies as a bare minimum. Follow the installation guide from NVIDIA for more detailed instructions.
You will need to set -DGGML_CUBLAS=ON and also pass the path to your nvcc executable with -DCMAKE_CUDA_COMPILER=/path/to/nvcc.
Full example: cmake -DGGML_CUBLAS=ON -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc ..
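If you switch between backends often, the flags described above can be collected in a small helper. This is a minimal sketch, not part of TurboPilot: blas_flags is a hypothetical function name, and the default nvcc location of /usr/local/cuda/bin/nvcc is an assumption taken from the full example.

```shell
# Print the cmake flags for a chosen BLAS backend.
# blas_flags is a hypothetical helper; the flags themselves are the ones listed above.
blas_flags() {
  backend=$1
  nvcc_path=${2:-/usr/local/cuda/bin/nvcc}   # assumed default CUDA install location
  case "$backend" in
    none)     echo "" ;;
    openblas) echo "-D GGML_OPENBLAS=On" ;;
    cublas)
      echo "-DGGML_CUBLAS=ON -DCMAKE_CUDA_COMPILER=$nvcc_path" ;;
    *)
      echo "unknown backend: $backend" >&2
      return 1 ;;
  esac
}

# Run from inside the build directory (left unquoted so the flags split into words):
#   cmake $(blas_flags cublas) ..
```

Because the output is left unquoted when passed to cmake, this sketch does not handle an nvcc path that contains spaces.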