
llama : add llama_model_load_from_splits #11255

Open: wants to merge 2 commits into master
Conversation

ngxson (Collaborator) commented on Jan 15, 2025

Some downstream programs may want to use non-conventional file names; for example, ollama uses SHA256 hashes as file names. This makes adding support for multi-split GGUF models tricky.

This PR adds a new API, llama_model_load_from_splits, that allows the user to manually specify a list of GGUF files:

    // Load the model from multiple splits (support custom naming scheme)
    // The paths must be in the correct order
    LLAMA_API struct llama_model * llama_model_load_from_splits(
                             const char ** paths,
                                 size_t    n_paths,
              struct llama_model_params    params);
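
For illustration, a minimal caller could look like the following sketch. The file names are placeholders (e.g. ollama-style SHA256 blob names), and llama_model_default_params() / llama_model_free() are the existing llama.h helpers:

    #include "llama.h"

    int main(void) {
        // placeholder file names - any naming scheme works,
        // as long as the splits are listed in the correct order
        const char * paths[] = {
            "sha256-aaaa", // split 1 of 2
            "sha256-bbbb", // split 2 of 2
        };

        struct llama_model_params params = llama_model_default_params();
        struct llama_model * model = llama_model_load_from_splits(paths, 2, params);
        if (model == NULL) {
            return 1; // loading failed
        }

        // ... use the model ...

        llama_model_free(model);
        return 0;
    }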

ngxson requested a review from ggerganov on January 15, 2025
Review thread on src/llama.cpp, comment on lines 169 to 171 (resolved):
    // return a list of splits for a given path
    // for example, given "<name>-00002-of-00004.gguf", returns list of all 4 splits
    std::vector<std::string> llama_get_list_splits(const std::string & path, const int n_split);
Owner:

This can be a static function in the source file only; no need to add it to the header.

There is also an existing llama_split_ prefix which seems suitable for this function: llama_split_get_list()
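
For context, here is an illustrative standalone sketch of what such a static helper could look like, built on the existing public helpers llama_split_prefix() and llama_split_path() from llama.h; this is not necessarily the PR's actual implementation:

    #include "llama.h"

    #include <string>
    #include <vector>

    // illustrative sketch: reconstruct the paths of all splits from a single
    // split path such as "<name>-00002-of-00004.gguf"
    static std::vector<std::string> llama_split_get_list(const std::string & path, const int n_split) {
        std::vector<std::string> paths;
        char prefix[1024]     = {0};
        char split_path[1024] = {0};

        // the split index of `path` is unknown, so probe each index until
        // llama_split_prefix() accepts the name and extracts "<name>"
        int prefix_len = 0;
        for (int idx = 0; idx < n_split && prefix_len == 0; idx++) {
            prefix_len = llama_split_prefix(prefix, sizeof(prefix), path.c_str(), idx, n_split);
        }
        if (prefix_len == 0) {
            return paths; // path does not follow the "-%05d-of-%05d.gguf" scheme
        }

        // rebuild the full path of every split in order
        for (int idx = 0; idx < n_split; idx++) {
            llama_split_path(split_path, sizeof(split_path), prefix, idx, n_split);
            paths.push_back(split_path);
        }
        return paths;
    }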

ngxson (Collaborator, Author):

Ah yeah, I wanted to use this in llama.cpp but decided not to in the end, and forgot to delete it from the header file.

It should be fixed with 49822ba
