
Opt class for positional argument handling #10508

Merged
merged 1 commit into ggerganov:master on Dec 13, 2024

Conversation

@ericcurtin (Contributor) commented Nov 26, 2024

Opt class for positional argument handling
Added support for positional arguments model and prompt. Added
functionality to download via strings like:

llama-run llama3
llama-run ollama://granite-code
llama-run ollama://granite-code:8b
llama-run hf://QuantFactory/SmolLM-135M-GGUF/SmolLM-135M.Q2_K.gguf
llama-run huggingface://bartowski/SmolLM-1.7B-Instruct-v0.2-GGUF/SmolLM-1.7B-Instruct-v0.2-IQ3_M.gguf
llama-run https://example.com/some-file1.gguf
llama-run some-file2.gguf
llama-run file://some-file3.gguf
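
For illustration, here is a minimal sketch of the positional-argument handling described above, assuming a standalone Opt-style struct; the names and structure are hypothetical, not the PR's actual implementation:

    #include <cstdio>
    #include <string>
    #include <vector>

    // Hypothetical sketch: collect non-flag arguments in order, treating the
    // first as the model and the second (optional) as the prompt.
    struct Opt {
        std::string model;
        std::string prompt;

        int parse(int argc, const char ** argv) {
            std::vector<std::string> positional;
            for (int i = 1; i < argc; ++i) {
                const std::string arg = argv[i];
                if (!arg.empty() && arg[0] == '-') {
                    continue;  // options such as -c or -ngl would be handled here
                }
                positional.push_back(arg);
            }
            if (positional.empty()) {
                fprintf(stderr, "usage: llama-run MODEL [PROMPT]\n");
                return 1;
            }
            model = positional[0];
            if (positional.size() > 1) {
                prompt = positional[1];
            }
            return 0;
        }
    };

With something like this, llama-run llama3 "tell me a joke" would set model to llama3 and prompt to the quoted string.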

@ericcurtin force-pushed the new-style-run branch 8 times, most recently from 1454a52 to 39fa786 on November 26, 2024, 23:58
@ericcurtin force-pushed the new-style-run branch 2 times, most recently from fa3ff4f to be9cf7c on November 27, 2024, 15:06
@ericcurtin (Contributor Author)

@slaren @ggerganov this is ready for review. The next PR after this will download a Hugging Face model if the model string starts with hf:// or huggingface:// (like RamaLama does), using the pre-existing Hugging Face downloader code in llama.cpp.

One thing that could be better is the output from that code. huggingface-cli has a much nicer progress bar, etc. (Python kinda makes it easy). But one step at a time I guess 😊

Review comment on examples/main/main.cpp (outdated, resolved)
@ericcurtin force-pushed the new-style-run branch 4 times, most recently from a4bbad4 to 22d31da on December 9, 2024, 12:43
@ericcurtin (Contributor Author) commented Dec 9, 2024

This is good for re-review @slaren. I can't figure out how to call this kind of code correctly:

    int huggingface_dl(const std::string & model_, const struct llama_model_params & params) {
        // Find the second occurrence of '/' after protocol string
        size_t pos = model_.find('/');
        pos        = model_.find('/', pos + 1);
        if (pos == std::string::npos) {
            return 1;
        }

        const std::string hfr = model_.substr(0, pos);
        const std::string hff = model_.substr(pos + 1);
        common_load_model_from_hf(hfr, hff, "", "", params);

        return 0;
    }

    int resolve_model(std::string & model_, const struct llama_model_params & params) {
        if (starts_with(model_, "hf://") || starts_with(model_, "huggingface://")) {
            remove_proto(model_);
            huggingface_dl(model_, params);
        } else if (starts_with(model_, "https://")) {
            common_load_model_from_url(model_, "", "", params);
        } else if (starts_with(model_, "file://")) {
            remove_proto(model_);
        }

        // Also implement ollama://, if file doesn't exist, assume ollama str

        return 0;
    }

so I left it out for now.
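
For context, a minimal sketch of what the starts_with and remove_proto helpers used above could look like; these exact implementations are an assumption, not necessarily the code in the PR:

    #include <string>

    // True if str begins with prefix.
    static bool starts_with(const std::string & str, const std::string & prefix) {
        return str.rfind(prefix, 0) == 0;
    }

    // Strip the leading "proto://" part, e.g. "hf://repo/file.gguf" -> "repo/file.gguf".
    static void remove_proto(std::string & model_) {
        const std::string::size_type pos = model_.find("://");
        if (pos != std::string::npos) {
            model_ = model_.substr(pos + 3);
        }
    }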

@ericcurtin force-pushed the new-style-run branch 2 times, most recently from 982cb52 to 2cb740a on December 10, 2024, 12:42
@ericcurtin (Contributor Author) commented Dec 10, 2024

@slaren @ggerganov once this is merged, one can start a chatbot via:

$ llama-run smollm
>

Examples don't really get much simpler than this, from a user perspective at least...

@ericcurtin force-pushed the new-style-run branch 6 times, most recently from 8bdb9fd to 3e16ec1 on December 10, 2024, 15:44
@slaren (Collaborator) left a comment

Pretty cool. I would also appreciate some documentation about where the model files are cached/stored; it's not very clear at the moment.

Review comments on common/common.h, examples/run/CMakeLists.txt, examples/run/run.cpp, and README.md (resolved)
@ericcurtin (Contributor Author) commented Dec 11, 2024

Pretty cool. I would also appreciate some documentation about where the model files are cached/stored; it's not very clear at the moment.

It's basically like a "curl -O" or a wget: it just downloads the file into the current directory as modelname.partial, and when the download is complete it's renamed to just modelname (that helps identify whether something is fully downloaded or not). It would be nice to have a full model store like RamaLama has, but maybe that's overkill for now.

I'll try and articulate that as best I can in the usage help, etc.
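
To make the ".partial then rename" pattern above concrete, here is a rough sketch using plain libcurl; the function names and details are illustrative, not the code in this PR:

    #include <curl/curl.h>
    #include <cstdio>
    #include <string>

    // Sketch: download to "<name>.partial" and only rename to "<name>" on success,
    // so a bare filename on disk always means a fully downloaded file.
    static size_t write_cb(void * data, size_t size, size_t nmemb, void * user) {
        return fwrite(data, size, nmemb, static_cast<FILE *>(user));
    }

    static int pull(const std::string & url, const std::string & name) {
        const std::string partial = name + ".partial";
        FILE * file = fopen(partial.c_str(), "wb");
        if (!file) {
            return 1;
        }
        CURL * curl = curl_easy_init();
        if (!curl) {
            fclose(file);
            return 1;
        }
        curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_cb);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, file);
        const CURLcode res = curl_easy_perform(curl);
        curl_easy_cleanup(curl);
        fclose(file);
        if (res != CURLE_OK) {
            return 1;
        }
        return std::rename(partial.c_str(), name.c_str()) == 0 ? 0 : 1;
    }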

@ericcurtin (Contributor Author) commented Dec 11, 2024

This is practically the same code with minor differences:

https://github.com/ericcurtin/lm-pull

But the one in this PR, integrated with llama.cpp, is much more useful; it actually runs the models 😄

@ericcurtin force-pushed the new-style-run branch 4 times, most recently from 642bd57 to 9d6debe on December 12, 2024, 00:24
@ericcurtin (Contributor Author)

This is ready for re-review @slaren

@ericcurtin force-pushed the new-style-run branch 4 times, most recently from af36f34 to f05377a on December 12, 2024, 11:31
@ericcurtin (Contributor Author)

Will probably show this tool at FOSDEM; I think the simplicity of it will appeal to people.

@ggerganov (Owner) left a comment

The model fetching logic with libcurl is nice and should be promoted to libcommon and used everywhere we specify model filenames.

@ericcurtin force-pushed the new-style-run branch 4 times, most recently from d0eed57 to e5d949e on December 13, 2024, 16:08
@ericcurtin (Contributor Author)

I made some further changes to the progress bar logic to eliminate flickering (essentially just one print call per progress bar update) and added some further progress bar info, so it now looks like this:

$ llama-run smollm:135m
13% |██                  | 12.12 MB/87.48 MB  3.07 MB/s  24s
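
Roughly, the single-print-per-update idea works like this: build the whole line in one string and emit it with a leading carriage return so the terminal line is rewritten in place. A sketch (the exact formatting in the PR differs):

    #include <cstdio>

    // One printf per refresh: "\r" rewrites the same terminal line in place,
    // which avoids the flicker caused by printing the bar in several pieces.
    static void print_progress(double now_mb, double total_mb, double speed_mbs, int eta_s) {
        const int width   = 20;
        const int percent = total_mb > 0 ? static_cast<int>(now_mb * 100 / total_mb) : 0;
        const int filled  = percent * width / 100;
        char bar[width + 1];
        for (int i = 0; i < width; ++i) {
            bar[i] = i < filled ? '#' : ' ';
        }
        bar[width] = '\0';
        printf("\r%3d%% |%s| %.2f MB/%.2f MB  %.2f MB/s  %ds", percent, bar, now_mb, total_mb, speed_mbs, eta_s);
        fflush(stdout);
    }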

@ericcurtin (Contributor Author)

I think this should be good for merge now

@slaren (Collaborator) commented Dec 13, 2024

Some things to improve:

  • The command line parser probably shouldn't ignore parameters that start with - (a sketch of stricter handling follows this list), for example:
$ build/bin/llama-run -ngl 100 llama3
curl_easy_perform() failed: HTTP response code said error
terminate called after throwing an instance of 'nlohmann::json_abi_v3_11_3::detail::parse_error'
  what():  [json.exception.parse_error.101] parse error at line 1, column 1: attempting to parse an empty input; check that your input string or stream contains the expected JSON
fish: Job 1, 'build/bin/llama-run -ngl 100 ll…' terminated by signal SIGABRT (Abort)
  • The exceptions and error responses from the server could be handled more gracefully
  • The file storage will probably need some kind of managed cache rather than just storing a file without extension to the current directory
  • I couldn't build with MSVC; I think that, as it is, curl is not directly available to llama-run, since it is private to common. It might be necessary to re-add find_package(CURL) to the llama-run CMakeLists.txt.
llama.cpp\examples\run\run.cpp(8): fatal error C1083: Cannot open include file: 'curl/curl.h': No such file or directory
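
A sketch of the stricter flag handling suggested in the first point of the list above (hypothetical code, not the parser in this PR): reject anything that starts with '-' and is not a recognised option instead of silently skipping it.

    #include <cstdio>
    #include <cstring>
    #include <string>

    // Hypothetical: unknown options abort parsing with an error instead of being
    // ignored, so "llama-run -ngl 100 llama3" fails fast if -ngl is not supported.
    static int parse_args(int argc, const char ** argv, std::string & model) {
        for (int i = 1; i < argc; ++i) {
            if (strcmp(argv[i], "--help") == 0) {
                printf("usage: llama-run MODEL [PROMPT]\n");
                return 0;
            }
            if (argv[i][0] == '-') {
                fprintf(stderr, "error: unknown option '%s'\n", argv[i]);
                return 1;
            }
            if (model.empty()) {
                model = argv[i];  // first positional argument is the model
            }
        }
        return 0;
    }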

@slaren merged commit c27ac67 into ggerganov:master on Dec 13, 2024
47 checks passed
@ericcurtin deleted the new-style-run branch on December 13, 2024, 21:14
netrunnereve pushed a commit to netrunnereve/llama.cpp that referenced this pull request Dec 16, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Dec 20, 2024