Opt class for positional argument handling #10508
Conversation
Force-pushed from 1454a52 to 39fa786
Force-pushed from fa3ff4f to be9cf7c
@slaren @ggerganov this is ready for review. The next PR after this will download a Hugging Face model if the model string starts with hf:// or huggingface:// (like RamaLama does), using the pre-existing Hugging Face downloader code in llama.cpp. One thing that could be better is the output from that code: huggingface-cli has a much nicer progress bar, etc. (Python kinda makes it easy). But one step at a time, I guess 😊
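As a rough sketch of that prefix handling (illustrative only; the helper names and the assumption that files resolve under Hugging Face's `/resolve/main/` raw-file layout are mine, not this PR's code):

```cpp
#include <string>

// Return the model spec with a matching scheme prefix stripped,
// or an empty string if the prefix does not match.
static std::string strip_prefix(const std::string & model, const std::string & prefix) {
    return model.rfind(prefix, 0) == 0 ? model.substr(prefix.size()) : "";
}

// Map "hf://owner/repo/file.gguf" to a direct huggingface.co URL.
// Hypothetical helper; assumes the "/resolve/<revision>/" URL layout.
static std::string resolve_model_url(const std::string & model) {
    for (const char * prefix : { "hf://", "huggingface://" }) {
        const std::string rest = strip_prefix(model, prefix);
        if (!rest.empty()) {
            const size_t slash = rest.rfind('/');
            if (slash == std::string::npos) {
                return "";  // expected owner/repo/file.gguf
            }
            return "https://huggingface.co/" + rest.substr(0, slash) +
                   "/resolve/main/" + rest.substr(slash + 1);
        }
    }
    return model;  // local path or plain URL: pass through unchanged
}
```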
Force-pushed from be9cf7c to de3784b
Force-pushed from a4bbad4 to 22d31da
This is good for re-review @slaren. I can't figure out how to call this kind of code correctly, so I left it out for now.
Force-pushed from 982cb52 to 2cb740a
@slaren @ggerganov On merge of this, one can start a chatbot via:

llama-run llama3

Examples don't really get much simpler than this, from a user perspective at least...
Force-pushed from 8bdb9fd to 3e16ec1
Pretty cool. I would also appreciate some documentation about where the model files are cached/stored; it's not very clear at the moment.
It's basically like a `curl -O` or a wget: it just downloads the file into the current directory as modelname.partial, and when the download completes it is renamed to just modelname (that helps identify whether something is fully downloaded or not). It would be nice to have a full model store like RamaLama has, but maybe that's overkill for now. I'll try and articulate that as best I can in the usage help, etc.
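In sketch form, the `.partial` flow described above is just the following (a minimal illustration, not the PR's exact code):

```cpp
#include <cstdio>
#include <string>

// After the transfer finishes, promote "<target>.partial" to "<target>".
// A bare ".partial" file left behind signals an incomplete download.
static bool finalize_download(const std::string & target, bool download_ok) {
    const std::string partial = target + ".partial";
    if (!download_ok) {
        return false;  // keep the .partial file as evidence of failure
    }
    // the rename marks the file as fully downloaded
    return std::rename(partial.c_str(), target.c_str()) == 0;
}
```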
This is practically the same code with minor differences: https://github.com/ericcurtin/lm-pull. But the one in this PR, integrated with llama.cpp, is much more useful: it actually runs the models 😄
Force-pushed from 642bd57 to 9d6debe
This is ready for re-review @slaren.
Force-pushed from af36f34 to f05377a
Will probably show this tool at FOSDEM; I think its simplicity will appeal to people.
The model fetching logic with libcurl is nice and should be promoted to libcommon and used everywhere we specify model filenames.
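For reference, the libcurl piece being discussed has roughly this shape when reduced to a minimal, self-contained sketch (function names are assumptions, not the PR's code):

```cpp
#include <cstdio>
#include <curl/curl.h>

// Stream the response body straight into the output file.
static size_t write_cb(void * data, size_t size, size_t nmemb, void * user) {
    return std::fwrite(data, size, nmemb, static_cast<FILE *>(user));
}

static bool download_file(const char * url, const char * out_path) {
    CURL * curl = curl_easy_init();
    if (!curl) {
        return false;
    }
    FILE * out = std::fopen(out_path, "wb");
    if (!out) {
        curl_easy_cleanup(curl);
        return false;
    }
    curl_easy_setopt(curl, CURLOPT_URL, url);
    curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_cb);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, out);
    // fail on HTTP errors instead of writing the error page to disk
    curl_easy_setopt(curl, CURLOPT_FAILONERROR, 1L);
    const CURLcode res = curl_easy_perform(curl);
    std::fclose(out);
    curl_easy_cleanup(curl);
    return res == CURLE_OK;
}
```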
Force-pushed from d0eed57 to e5d949e
I made some further changes to the progress bar logic to eliminate flickering (essentially, use one print call per progress bar update), and I added some further progress bar info, so it now looks like this:

[progress bar screenshot]
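The single-write update mentioned above can be sketched like this (illustrative; the PR's actual formatting and field widths differ):

```cpp
#include <cstdio>
#include <string>

// Build the whole progress line in one buffer and emit it with a single
// write, so the terminal never shows a half-drawn bar (the cause of the
// flickering mentioned above).
static void print_progress(size_t downloaded, size_t total) {
    const int width = 30;
    int filled = total ? static_cast<int>(width * downloaded / total) : 0;
    if (filled > width) {
        filled = width;
    }
    std::string line = "\r[";
    line.append(filled, '#');
    line.append(width - filled, ' ');
    line += "] " + std::to_string(total ? 100 * downloaded / total : 0) + "%";
    std::fputs(line.c_str(), stdout);  // one print call per update
    std::fflush(stdout);
}
```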
I think this should be good for merge now.
Force-pushed from e5d949e to e77c3f1
Added support for positional arguments `model` and `prompt`. Added functionality to download via strings like:

llama-run llama3
llama-run ollama://granite-code
llama-run ollama://granite-code:8b
llama-run hf://QuantFactory/SmolLM-135M-GGUF/SmolLM-135M.Q2_K.gguf
llama-run huggingface://bartowski/SmolLM-1.7B-Instruct-v0.2-GGUF/SmolLM-1.7B-Instruct-v0.2-IQ3_M.gguf
llama-run https://example.com/some-file1.gguf
llama-run some-file2.gguf
llama-run file://some-file3.gguf

Signed-off-by: Eric Curtin <[email protected]>
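For context, a positional-argument parser of the shape the PR title describes might look roughly like this (a hypothetical sketch; the real `Opt` class's fields, flags, and error handling in the PR may differ):

```cpp
#include <string>
#include <vector>

struct Opt {
    int         ngl = -1;  // example flag
    std::string model;     // first positional argument
    std::string prompt;    // second positional argument

    // Consume flags first; any bare token becomes a positional argument.
    bool parse(int argc, const char ** argv) {
        std::vector<std::string> positional;
        for (int i = 1; i < argc; ++i) {
            const std::string arg = argv[i];
            if (arg == "-ngl" && i + 1 < argc) {
                ngl = std::stoi(argv[++i]);
            } else if (!arg.empty() && arg[0] == '-') {
                return false;               // unknown flag
            } else {
                positional.push_back(arg);  // bare token
            }
        }
        if (positional.empty()) {
            return false;  // model is required
        }
        model = positional[0];
        if (positional.size() > 1) {
            prompt = positional[1];
        }
        return true;
    }
};
```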
Force-pushed from e77c3f1 to 4710c27
Some things to improve:
$ build/bin/llama-run -ngl 100 llama3
curl_easy_perform() failed: HTTP response code said error
terminate called after throwing an instance of 'nlohmann::json_abi_v3_11_3::detail::parse_error'
what(): [json.exception.parse_error.101] parse error at line 1, column 1: attempting to parse an empty input; check that your input string or stream contains the expected JSON
fish: Job 1, 'build/bin/llama-run -ngl 100 ll…' terminated by signal SIGABRT (Abort)
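A guard of roughly this shape would avoid the abort shown in the log: check the transfer result and the response body before parsing, and use nlohmann::json's non-throwing parse mode. This is a minimal sketch under assumed surrounding names, not the PR's code:

```cpp
#include <cstdio>
#include <string>
#include <curl/curl.h>
#include <nlohmann/json.hpp>

// Parse an HTTP response body only if the transfer succeeded and the
// body is non-empty; never throw on malformed input.
static bool parse_response(CURLcode res, const std::string & body, nlohmann::json & out) {
    if (res != CURLE_OK || body.empty()) {
        std::fprintf(stderr, "request failed, no JSON to parse\n");
        return false;
    }
    // exceptions disabled: parse() returns a discarded value on error
    out = nlohmann::json::parse(body, nullptr, /*allow_exceptions=*/false);
    return !out.is_discarded();
}
```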