adding support for baidu qianfan and Ernie #823
Thank you for contributing! Work is currently under way (by @Harsha-Nori in #820) to remove the chat/completion distinction. You might want to check that out.
Sure. May I ask whether this PR will be merged once the related changes are done, or do I have to wait for #820 to be ready?
@riedgar-ms Hi, PTAL now that I have refactored this along the lines of #820.
    timeout=0.5,
    compute_log_probs=False,
    is_chat_model=True,
    **kwargs,
I'm guessing that credentials go into the `**kwargs`?
Yes, but normally they are passed through environment variables or a `.env` file.
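For readers following the credentials exchange above, here is a minimal sketch of how such a lookup could work: explicit keyword arguments take precedence, and the environment is the fallback. It is not the code from this PR. The function `resolve_credentials`, the keyword names `access_key` and `secret_key`, and the two environment variable names are all assumptions chosen for illustration.

```python
import os

# Assumed variable names, for illustration only; check the SDK docs for the real ones.
ACCESS_KEY_VAR = "QIANFAN_ACCESS_KEY"
SECRET_KEY_VAR = "QIANFAN_SECRET_KEY"


def resolve_credentials(env=None, **kwargs):
    """Return (access_key, secret_key), preferring explicit kwargs over the environment.

    `env` defaults to os.environ; passing a plain dict makes the function easy to test.
    """
    env = os.environ if env is None else env
    access_key = kwargs.get("access_key") or env.get(ACCESS_KEY_VAR)
    secret_key = kwargs.get("secret_key") or env.get(SECRET_KEY_VAR)
    if not access_key or not secret_key:
        raise ValueError(
            f"Missing credentials: pass access_key/secret_key or set "
            f"{ACCESS_KEY_VAR} and {SECRET_KEY_VAR}"
        )
    return access_key, secret_key


# Usage with a stand-in environment, so nothing depends on the real one.
demo_env = {ACCESS_KEY_VAR: "ak-from-env", SECRET_KEY_VAR: "sk-from-env"}
print(resolve_credentials(env=demo_env))
print(resolve_credentials(env=demo_env, access_key="ak-from-kwargs"))
```

With a `.env` file, a loader would typically copy the file's entries into the process environment before a lookup like this runs, so the same fallback path applies.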
Thank you for updating! @Harsha-Nori what do you think? It's nice to have another model, but right now this is another model which we can't test regularly.
Codecov Report
Coverage Diff
            main      #823      +/-
Coverage    56.45%    60.10%    +3.64%
Files       63        64        +1
Lines       4793      4878      +85
Hits        2706      2932      +226
Misses      2087      1946      -141
I think ultimately we should strive to include support for as many models as we can, but we should find some way to document which ones we're able to regularly test and maintain ourselves. We can of course welcome Issues/PRs on the ones we can't regularly test ourselves.
Thanks for the contribution @Dobiichi-Origami! This all looks fine to me, with the caveat that I can't test it myself at the moment :).
Fair enough @Harsha-Nori. @Dobiichi-Origami, to fix the mypy error, see:
Oh, maybe I can add a unit test later :)
Alright, I will fix that immediately.
@riedgar-ms @Harsha-Nori PTAL :)
With the caveat that I have no actual means of testing this, it looks OK to me.
Thanks. Is there anyone who can merge this PR? @Harsha-Nori
@Dobiichi-Origami, you might want to make sure this still works after my merge of #894. Otherwise, I think @Harsha-Nori will be happy for me to merge it.
Calling `guidance.json` with an empty schema generates arbitrary JSON. This closes guidance-ai#887. To quote @wjn0, there are several motivations for this:
- APIs such as OpenAI allow users to request only valid JSON be generated sans schema, so in some sense this would give feature parity for local LLMs.
- Large JSON schemas often include "arbitrary" properties, e.g. properties that are allowed to be any valid JSON value: https://json-schema.org/understanding-json-schema/basics#hello-world!
The latest release of llama-cpp-python (0.2.79) is causing issues with one of our tests (on Windows). The test in question is `test_repeat_calls`, which assumes that at T=0 (the default for `gen()`) we can repeatedly call a LlamaCpp model and get the same result. This isn't happening, although when stepping through the test with a debugger I don't see anything untoward (for example, I'm not seeing a pile-up of previous prompts). For now, exclude the latest llama-cpp-python version, but we may want to revisit this test if the problem persists.
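The assumption behind that test can be sketched as a small repeatability check. This is not the actual `test_repeat_calls` from the repository: `assert_repeatable` and `fake_greedy_model` are hypothetical names, and the stand-in model only mimics a generator that is deterministic at T=0.

```python
from typing import Callable


def assert_repeatable(generate: Callable[[str], str], prompt: str, calls: int = 3) -> str:
    """Call `generate` several times with the same prompt and require identical output."""
    first = generate(prompt)
    for attempt in range(2, calls + 1):
        again = generate(prompt)
        if again != first:
            raise AssertionError(
                f"call {attempt} returned {again!r}, but call 1 returned {first!r}"
            )
    return first


def fake_greedy_model(prompt: str) -> str:
    """Hypothetical stand-in for a model sampled at temperature 0: a pure function of the prompt."""
    return prompt.upper()


# A deterministic generator passes the check and returns the shared output.
result = assert_repeatable(fake_greedy_model, "hello")
print(result)
```

If the exclusion is expressed as a dependency constraint, a specifier such as `llama-cpp-python!=0.2.79` is one way to write it, though the exact form used in the repository is not shown here.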
It appears that the Mistral chat template has [had an update](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2/commit/1296dc8fd9b21e6424c9c305c06db9ae60c03ace), so we need to match it.
@Dobiichi-Origami can you confirm that this functionality is still working after #894? If so, I'd be happy to merge.
I added support for the Ernie series of models to Guidance, so users can now use Ernie models from Baidu.