Support for function calling and media #173
Replies: 6 comments 3 replies
-
@creatorsagi thanks for posting this thread! I think 1. makes a lot of sense; we need to pose the problem in a way that we can then implement a solution for it. It would certainly be very useful to automatically generate a snippet of PDL code from LiteLLM docs. We will get back to you regarding 2 and 3.
-
@creatorsagi that's already possible. Anything you pass through `parameters` gets passed to LiteLLM. So for example, for function calling, one can pass something like:
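(The actual snippet did not survive this copy of the thread; the fragment below is a sketch, with the `get_current_weather` tool schema borrowed from the LiteLLM function-calling docs. `tools` and `tool_choice` are the standard OpenAI parameters, which PDL would forward verbatim.)

```yaml
parameters:
  # extra keys under `parameters` are handed straight to litellm.completion
  tools:
  - type: function
    function:
      name: get_current_weather        # illustrative tool from the LiteLLM docs
      description: Get the current weather in a given location
      parameters:
        type: object
        properties:
          location:
            type: string
            description: The city and state, e.g. Boston, MA
        required: [location]
  tool_choice: auto
```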
-
I am trying to make the function_calling examples work as well! Trying to figure out whether it's an issue with the way we are calling it, with the PDL interpreter, or a LiteLLM issue...
-
@creatorsagi Please see here for an update on function calling.
-
@creatorsagi Here is a complete example of function calling that works with the latest main.
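(The example itself was lost in this copy of the thread. Below is a hedged reconstruction of what such a program can look like; the model name and the `get_current_weather` schema are illustrative, taken from the LiteLLM docs rather than from the original comment.)

```yaml
description: Function calling example
text:
- model: openai/gpt-4o
  input:
    array:
    - role: user
      content: What is the weather like in Boston today?
  parameters:
    tools:
    - type: function
      function:
        name: get_current_weather      # illustrative tool definition
        description: Get the current weather in a given location
        parameters:
          type: object
          properties:
            location:
              type: string
              description: The city and state, e.g. Boston, MA
          required: [location]
    tool_choice: auto
```

With a program along these lines, the model's reply comes back as a tool call (the name and JSON arguments for `get_current_weather`) that the surrounding program can then execute.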
-
@creatorsagi Here is an example of using an audio model:
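(This snippet is likewise missing from this copy; the sketch below follows the LiteLLM audio docs, passing `modalities` and `audio` through `parameters` to an audio-capable model. The model name, prompt, and voice settings are taken from those docs, not from the original comment.)

```yaml
description: Audio model example
text:
- model: openai/gpt-4o-audio-preview   # audio-capable model from the LiteLLM docs
  input: Is a golden retriever a good family dog?
  parameters:
    modalities: [text, audio]          # request both a transcript and audio
    audio:
      voice: alloy
      format: wav
```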
-
First of all, I think PDL is a beautifully designed framework with tremendous potential. Congratulations.
I've been working out scripts to exercise various capabilities of LLMs. It's a bit of trial and error because there aren't examples or documentation for many of these capabilities, so I look at the LiteLLM call and work out how to write PDL that will invoke the right call. So, I have the following questions:
1. If LiteLLM supports it, can the PDL for that be generated? LiteLLM moves fast to support new capabilities; if these can be easily leveraged in PDL programs it would help enormously. Any HowTo guidance will be highly appreciated.
2. Consider function calling, for example with OpenAI. How can this be done in PDL? See the LiteLLM doc here: https://docs.litellm.ai/docs/completion/function_call
3. Is there any way to leverage audio models like these: https://docs.litellm.ai/docs/completion/audio
It would be awesome if you could help with a tutorial/examples of how to pass these through. I can also write patching code, so if I need to do some coding or contribute to the repo, I'm happy to do that too.
BTW, I was trying to figure out how to pass image URLs to gpt-4o earlier, and while it took some testing, it turned out to be simple in the end. Debugging even small errors is hard at this point because the error logs aren't informative, but as I progress up the learning curve it should get easier:
This is a simple example of a vision model that can describe images. Tested to work as of 11/6/2024.
```yaml
description: Vision model example
text:
- model: gpt-4o
  input:
    array:
    - role: system
      content: You are a helpful assistant that is fluent in English and can describe images.
    - role: user
      content:
        [
          {"type": "text", "text": "What are all these images? Describe each one."},
          {"type": "image_url",
           "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"}},
          {"type": "image_url",
           "image_url": {"url": "https://replicate.delivery/mgxm/f4e50a7b-e8ca-432f-8e68-082034ebcc70/demo.jpg"}}
        ]
  parameters:
    max_tokens: 1000
    temperature: 0.7
```