Support for function calling and media #173
Replies: 6 comments 3 replies
-
@creatorsagi thanks for posting this thread! I think 1. makes a lot of sense; we need to pose the problem in a way that we can then implement a solution for it. It would certainly be very useful to automatically generate a snippet of PDL code from LiteLLM docs. We will get back to you regarding 2 and 3.
-
@creatorsagi that's already possible. Anything you pass through `parameters` gets passed to LiteLLM. So for example, for function calling, one can pass something like:
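(The actual snippet did not survive this copy of the thread; the fragment below is a sketch, with the `get_current_weather` tool schema borrowed from the LiteLLM function-calling docs. `tools` and `tool_choice` are the standard OpenAI parameters, which PDL would forward verbatim.)

```yaml
parameters:
  # extra keys under `parameters` are handed straight to litellm.completion
  tools:
  - type: function
    function:
      name: get_current_weather        # illustrative tool from the LiteLLM docs
      description: Get the current weather in a given location
      parameters:
        type: object
        properties:
          location:
            type: string
            description: The city and state, e.g. Boston, MA
        required: [location]
  tool_choice: auto
```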
-
I am trying to make the function_calling examples work as well! Trying to figure out whether it's an issue with the way we are calling it, with the PDL interpreter, or a LiteLLM issue...
-
@creatorsagi Please see here for an update on function calling.
-
@creatorsagi Here is a complete example of function calling that works with the latest main.
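(The example itself was lost in this copy of the thread. Below is a hedged reconstruction of what such a program can look like; the model name and the `get_current_weather` schema are illustrative, taken from the LiteLLM docs rather than from the original comment.)

```yaml
description: Function calling example
text:
- model: openai/gpt-4o
  input:
    array:
    - role: user
      content: What is the weather like in Boston today?
  parameters:
    tools:
    - type: function
      function:
        name: get_current_weather      # illustrative tool definition
        description: Get the current weather in a given location
        parameters:
          type: object
          properties:
            location:
              type: string
              description: The city and state, e.g. Boston, MA
          required: [location]
    tool_choice: auto
```

With a program along these lines, the model's reply comes back as a tool call (the name and JSON arguments for `get_current_weather`) that the surrounding program can then execute.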
-
@creatorsagi Here is an example of using an audio model:
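(This snippet is likewise missing from this copy; the sketch below follows the LiteLLM audio docs, passing `modalities` and `audio` through `parameters` to an audio-capable model. The model name, prompt, and voice settings are taken from those docs, not from the original comment.)

```yaml
description: Audio model example
text:
- model: openai/gpt-4o-audio-preview   # audio-capable model from the LiteLLM docs
  input: Is a golden retriever a good family dog?
  parameters:
    modalities: [text, audio]          # request both a transcript and audio
    audio:
      voice: alloy
      format: wav
```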
-
First of all, I think PDL is a beautifully designed framework with tremendous potential. Congratulations.
I've been working out scripts to exercise various capabilities of LLMs. It's a bit of trial and error because there aren't examples or documentation for many of these capabilities, so I look at the LiteLLM call and work out how to write PDL that will invoke the right call. So, I have the following questions:
1. If LiteLLM supports it, can the PDL for that be generated? LiteLLM moves fast to support new capabilities; if these can be easily leveraged in PDL programs it would help enormously. Any HowTo guidance will be highly appreciated.
2. Consider function calling, for example with OpenAI. How can this be done in PDL? See the LiteLLM doc here: https://docs.litellm.ai/docs/completion/function_call
3. Is there any way to leverage audio models like these: https://docs.litellm.ai/docs/completion/audio
It would be awesome if you could help with a tutorial/examples of how to pass these through. I can also write patching code, so if I need to do some coding or contribute to the repo, I'm happy to do that too.
BTW, I was trying to figure out how to pass image URLs to gpt-4o earlier, and while it took some testing, it turned out to be simple in the end. Debugging even small errors is hard at this point because the error logs aren't informative, but as I progress up the learning curve it should get easier:
This is a simple example of a vision model that can describe images. Tested to work as of 11/6/2024.
```yaml
description: Vision model example
text:
- model: gpt-4o
  input:
    array:
    - role: system
      content: You are a helpful assistant that is fluent in English and can describe images.
    - role: user
      content:
        [
          {"type": "text", "text": "What are all these images? Describe each one."},
          {"type": "image_url",
           "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"}},
          {"type": "image_url",
           "image_url": {"url": "https://replicate.delivery/mgxm/f4e50a7b-e8ca-432f-8e68-082034ebcc70/demo.jpg"}}
        ]
  parameters:
    max_tokens: 1000
    temperature: 0.7
```