Add support for clipboard processing from images #97

Merged: 10 commits, Jul 19, 2024
GPT/beta-commands/beta-gpt.talon: 1 addition & 2 deletions
@@ -7,9 +7,8 @@ model find <user.text>: user.gpt_find_talon_commands(user.text)

 # Using the context of the text on the clipboard, update the selected text
 model blend clip:
-    clipboard_text = clip.text()
     destination_text = edit.selected_text()
-    result = user.gpt_blend(clipboard_text, destination_text)
+    result = user.gpt_blend(user.gpt_get_source_text("clipboard"), destination_text)
     user.gpt_insert_response(result, "")

 # Pass the raw text of a prompt to a destination without actually calling GPT with it
GPT/gpt.py: 16 additions & 3 deletions
@@ -244,7 +244,16 @@ def gpt_get_source_text(spoken_text: str) -> str:
"""Get the source text that is will have the prompt applied to it"""
match spoken_text:
case "clipboard":
return clip.text()
clipboard_text = clip.text()
if clipboard_text is None:
if clip.image():
return "__IMAGE__"
else:
notify(
"GPT Failure: User applied a prompt to the phrase clipboard, but there was no clipboard text or image stored"
)
return
return clipboard_text
case "gptResponse":
if GPTState.last_response == "":
raise Exception(
@@ -258,7 +267,11 @@ def gpt_get_source_text(spoken_text: str) -> str:
                 actions.user.clear_last_phrase()
                 return last_output
             else:
-                notify("No text to reformat")
-                raise Exception("No text to reformat")
+                notify(
+                    "GPT Failure: User applied a prompt to the phrase last Talon Dictation, but there was no text to reformat"
+                )
+                raise Exception(
+                    "GPT Failure: User applied a prompt to the phrase last Talon Dictation, but there was no text to reformat"
+                )
         case "this" | _:
             return actions.edit.selected_text()
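As a minimal sketch of the contract this hunk introduces (not part of the diff): `gpt_get_source_text("clipboard")` now returns the literal string `"__IMAGE__"` when the clipboard holds an image but no text, and `generate_payload` in `lib/modelHelpers.py` is expected to swap that sentinel for an `image_url` message. The helper below is hypothetical and only illustrates how a caller might branch on the returned value.

```python
# Hypothetical illustration of the sentinel contract assumed by this PR (not in the diff).
IMAGE_SENTINEL = "__IMAGE__"


def describe_source(source_text: str | None) -> str:
    """Hypothetical helper showing how a caller might branch on the returned value."""
    if source_text is None:
        # gpt_get_source_text already notified the user and returned None
        return "nothing usable on the clipboard"
    if source_text == IMAGE_SENTINEL:
        return "clipboard image: sent to the model as an image_url message"
    return f"clipboard text ({len(source_text)} characters): sent as a plain text message"
```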
GPT/readme.md: 7 additions & 7 deletions
@@ -30,10 +30,10 @@ To add additional prompts, copy the [Talon list for custom prompts](lists/custom

If you wish to change any configuration settings, copy the [example configuration file](../talon-ai-settings.talon.example) into your user directory and modify settings that you want to change.

-| Setting | Default | Notes |
-| ------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------ |
-| user.openai_model | `"gpt-3.5-turbo"` | The model to use for the queries. NOTE: To access gpt-4 you may need prior API use |
-| user.model_temperature | `0.6` | Higher temperatures will make the model more creative and less accurate |
-| user.model_endpoint | `"https://api.openai.com/v1/chat/completions"` | Any OpenAI compatible endpoint address can be used (Azure, local llamafiles, etc) |
-| user.model_shell_default | `"bash"` | The default shell for `model shell` commands |
-| user.model_system_prompt | `"You are an assistant helping an office worker to be more productive. Output just the response to the request and no additional content. Do not generate any markdown formatting such as backticks for programming languages unless it is explicitly requested."` | The meta-prompt for how to respond to all prompts |
+| Setting | Default | Notes |
+| ------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------ |
+| user.openai_model | `"gpt-4o-mini"` | The model to use for the queries. NOTE: To access certain models you may need prior API use |
+| user.model_temperature | `0.6` | Higher temperatures will make the model more creative and less accurate |
+| user.model_endpoint | `"https://api.openai.com/v1/chat/completions"` | Any OpenAI compatible endpoint address can be used (Azure, local llamafiles, etc) |
+| user.model_shell_default | `"bash"` | The default shell for `model shell` commands |
+| user.model_system_prompt | `"You are an assistant helping an office worker to be more productive. Output just the response to the request and no additional content. Do not generate any markdown formatting such as backticks for programming languages unless it is explicitly requested."` | The meta-prompt for how to respond to all prompts |
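As a rough sketch of how overriding these settings might look in a user `.talon` file, assuming the defaults above and using illustrative values only (the full template is `talon-ai-settings.talon.example`):

```talon
settings():
    # Illustrative values only; any model name accepted by your endpoint works
    user.openai_model = 'gpt-4o-mini'
    user.model_temperature = 0.3
```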
lib/modelHelpers.py: 12 additions & 3 deletions
@@ -53,22 +53,31 @@ def generate_payload(
"Authorization": f"Bearer {TOKEN}",
}

message = {"type": "text", "text": content}
if content == "__IMAGE__":
clipped_image = clip.image()
if clipped_image:
data = clipped_image.encode().data()
base64_image = base64.b64encode(data).decode("utf-8")
message = {
"type": "image_url",
"image_url": {"url": f"data:image/jpeg;base64,{base64_image}"},
}

data = {
"messages": [
{
"role": "system",
"content": settings.get("user.model_system_prompt")
+ additional_context,
},
{"role": "user", "content": f"{prompt}:\n{content}"},
{"role": "user", "content": [{"type": "text", "text": prompt}, message]},
],
"max_tokens": 2024,
"temperature": settings.get("user.model_temperature"),
"n": 1,
"stop": None,
Collaborator (author): This was causing an exception when running with an image prompt but not with text. I didn't notice any behavioral difference, however, so I removed it.

Owner: Good w me if it works with you

"model": settings.get("user.openai_model"),
}

if tools is not None:
data["tools"] = tools

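For reference, a rough sketch of the request body `generate_payload` builds when the clipboard holds an image, following the OpenAI chat completions message format this change targets. It is not produced by the diff itself: the prompt text, base64 data, and setting values below are placeholders, and the `stop` field is absent per the comment thread above.

```python
# Placeholder payload illustrating the multimodal message shape (text part + image_url part).
example_payload = {
    "messages": [
        {
            "role": "system",
            "content": "You are an assistant helping an office worker to be more productive.",
        },
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "describe this image"},
                {
                    "type": "image_url",
                    # Truncated base64-encoded image taken from clip.image()
                    "image_url": {"url": "data:image/jpeg;base64,/9j/4AAQSkZJRg..."},
                },
            ],
        },
    ],
    "max_tokens": 2024,
    "temperature": 0.6,
    "n": 1,
    # "stop": None is no longer included; it raised an exception with image prompts.
    "model": "gpt-4o-mini",
}
```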
lib/talonSettings.py: 3 additions & 1 deletion
@@ -23,7 +23,9 @@ def modelPrompt(matched_prompt) -> str:


 mod.setting(
-    "openai_model", type=Literal["gpt-3.5-turbo", "gpt-4"], default="gpt-3.5-turbo"
+    "openai_model",
+    type=Literal["gpt-3.5-turbo", "gpt-4", "gpt-4o-mini"],
+    default="gpt-4o-mini",
 )

 mod.setting(
readme.md: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ This functionality is especially helpful for users who:
 **Prompts and extends the following tools:**

 - Github Copilot
-- OpenAI API (GPT-3.5/GPT-4) for text generation and processing
+- OpenAI API (with any GPT model) for text generation and processing
 - Any OpenAI compatible model endpoint can be used (Azure, local llamafiles, etc)
 - OpenAI API for image generation and vision

talon-ai-settings.talon.example: 1 addition & 2 deletions
@@ -9,8 +9,7 @@ settings():

     # user.model_system_prompt = "You are an assistant helping an office worker to be more productive."

-    # Change to 'gpt-4' for GPT-4
-    # NOTE, you may not have access to GPT-4 yet: https://help.openai.com/en/articles/7102672-how-can-i-access-gpt-4
+    # Change to 'gpt-4' or the model of your choice
     # user.openai_model = 'gpt-3.5-turbo'

     # Only uncomment the line below if you want experimental behavior to parse Talon files