Search with Context Similarity #2

Merged: 73 commits, Oct 18, 2024

shiv810
Collaborator

@shiv810 shiv810 commented Oct 5, 2024

Resolves #50

  • Database Backfilling with the issue and the comments data.
  • Builds on the existing open PR @ubiquityos gpt command #1
  • New Adapters for voyageai and supabase
  • Updated Prompt for the OpenAI completions
  • Added Rerankers for reranking the similar search results
  • Similarity Search Functions for the DB
  • QA (Testing)
  • QA (Multiple Models)
  • Improve the Data Quality
  • Optimize the ReRanking and Retrieval Process
  • Optimize the existing issue retrieval and formatting

Results of the database backfill:

  • A total of 146 issues were identified.
  • A total of 1,238 comments were collected, including comments from pull requests (PRs), PR reviews, and comments on the identified issues.
  • Embeddings were generated using Voyage AI for enhanced data analysis.
  • The data was then converted into CSV format and loaded into Supabase for further use.

@shiv810 shiv810 marked this pull request as ready for review October 12, 2024 03:41
@shiv810
Collaborator Author

shiv810 commented Oct 12, 2024

QA:

  • Question answering based on retrieval
  • Task explanation and code parsing
  • Does not hallucinate or invent information

Models Used:

  • Claude 3.5 Sonnet: This model performed well with context lengths of up to 160K, showing a notable advantage in coding tasks and comprehension.
  • OpenAI o1-mini: In contrast to Sonnet, this model tended to hallucinate frequently when dealing with context lengths exceeding 100K.

@0x4007
Member

0x4007 commented Oct 12, 2024

QA:

Question Answering based on Retrieval

Task Explain and Code Parsing

Does not hallucinate and create information

Models Used:

  • Claude 3.5 Sonnet: This model performed well with context lengths of up to 160K, showing a notable advantage in coding tasks and comprehension.

This aligns with my expectations. Claude is really good at fine-grained comprehension and at working with code. I use it as my primary model over ChatGPT inside my Cursor IDE.

  • OpenAI o1-mini: In contrast to Sonnet, this model tended to hallucinate frequently when dealing with context lengths exceeding 100K.

I haven't done extensive testing regarding context lengths, but I generally use o1 for higher-level, more complex tasks.

For example, the most recent interesting use was when I bootstrapped both the "sync-configs-agent" tool and the "rpc-handler" tool in my GitHub org.

I give a detailed prompt using my voice to provide context, and then paste in whatever context is relevant. I'll have o1-preview do its thing and get about 80-90% of the way to completion. I use Claude for the remainder.

I also know that mini has a larger usable context window but may be less capable than preview. So I would use it for "in between" tasks, where I can pump in a ton of context from an existing codebase but still get largish sweeping changes recommended.

@0x4007
Member

0x4007 commented Oct 15, 2024

How's this coming along?

@shiv810
Collaborator Author

shiv810 commented Oct 15, 2024

How's this coming along?

I'm currently trying to adjust the prompt to include the verbose parameter (v = 1). I've experimented with various prompting techniques, including words like "brevity," which typically help in reducing verbosity.

However, none of these approaches seem to be effective with Sonnet. The output either becomes too unimaginative, making little use of the available resources and context, or it is not shortened at all. The current version works well. I can merge it into main and think of a better prompt later.

@0x4007
Member

0x4007 commented Oct 16, 2024

Okay, merge and let's test.

Member

@0x4007 0x4007 left a comment

Looks like a pretty solid implementation

export const pluginSettingsSchema = T.Object({
  model: T.String({ default: "o1-mini" }),
  openAiBaseUrl: T.Optional(T.String()),
  similarityThreshold: T.Number({ default: 0.1 }),
Member

Can you explain to me what the similarity threshold is for?

Collaborator Author

Similarity levels for the similarity search with issues and comments range from 0 to 1 (Unit Normalized), where 0 indicates the best match and 1 represents the farthest or worst match.

Member

This could benefit from clarifying that this is calculating the difference with subtraction, so closer to 0 difference means more similar. Or to make it more intuitive, maybe reverse it.

If 1 is the most similar and 0 is the least similar, then a 90% similarity threshold is 0.9, which is a lot more intuitive for a config.

Collaborator Author

Fixed: I inverted the scale. The parameter still ranges from 0 to 1, but 1 is now the closest match, so entering 0.9 means 90% similar.
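
A minimal sketch of the inverted scale, assuming the search backend still reports a unit-normalized distance where 0 is the closest match. The `ScoredHit` type and the `filterBySimilarity` helper are hypothetical names for illustration and are not taken from this PR.

```typescript
// Hypothetical sketch; not the PR's actual implementation.
// Assumes each hit carries a unit-normalized distance in [0, 1],
// where 0 is the closest match.
interface ScoredHit {
  id: string;
  distance: number; // 0 = identical, 1 = farthest
}

/**
 * Keeps only the hits that are at least `similarityThreshold` similar.
 * The threshold is user-facing: 1 = most similar, 0 = least similar.
 */
function filterBySimilarity(hits: ScoredHit[], similarityThreshold: number): ScoredHit[] {
  if (!isFinite(similarityThreshold) || similarityThreshold < 0 || similarityThreshold > 1) {
    throw new RangeError("similarityThreshold must be between 0 and 1");
  }
  // Convert each distance to a similarity and compare on the user-facing scale.
  return hits.filter((hit) => 1 - hit.distance >= similarityThreshold);
}
```

With this shape, a config value of 0.9 keeps only hits whose distance is 0.1 or less, which matches the "90% similar" reading discussed in this thread.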

src/types/gpt.ts Outdated
Member

Maybe rename to llm.d.ts

Collaborator Author

Renamed the file.

src/types/github.ts Outdated (resolved)
src/types/env.ts Outdated
export const envSchema = T.Object({
OPENAI_API_KEY: T.String(),
UBIQUITY_OS_APP_NAME: T.String(),
Member

Perhaps this should default to "UbiquityOS"

Collaborator Author

Added the default value.

* @returns The content of the README file as a string.
*/
export async function pullReadmeFromRepoForIssue(params: FetchParams): Promise<string | undefined> {
let readme = undefined;
Member

Suggested change:
- let readme = undefined;
+ let readme;

Collaborator Author

Removed the initialization to undefined.

src/handlers/ask-gpt.ts Outdated (resolved)
src/handlers/ask-gpt.ts Outdated (resolved)
src/adapters/voyage/helpers/embedding.ts (resolved)
src/adapters/openai/helpers/completions.ts Outdated (resolved)
package.json Outdated (resolved)
model,
rerankedText,
formattedChat,
["typescript", "github", "cloudflare worker", "actions", "jest", "supabase", "openai"],
Member

  1. Handling Ground Truths: They are indicating that the system uses “ground truths” — meaning predefined correct examples or comments that the system relies on for determining context. Even if the query (or comment) doesn’t provide enough context, the system tries not to make assumptions. For example, if the query asks about “types” in a code snippet without specifying a language, the system shouldn’t assume it’s referring to Python.

Hard coding these things is the wrong approach then. This needs to be dynamic in a new task.

if (!text) {
  return "";
}
return text.replace(/[^a-zA-Z0-9\s]/g, "");
Member

You sure you want to remove formatting clues such as bullet point lists, and the syntax for images? You'll just be left with URLs

You're also removing the block quote indicator which certainly changes the meaning of the corpus (quoting somebody else doesn't mean you agree.)

This seems like the regex needs to be a lot more comprehensive.

Collaborator Author

I removed this because it was only used with issues and comments. The goal was to eliminate just the emojis, as I thought they caused the LLM to produce strange Unicode values in the results. I believe the newer models don’t have this issue either, so it’s unnecessary.
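
For reference, a narrower sanitizer that targets only emojis could look roughly like the sketch below. It assumes the Unicode `Extended_Pictographic` property is an acceptable definition of "emoji" for this purpose; `stripEmojis` is a hypothetical name, and the PR itself dropped the sanitizer rather than narrowing it.

```typescript
// Hypothetical sketch; the PR removed its sanitizer instead of narrowing it.
// The pattern is built with the RegExp constructor so the Unicode property
// escape is only parsed at runtime (needs an ES2018+ engine, e.g. Node 10+).
// It also strips the variation selector (U+FE0F) and zero-width joiner (U+200D)
// that emoji sequences leave behind.
const EMOJI_PATTERN = new RegExp("[\\p{Extended_Pictographic}\\uFE0F\\u200D]", "gu");

/** Removes emoji code points but leaves markdown syntax and punctuation intact. */
function stripEmojis(text: string | null | undefined): string {
  if (!text) {
    return "";
  }
  return text.replace(EMOJI_PATTERN, "");
}
```

Unlike the `[^a-zA-Z0-9\s]` pattern, this keeps block quote markers, list bullets, and image syntax, so the formatting cues raised in this thread survive.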

Member

@0x4007 0x4007 left a comment

You marked my comments as "resolved" but didn't implement the requested changes.

@0x4007 0x4007 mentioned this pull request Oct 17, 2024
tests/main.test.ts Outdated (resolved)
@shiv810
Collaborator Author

shiv810 commented Oct 18, 2024

You marked my comments as "resolved" but didn't implement the requested changes.

Could you please clarify which changes I may have overlooked?

if (answer && answer.content && res.usage) {
  return {
    answer: answer.content,
    tokenUsage: { input: res.usage.prompt_tokens, output: res.usage.completion_tokens, total: res.usage.total_tokens },
  };
}
return { answer: "", tokenUsage: { input: 0, output: 0, total: 0 } };
Member

Returning an empty string always seems like a bad idea. It seems to make more sense to throw an error.

Collaborator Author

@shiv810 shiv810 Oct 18, 2024

It throws an error at the UI level, displaying the message "No answer from OpenAI".
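
One way to act on the reviewer's suggestion is to fail at the point where the response is parsed, so callers never see an empty answer. The sketch below is an assumption about how that could look: the `ChatResponse` and `LlmAnswer` types are simplified stand-ins for the fields used in the snippet under review, and `parseCompletion` is a hypothetical name.

```typescript
// Hypothetical sketch; types are simplified stand-ins, not the PR's own.
interface ChatUsage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

interface ChatResponse {
  choices: { message: { content: string | null } }[];
  usage?: ChatUsage;
}

interface LlmAnswer {
  answer: string;
  tokenUsage: { input: number; output: number; total: number };
}

/** Extracts the answer and token usage, throwing instead of returning "". */
function parseCompletion(res: ChatResponse): LlmAnswer {
  const answer = res.choices.length > 0 ? res.choices[0].message : undefined;
  if (!answer || !answer.content || !res.usage) {
    // Fail loudly here so the caller decides how to report the problem.
    throw new Error("No answer from the LLM");
  }
  return {
    answer: answer.content,
    tokenUsage: {
      input: res.usage.prompt_tokens,
      output: res.usage.completion_tokens,
      total: res.usage.total_tokens,
    },
  };
}
```

The caller can still catch the error and post a user-facing message, which keeps the UI behaviour described in the reply while removing the empty-string return path.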

export interface CommentType {
id: string;
plaintext: string;
markdown?: string;
Member

Optional seems wrong unless it's an optimization to save tokens.

query_text: query,
query_embedding: embedding,
threshold: threshold,
max_results: 10,
Member

Is ten optimal?

Collaborator Author

There are ten issues and ten comments. Voyage AI performs excellently in this regard, consistently providing relevant issues. I believe ten is sufficient given the extensive local context.

}

/**
* Asks GPT a question and returns the completions
Member

Might be good to find and replace all GPT instances in the code base with LLM

model,
rerankedText,
formattedChat,
["typescript", "github", "cloudflare worker", "actions", "jest", "supabase", "openai"],
Member

In case we haven't already: we should make another task for dynamic ground truths

const links: string[] = [];
inputString = inputString.replace(/https?:\/\/\S+/g, (match) => {
  links.push(match);
  return `__LINK${links.length - 1}__`;
Member

This seems wrong, but I don't know the full context of how it's used.

Collaborator Author

It removes duplicate sentences and phrases from the context and works reasonably well, fitting nearly 250K of context in o1-mini. However, a downside is the loss of context regarding references, as it retains links and only some punctuation.
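
The general idea described here (swap links for placeholders, drop repeated sentences, restore the links) can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: sentences are split naively on `.`, `!`, `?`, or a newline, duplicates are compared case-insensitively, and `dedupeSentences` is a hypothetical name.

```typescript
// Hypothetical sketch of link-preserving sentence deduplication.
function dedupeSentences(input: string): string {
  const links: string[] = [];
  // Protect URLs so their dots and slashes do not confuse sentence splitting.
  const protectedText = input.replace(/https?:\/\/\S+/g, (match) => {
    links.push(match);
    return "__LINK" + (links.length - 1) + "__";
  });

  // Naive split: a chunk runs up to and including its closing ., ! or ?,
  // or up to a newline.
  const chunks: string[] = protectedText.match(/[^.!?\n]+[.!?]*/g) || [];

  // Prototype-less map so keys such as "constructor" are handled safely.
  const seen: Record<string, true> = Object.create(null);
  const unique: string[] = [];
  chunks.forEach((chunk) => {
    const sentence = chunk.trim();
    const key = sentence.toLowerCase();
    if (sentence === "" || seen[key]) {
      return; // skip blanks and repeats
    }
    seen[key] = true;
    unique.push(sentence);
  });

  // Restore the original URLs; unknown placeholders are left untouched.
  return unique.join(" ").replace(/__LINK(\d+)__/g, (placeholder, index) => {
    const link = links[Number(index)];
    return link === undefined ? placeholder : link;
  });
}
```

Note that this version joins the surviving sentences with single spaces, so line breaks and list structure are lost, which is one form of the context loss mentioned in this thread.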

* @param params - The parameters required to fetch the README, including the context with octokit instance.
* @returns The content of the README file as a string.
*/
export async function pullReadmeFromRepoForIssue(params: FetchParams): Promise<string | undefined> {
Member

Mixed feelings on this. They fall out of date so fast. It's a useful reference, but it might be worth warning the LLM that there's a good chance the information is out of date.

Collaborator Author

They typically offer useful context about a repository, even if it's slightly outdated. This information can help users with their queries and provide some setup guidance.
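
A small sketch of the suggested warning, assuming the README is passed to the model as a delimited text block. The helper name `formatReadmeForPrompt`, the delimiter lines, and the wording of the notice are all hypothetical.

```typescript
// Hypothetical sketch; the notice wording and delimiters are assumptions.
function formatReadmeForPrompt(readme: string | undefined): string {
  if (!readme || readme.trim() === "") {
    return ""; // nothing to add when the repository has no README
  }
  const notice =
    "Note: the README below may be out of date. " +
    "Prefer the issue and comment context when they conflict.";
  return [notice, "--- README START ---", readme.trim(), "--- README END ---"].join("\n");
}
```

This keeps the README available as setup guidance while telling the model how much weight to give it.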

import { Context } from "./types";
import { askQuestion } from "./handlers/ask-llm";
import { addCommentToIssue } from "./handlers/add-comment";
import { LogLevel, LogReturn, Logs } from "@ubiquity-dao/ubiquibot-logger";
Member

Does this package still work? I thought we deleted it and rebranded to something like

@ubiquity-os/ubiquity-os-logger

Collaborator Author

I think this was installed before the purge. Will update this to the new logger version.

@0x4007 0x4007 merged commit e63b9ec into ubiquity-os-marketplace:development Oct 18, 2024
2 checks passed
@0x4007
Member

0x4007 commented Oct 18, 2024

My last batch of comments is intended to be handled async, because the pull is good enough to test in beta.

@shiv810
Collaborator Author

shiv810 commented Oct 18, 2024

@0x4007 This is the config I used

plugins:
  - name: test-app
    id: test-app
    uses:
      - plugin: http://localhost:5000
        runsOn: ["issue_comment.created"]
        with: 
          model: "openai/o1-mini"
          openAiBaseUrl: "https://openrouter.ai/api/v1"

Locally the env file (.dev.vars) has to be configured with:

OPENAI_API_KEY=""
SUPABASE_URL=""
SUPABASE_KEY=""
VOYAGEAI_API_KEY=""

I can deploy it on the workers using my credentials if necessary.

@gentlementlegen
Member

gentlementlegen commented Oct 18, 2024

@sshivaditya2019 For Supabase, does it need a brand new instance or does it share one with https://github.com/ubiquity-os-marketplace/text-vector-embeddings/ ? Also, the deployment script does not upload the VOYAGEAI_API_KEY to the worker, which I think is needed.

@shiv810
Collaborator Author

shiv810 commented Oct 18, 2024

It utilizes the same database as text-vector-embeddings. I’ll make the necessary updates to the deployment script, but other than that, it should be a simple worker deployment. If you need the VOYAGEAI_API_KEY, I can send it through Telegram or another method. I believe it would be better to use my Supabase, as I have already backfilled the issues and comments. I can provide the CSVs for that if you’d like to set up your Supabase with it.

@gentlementlegen
Member

@sshivaditya2019 Sounds good, please poke me on Telegram (@the_mentlegen)
