PR: correct discrepancies with the old bot #55

Merged

43 commits:
101d176
fix: set specs relevance as 1
EresDev Jul 11, 2024
264869f
fix: improve specification comment identification
EresDev Jul 12, 2024
395d154
test: update relevance for evaluator output
EresDev Jul 12, 2024
63b9201
test: update relevance for github comment output
EresDev Jul 12, 2024
2e5c541
test: update relevance for permit generator output
EresDev Jul 12, 2024
725e060
chore: ignore test outputs formatting by prettier
EresDev Jul 12, 2024
b90e8c6
test: update relevance for reward split output
EresDev Jul 12, 2024
77ce558
chore: spell fix
EresDev Jul 12, 2024
a22defd
fix: include missing h5 and other html tags
EresDev Jul 15, 2024
1ca68d0
test: update expected output for missing html tags
EresDev Jul 15, 2024
8a5a3ef
fix: score as 0 for unlisted html tags
EresDev Jul 18, 2024
4e6ae91
test: update output for unlisted html tags
EresDev Jul 18, 2024
c6050d4
chore: format with prettier
EresDev Jul 22, 2024
0c063a8
feat: prompt openai for json response
EresDev Jul 22, 2024
cd5b4d0
Merge branch 'development' of https://github.com/ubiquibot/conversati…
EresDev Jul 22, 2024
708db4e
fix: set relevance as 1 on chatgpt failure
EresDev Jul 23, 2024
4689208
refactor: identify specification of issue
EresDev Jul 23, 2024
4d574b8
fix: set review comments relevance as 1
EresDev Jul 23, 2024
9f8815e
chore: fixed spells
EresDev Jul 23, 2024
16d6c61
feat: add relevance config for content evaluator
EresDev Jul 23, 2024
6f014f4
docs: add relevance config to readme
EresDev Jul 23, 2024
5f9cf62
refactor: rename variables & functions
EresDev Jul 24, 2024
292cca5
test: use correct expected output
EresDev Jul 24, 2024
087090e
docs: remove redundant info
EresDev Jul 30, 2024
64f2f57
refactor: rename variable type to commentType
EresDev Jul 30, 2024
7d6a75b
refactor: remove redundant typecast commentType
EresDev Jul 30, 2024
3835202
fix: stop evaluation on openai failure
EresDev Jul 30, 2024
f468888
fix: use typebox to validate openai response
EresDev Jul 30, 2024
0eb37d3
Merge branch 'development' of https://github.com/ubiquibot/conversati…
EresDev Jul 30, 2024
453eac9
chore: log openai response
EresDev Jul 30, 2024
324234f
fix: replace console logs with logger
EresDev Jul 31, 2024
135105d
fix: use decimal correctly to get correct floating point
EresDev Jul 31, 2024
8ea2713
fix: linked PRs are properly collected
gentlementlegen Aug 4, 2024
6b512c5
fix: the metadata is properly escaped to avoid html rendering
gentlementlegen Aug 4, 2024
7206411
fix: content preview is properly stripped down to 64 characters
gentlementlegen Aug 4, 2024
92cbfb1
chore: removed logs
gentlementlegen Aug 4, 2024
de942ee
chore: removed logs
gentlementlegen Aug 4, 2024
ee14b90
fix: comments are ignored for the final result
gentlementlegen Aug 4, 2024
238f4d0
chore: fix test split reward
gentlementlegen Aug 5, 2024
0917a07
chore: fix test github comment
gentlementlegen Aug 5, 2024
50a4380
Merge pull request #1 from gentlementlegen/fork/test
EresDev Aug 5, 2024
c0c5ce7
docs: give correct reason for default relevance
EresDev Aug 12, 2024
ac4b164
refactor: improve typebox types usage
EresDev Aug 12, 2024
2 changes: 2 additions & 0 deletions .prettierignore
@@ -0,0 +1,2 @@
output.html
tests/__mocks__/results/output-reward-split.html
106 changes: 58 additions & 48 deletions README.md
@@ -1,12 +1,11 @@
# `@ubiquibot/conversation-rewards`

This is intended to be the proper implementation of comment incentives, based on our learnings from the first go-around.
As of 28 February: test driven development to aggregate all necessary information based on a URL to an issue.

As of 28 February: test driven development to aggregate all necessary information based on a URL to an issue.
- pass in closed as complete issue URL and receive all the timeline events and activities of all humans who helped close the issue as complete.
- most importantly: this can inherit bot authentication and link pull requests to issues in private repositories.
- pass in closed as complete issue URL and receive all the timeline events and activities of all humans who helped close the issue as complete.
- most importantly: this can inherit bot authentication and link pull requests to issues in private repositories.

Be sure to review all `*.test.*` files for implementation details.
Be sure to review all `*.test.*` files for implementation details.

## Data structure

@@ -36,8 +35,9 @@ Be sure to review all `*.test.*` files for implementation details.
},
"reward": 0.8,
"relevance": 0.5
}
}
}]
]
}
}
```
@@ -48,70 +48,80 @@ Reward formula: `((count * wordValue) * (score * formattingMultiplier) * n) * relevance + comment.reward`
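
An illustrative calculation (assuming `count` is the comment's word count; all values are chosen for the example only): a 10-word comment with `wordValue: 0.1`, an HTML score of 1, `formattingMultiplier: 1`, `n = 1`, relevance 0.5, and no prior reward yields `((10 * 0.1) * (1 * 1) * 1) * 0.5 + 0 = 0.5`.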

Here is a possible valid configuration to enable this plugin. See [these files](./src/configuration) for more details.


```yaml
plugin: ubiquibot/conversation-rewards
with:
evmNetworkId: 100
evmPrivateEncrypted: "encrypted-key"
erc20RewardToken: "0xe91D153E0b41518A2Ce8Dd3D7944Fa863463a97d"
dataCollection:
maxAttempts: 10
delayMs: 10000
incentives:
requirePriceLabel: true
contentEvaluator:
userExtractor:
redeemTask: true
dataPurge:
formattingEvaluator:
scores:
br: 0
code: 1
p: 1
em: 0
img: 0
strong: 0
blockquote: 0
h1: 1
h2: 1
h3: 1
h4: 1
h5: 1
h6: 1
a: 1
li: 1
td: 1
hr: 0
evmNetworkId: 100
evmPrivateEncrypted: "encrypted-key"
erc20RewardToken: "0xe91D153E0b41518A2Ce8Dd3D7944Fa863463a97d"
dataCollection:
maxAttempts: 10
delayMs: 10000
incentives:
requirePriceLabel: true
contentEvaluator:
multipliers:
- select: [ISSUE_SPECIFICATION]
relevance: 1
- select: [PULL_AUTHOR]
relevance: 1
- select: [PULL_ASSIGNEE]
relevance: 1
- select: [PULL_COLLABORATOR]
relevance: 1
- select: [PULL_CONTRIBUTOR]
relevance: 1
Comment on lines +63 to +73

0x4007 (Member) commented on Aug 13, 2024:

Sets these to relevance 1? This is not clear to me.

EresDev (Contributor, Author) replied on Aug 13, 2024:

The requirement was to set the relevance of issue specifications and PR comments to 1. To implement that, I added a "fixed relevance" that you can specify in the config. You can attach a fixed relevance to any comment type; when you do, it takes precedence: that comment type is not sent to OpenAI for evaluation, and the fixed relevance is applied instead. The README above applies a fixed relevance of 1 to issue specifications and PR comments.

userExtractor:
redeemTask: true
Comment on lines +74 to +75

A Member commented:

What is this? We should probably just make it a single property without nesting.

A Member replied:

The nesting is due to userExtractor being its own module; it makes it clearer that this module is the only one using these variables. redeemTask means the task is redeemable, i.e. the reward can be collected.

dataPurge:
formattingEvaluator:
scores:
br: 0
code: 1
p: 1
em: 0
img: 0
strong: 0
blockquote: 0
h1: 1
h2: 1
h3: 1
h4: 1
h5: 1
h6: 1
a: 1
li: 1
td: 1
hr: 0
multipliers:
- select: [ ISSUE_SPECIFICATION ]
- select: [ISSUE_SPECIFICATION]
formattingMultiplier: 1
wordValue: 0.1
- select: [ ISSUE_AUTHOR ]
- select: [ISSUE_AUTHOR]
formattingMultiplier: 1
wordValue: 0.2
- select: [ ISSUE_ASSIGNEE ]
- select: [ISSUE_ASSIGNEE]
formattingMultiplier: 0
wordValue: 0
- select: [ ISSUE_COLLABORATOR ]
- select: [ISSUE_COLLABORATOR]
formattingMultiplier: 1
wordValue: 0.1
- select: [ ISSUE_CONTRIBUTOR ]
- select: [ISSUE_CONTRIBUTOR]
formattingMultiplier: 0.25
wordValue: 0.1
- select: [ PULL_SPECIFICATION ]
- select: [PULL_SPECIFICATION]
formattingMultiplier: 0
wordValue: 0
- select: [ PULL_AUTHOR ]
- select: [PULL_AUTHOR]
formattingMultiplier: 2
wordValue: 0.2
- select: [ PULL_ASSIGNEE ]
- select: [PULL_ASSIGNEE]
formattingMultiplier: 1
wordValue: 0.1
- select: [ PULL_COLLABORATOR ]
- select: [PULL_COLLABORATOR]
formattingMultiplier: 1
wordValue: 0.1
- select: [ PULL_CONTRIBUTOR ]
- select: [PULL_CONTRIBUTOR]
formattingMultiplier: 0.25
wordValue: 0.1
permitGeneration:
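
To make the review discussion above concrete, here is a minimal sketch (not part of this diff) of a configuration that pins a fixed relevance for one comment type, per EresDev's description; any `multipliers` entry under `contentEvaluator` that carries a `relevance` value is skipped by the OpenAI evaluation and scored with that value instead:

```yaml
plugin: ubiquibot/conversation-rewards
with:
  incentives:
    contentEvaluator:
      multipliers:
        - select: [ISSUE_SPECIFICATION]
          relevance: 1 # fixed relevance: never sent to OpenAI, always scored 1
```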
37 changes: 36 additions & 1 deletion src/configuration/content-evaluator-config.ts
@@ -1,5 +1,40 @@
import { Static, Type } from "@sinclair/typebox";
import { commentType } from "./formatting-evaluator-config";

export const contentEvaluatorConfigurationType = Type.Object({});
export const contentEvaluatorConfigurationType = Type.Object({
/**
* Multipliers applied to different types of comments
*/
multipliers: Type.Array(
Type.Object({
select: Type.Array(commentType),
relevance: Type.Optional(Type.Number()),
}),
{
default: [
{
select: ["ISSUE_SPECIFICATION"],
relevance: 1,
},
{
select: ["PULL_AUTHOR"],
relevance: 1,
},
{
select: ["PULL_ASSIGNEE"],
relevance: 1,
},
{
select: ["PULL_COLLABORATOR"],
relevance: 1,
},
{
select: ["PULL_CONTRIBUTOR"],
relevance: 1,
},
],
}
),
});

export type ContentEvaluatorConfiguration = Static<typeof contentEvaluatorConfigurationType>;
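
A minimal sketch (not part of this diff) of how these defaults surface through typebox: `Value.Create` builds a value from the schema's `default` annotations, so a deployment that omits `multipliers` still gets the five fixed-relevance entries above.

```typescript
import { Value } from "@sinclair/typebox/value";
import { contentEvaluatorConfigurationType } from "./content-evaluator-config";

// Construct a configuration purely from schema defaults.
const config = Value.Create(contentEvaluatorConfigurationType);
console.log(config.multipliers.length); // 5
console.log(Value.Check(contentEvaluatorConfigurationType, config)); // true
```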
4 changes: 2 additions & 2 deletions src/configuration/formatting-evaluator-config.ts
@@ -1,7 +1,7 @@
import { Static, Type } from "@sinclair/typebox";
import { CommentAssociation, CommentKind, CommentType } from "./comment-types";

const type = Type.Union(
export const commentType = Type.Union(
Object.keys(CommentKind).flatMap((kind) =>
Object.keys(CommentAssociation).map((association) => Type.Literal(`${kind}_${association}` as CommentType))
)
@@ -13,7 +13,7 @@ export const formattingEvaluatorConfigurationType = Type.Object({
*/
multipliers: Type.Array(
Type.Object({
select: Type.Array(type),
select: Type.Array(commentType),
formattingMultiplier: Type.Number(),
wordValue: Type.Number(),
}),
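
For reference, a sketch of how the exported `commentType` union expands. The enum members below are assumptions for illustration; the actual `CommentKind` and `CommentAssociation` definitions live in `./comment-types` and are not part of this diff.

```typescript
// Assumed, abbreviated enum shapes (illustration only).
enum CommentKind {
  ISSUE = "ISSUE",
  PULL = "PULL",
}
enum CommentAssociation {
  SPECIFICATION = "SPECIFICATION",
  AUTHOR = "AUTHOR",
}

// Mirrors the flatMap above: every kind is crossed with every association.
const literals = Object.keys(CommentKind).flatMap((kind) =>
  Object.keys(CommentAssociation).map((association) => `${kind}_${association}`)
);
console.log(literals);
// ["ISSUE_SPECIFICATION", "ISSUE_AUTHOR", "PULL_SPECIFICATION", "PULL_AUTHOR"]
```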
4 changes: 3 additions & 1 deletion src/data-collection/collect-linked-pulls.ts
@@ -14,7 +14,8 @@ export async function collectLinkedMergedPulls(issue: IssueParams) {
// Works on multiple linked issues, and matches #<number> or URL patterns
const linkedIssueRegex =
/\b(?:Close(?:s|d)?|Fix(?:es|ed)?|Resolve(?:s|d)?):?\s+(?:#(\d+)|https?:\/\/(?:www\.)?github\.com\/(?:[^/\s]+\/[^/\s]+\/(?:issues|pull)\/(\d+)))\b/gi;
const linkedPrUrls = event.source.issue.body.match(linkedIssueRegex);
// We remove the comments as they should not be part of the linked pull requests
const linkedPrUrls = event.source.issue.body.replace(/<!--[\s\S]+-->/, "").match(linkedIssueRegex);
if (!linkedPrUrls) {
return false;
}
@@ -33,6 +34,7 @@ export async function collectLinkedMergedPulls(issue: IssueParams) {
linkedRepo.owner === issue.owner;
}
}
if (isClosingPr) break;
}
return isGitHubLinkEvent(event) && event.source.issue.pull_request?.merged_at && isClosingPr;
});
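
A quick sketch (not part of this diff) of the behaviour the change above introduces; references inside HTML comments, such as those left behind by issue templates, are stripped before matching:

```typescript
const linkedIssueRegex =
  /\b(?:Close(?:s|d)?|Fix(?:es|ed)?|Resolve(?:s|d)?):?\s+(?:#(\d+)|https?:\/\/(?:www\.)?github\.com\/(?:[^/\s]+\/[^/\s]+\/(?:issues|pull)\/(\d+)))\b/gi;

const body = "Resolves #123\n<!-- Fixes #999: template example, should be ignored -->";

// HTML comments are removed first, so references inside them are not collected.
const linkedPrUrls = body.replace(/<!--[\s\S]+-->/, "").match(linkedIssueRegex);
console.log(linkedPrUrls); // ["Resolves #123"]
```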
123 changes: 86 additions & 37 deletions src/parser/content-evaluator-module.ts
@@ -9,13 +9,37 @@ import {
import { IssueActivity } from "../issue-activity";
import { GithubCommentScore, Module, Result } from "./processor";
import { Value } from "@sinclair/typebox/value";
import { commentEnum, CommentType } from "../configuration/comment-types";
import logger from "../helpers/logger";
import openAiRelevanceResponseSchema, { RelevancesByOpenAi } from "../types/openai-type";

/**
* Evaluates and rates comments.
*/
export class ContentEvaluatorModule implements Module {
readonly _openAi = new OpenAI({ apiKey: OPENAI_API_KEY });
readonly _configuration: ContentEvaluatorConfiguration | null = configuration.incentives.contentEvaluator;
private readonly _fixedRelevances: { [k: string]: number } = {};

_getEnumValue(key: CommentType) {
let res = 0;

key.split("_").forEach((value) => {
res |= Number(commentEnum[value as keyof typeof commentEnum]);
});
return res;
}

constructor() {
if (this._configuration?.multipliers) {
this._fixedRelevances = this._configuration.multipliers.reduce((acc, curr) => {
return {
...acc,
[curr.select.reduce((a, b) => this._getEnumValue(b) | a, 0)]: curr.relevance,
};
}, {});
}
}

get enabled(): boolean {
if (!Value.Check(contentEvaluatorConfigurationType, this._configuration)) {
@@ -48,66 +72,91 @@ export class ContentEvaluatorModule implements Module {

async _processComment(comments: Readonly<GithubCommentScore>[], specificationBody: string) {
const commentsWithScore: GithubCommentScore[] = [...comments];
const commentsBody = commentsWithScore.map((comment) => comment.content);
const relevance = await this._evaluateComments(specificationBody, commentsBody);

if (relevance.length !== commentsWithScore.length) {
console.error("Relevance / Comment length mismatch! Skipping.");
return [];
// exclude comments that have fixed relevance multiplier. e.g. review comments = 1
const commentsToEvaluate: { id: number; comment: string }[] = [];
for (let i = 0; i < commentsWithScore.length; i++) {
const currentComment = commentsWithScore[i];
if (!this._fixedRelevances[currentComment.type]) {
commentsToEvaluate.push({
id: currentComment.id,
comment: currentComment.content,
});
}
}

for (let i = 0; i < relevance.length; i++) {
const relevancesByAI = await this._evaluateComments(specificationBody, commentsToEvaluate);

if (Object.keys(relevancesByAI).length !== commentsToEvaluate.length) {
console.error("Relevance / Comment length mismatch! \nWill use 1 as relevance for missing comments.");
}

for (let i = 0; i < commentsWithScore.length; i++) {
const currentComment = commentsWithScore[i];
const currentRelevance = relevance[i];
let currentRelevance = 1; // For comments not in fixed relevance types and missed by OpenAI evaluation

if (this._fixedRelevances[currentComment.type]) {
currentRelevance = this._fixedRelevances[currentComment.type];
} else if (!isNaN(relevancesByAI[currentComment.id])) {
currentRelevance = relevancesByAI[currentComment.id];
}

const currentReward = new Decimal(currentComment.score?.reward || 0);
currentComment.score = {
...(currentComment.score || {}),
relevance: currentRelevance.toNumber(),
relevance: new Decimal(currentRelevance).toNumber(),
reward: currentReward.mul(currentRelevance).toNumber(),
};
}

return commentsWithScore;
}

async _evaluateComments(specification: string, comments: string[]): Promise<Decimal[]> {
async _evaluateComments(
specification: string,
comments: { id: number; comment: string }[]
): Promise<RelevancesByOpenAi> {
const prompt = this._generatePrompt(specification, comments);

const response: OpenAI.Chat.ChatCompletion = await this._openAi.chat.completions.create({
model: "gpt-4o",
response_format: { type: "json_object" },
messages: [
{
role: "system",
content: prompt,
},
],
temperature: 1,
max_tokens: 128,
top_p: 1,
frequency_penalty: 0,
presence_penalty: 0,
});

const rawResponse = String(response.choices[0].message.content);
logger.info(`OpenAI raw response: ${rawResponse}`);

const jsonResponse = JSON.parse(rawResponse);

try {
const response: OpenAI.Chat.ChatCompletion = await this._openAi.chat.completions.create({
model: "gpt-4o",
messages: [
{
role: "system",
content: prompt,
},
],
temperature: 1,
max_tokens: 128,
top_p: 1,
frequency_penalty: 0,
presence_penalty: 0,
});

const rawResponse = String(response.choices[0].message.content);
const parsedResponse = JSON.parse(rawResponse) as number[];
return parsedResponse.map((o) => new Decimal(o));
} catch (error) {
console.error(`Failed to evaluate comment`, error);
return [];
const relevances = Value.Decode(openAiRelevanceResponseSchema, jsonResponse);
logger.info(`Relevances by OpenAI: ${JSON.stringify(relevances)}`);
return relevances;
} catch (e) {
logger.error(`Invalid response type received from openai while evaluating: ${jsonResponse} \n\nError: ${e}`);
throw new Error("Error in evaluation by OpenAI.");
}
}

_generatePrompt(issue: string, comments: string[]) {
_generatePrompt(issue: string, comments: { id: number; comment: string }[]) {
if (!issue?.length) {
throw new Error("Issue specification comment is missing or empty");
}
return `I need to evaluate the relevance of GitHub contributors' comments to a specific issue specification. Specifically, I'm interested in how much each comment helps to further define the issue specification or contributes new information or research relevant to the issue. Please provide a float between 0 and 1 to represent the degree of relevance. A score of 1 indicates that the comment is entirely relevant and adds significant value to the issue, whereas a score of 0 indicates no relevance or added value. Each contributor's comment is on a new line.\n\nIssue Specification:\n\`\`\`\n${issue}\n\`\`\`\n\nConversation:\n\`\`\`\n${comments
.map((comment) => comment)
.join(
"\n"
)}\n\`\`\`\n\n\nTo what degree are each of the comments in the conversation relevant and valuable to further defining the issue specification? Please reply with ONLY an array of float numbers between 0 and 1, corresponding to each comment in the order they appear. Each float should represent the degree of relevance and added value of the comment to the issue. The total length of the array in your response should equal exactly ${
return `I need to evaluate the relevance of GitHub contributors' comments to a specific issue specification. Specifically, I'm interested in how much each comment helps to further define the issue specification or contributes new information or research relevant to the issue. Please provide a float between 0 and 1 to represent the degree of relevance. A score of 1 indicates that the comment is entirely relevant and adds significant value to the issue, whereas a score of 0 indicates no relevance or added value. A stringified JSON is given below that contains the specification and contributors' comments. Each comment in the JSON has a unique ID and comment content. \n\n\`\`\`\n${JSON.stringify(
{ specification: issue, comments: comments }
)}\n\`\`\`\n\n\nTo what degree are each of the comments in the conversation relevant and valuable to further defining the issue specification? Please reply with ONLY a JSON where each key is the comment ID given in JSON above, and the value is a float number between 0 and 1 corresponding to the comment. The float number should represent the degree of relevance and added value of the comment to the issue. The total number of properties in your JSON response should equal exactly ${
comments.length
} elements.`;
}.`;
}
}
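
The diff imports `openAiRelevanceResponseSchema` and `RelevancesByOpenAi` from `../types/openai-type`, which is not shown here. A hypothetical sketch of what that module might contain, inferred from how `Value.Decode` consumes the parsed response (the real definition may differ):

```typescript
// src/types/openai-type.ts (assumed shape, for illustration only)
import { Static, Type } from "@sinclair/typebox";

// Maps each comment ID (as a string key) to a relevance score in [0, 1],
// e.g. a model response of {"101": 0.8, "102": 0.2}.
const openAiRelevanceResponseSchema = Type.Record(Type.String(), Type.Number({ minimum: 0, maximum: 1 }));

export type RelevancesByOpenAi = Static<typeof openAiRelevanceResponseSchema>;
export default openAiRelevanceResponseSchema;
```

Under this reading, the prompt sends `{"specification": "...", "comments": [{"id": 101, "comment": "..."}]}` and expects back a JSON object such as `{"101": 0.85}`, which `_processComment` then matches to comments by ID, falling back to the fixed relevances or to 1 when an ID is missing.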