feat: dynamic ground truths #14
Unused types (1)
|
Marking as ready for review to get eyes on it and opinions as-is. Picked from e6586a4. Each array belongs to the review it performed on the QA PR, which also contains the actual review (I haven't started to refine the review prompt). Spec that it's sourcing truths from here. As I said, I don't have full context, so if the truths are better sourced from something else, let me know, but this seems appropriate at least for the purpose of #11. The ones currently in use in [
'The bot should initiate review when a pull request is created as a draft and finalized by the contributor.',
'The bot should parse the issue specification and pull request diff to assess compliance.',
'If the pull request does not meet the specification, the bot should provide actionable feedback and change the review state to requested changes.',
'The bot should convert non-compliant pulls back to draft status if they fail the specification check.',
"The bot should only leave a 'commented' state for pulls that meet the specification.",
'If a collaborator re-finalizes a draft pull, the bot should stop further interventions.',
'The inspection process should be triggered only during initial creation and when a draft is finalized by the pull author.'
]
[
'The bot should verify that the pull request is initially opened as a draft.',
'The bot should check for changes from draft to finalized pull request status for initiating review.',
'The bot needs to check pull request diffs against the issue specification for compliance.',
'The bot should provide actionable feedback for specification discrepancies in the review.',
'If the pull request does not meet specifications, the bot should convert it back to draft and request changes.',
"If the pull request meets specifications, the bot should mark it as 'commented' without approval.",
'The bot must refrain from intervening if a collaborator changes the pull request back to finalized.',
'The bot’s intervention should be limited to triggers on pull creation and author-led status changes.',
'Optionally handle Continuous Integration (CI) checks separately due to external factors.',
'Consider implementing a daily limit on bot reviews per user to prevent abuse of the review system.'
]
[
'The contributor must initially open the pull request as a draft.',
'When the pull request is ready for review, the contributor should convert it to a finalized pull request.',
'The bot should analyze the issue specification along with the pull request diff.',
'The bot should provide actionable feedback indicating any missing specifications.',
"If the pull request doesn't meet the specification, the bot should require changes and revert the pull back to a draft.",
'If the pull request meets the specification, the bot should leave a comment without approval.',
'The bot must not intervene if a collaborator changes the pull request from draft to finalized.',
'The bot should only conduct inspections upon pull creation and when the author finalizes a draft.',
'Optional: Ensure CI passes, but account for potential external failures.',
'Optional: Limit bot reviews to one per day per contributor to prevent excessive use for minor changes.'
] |
@0x4007 @sshivaditya2019 @gentlementlegen @rndquu requesting review. CI can be ignored as it's used in #11 but not here, or I can comment it out or something so it passes CI. |
Why are you adding redundant information in those arrays? |
I'm not manually adding anything; it's GPT that's creating the array contents based on the spec and prompt, that's it. Without the context and additional info I requested in #13, I'm not 100% sure how to refine and improve them in line with @sshivaditya2019's original intention. I know how I'd refine them personally, but this is not my show. Right now GPT is consuming the task spec and creating these outputs based on this prompt and these settings passed to the completions endpoint
the @ubqbot command right now that's in
Which doesn't make a whole lot of sense to me without the additional context. These are more like a classification of the subject areas of the tech stack involved in the query/task/org? If that's the true intention of "Ground Truths" then I know how to refactor. Or is how I'm using them the correct way to use them for my use-case? |
I think we should not have redundant messages but they should be more substantial than the keywords we have now. |
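One way to act on the redundancy concern is a post-processing pass that drops near-duplicate truths before they are used. The following is only a minimal sketch, assuming word overlap (Jaccard similarity) is an acceptable stand-in for "redundant"; the function names and the 0.6 threshold are hypothetical and are not part of this PR.

```typescript
// Hypothetical helper: lowercase, strip punctuation, and split into a set of words.
function tokenize(text: string): Set<string> {
  const words = text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .split(/\s+/)
    .filter((word) => word.length > 0);
  return new Set(words);
}

// Jaccard similarity: shared words divided by the size of the union.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  a.forEach((word) => {
    if (b.has(word)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

// Keep a truth only if it is not too similar to one already kept.
// The threshold is an assumption and would need tuning on real outputs.
function dropNearDuplicates(truths: string[], threshold = 0.6): string[] {
  const kept: { text: string; tokens: Set<string> }[] = [];
  for (const text of truths) {
    const tokens = tokenize(text);
    const isRedundant = kept.some((k) => jaccard(k.tokens, tokens) >= threshold);
    if (!isRedundant) kept.push({ text, tokens });
  }
  return kept.map((k) => k.text);
}
```

Word overlap will not catch paraphrases that share few words, so this would only be a cheap first filter; tightening the prompt is likely still needed.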
In my opinion, "Ground Truths" should be considered in relation to the use-case if they are intended to guardrail the model to conform to a specific workflow, and we can treat different applications separately, e.g. chatbot vs. code review.
See this comment for my suggestion on dynamically generating the chatbot ground truths |
Dynamic chatbot ground truths QA, run within my fork of this repo, so it's pulling the deps and languages of this repo. [
{
role: 'system',
content: '\n' +
'Using the input provided, your goal is to produce an array of strings that represent "Ground Truths."\n' +
'These ground truths are high-level abstractions that encapsulate the tech stack and dependencies of the repository.\n' +
' \n' +
'Each ground truth should:\n' +
'- Be succinct and easy to understand.\n' +
'- Use only the information provided in the input.\n' +
'- Focus on essential requirements, behaviors, or assumptions involved in the repository.\n' +
' \n' +
'Example:\n' +
'Languages: { TypeScript: 60%, JavaScript: 15%, HTML: 10%, CSS: 5%, ... }\n' +
'Dependencies: Esbuild, Wrangler, React, Tailwind CSS, ms, React-carousel, React-icons, ...\n' +
'Dev Dependencies: @types/node, @types/jest, @mswjs, @testing-library/react, @testing-library/jest-dom, @Cypress ...\n' +
'Ground Truths:\n' +
'- The repo predominantly uses TypeScript, with JavaScript, HTML, and CSS also present.\n' +
'- The repo is a React project that uses Tailwind CSS.\n' +
'- The project is built with Esbuild and deployed with Wrangler, indicating a Cloudflare Workers project.\n' +
'- The repo tests use Jest, Cypress, mswjs, and React Testing Library.\n' +
' \n' +
'Conditions:\n' +
'Assume your output builds the foundation for a chatbot to understand the repository when asked an arbitrary query.\n' +
'Do not list every language or dependency, focus on the most prevalent ones.\n' +
'Focus on what is essential to understand the repository at a high level.\n' +
'Brevity is key. Use zero formatting. Do not wrap in quotes, backticks, or other characters.\n' +
'response === ["some", "array", "of", "strings"]\n' +
' \n' +
'Generate similar ground truths adhering to a maximum of 10.\n' +
' \n' +
'Return a JSON parsable array of strings representing the ground truths, without comment or directive.'
},
{
role: 'user',
content: '{"dependencies":{"@mswjs/data":"^0.16.2","@octokit/rest":"20.1.1","@octokit/webhooks":"13.2.7","@sinclair/typebox":"0.32.33","@supabase/supabase-js":"^2.45.4","@ubiquity-dao/ubiquibot-logger":"^1.3.0","dotenv":"^16.4.5","openai":"^4.63.0","typebox-validators":"0.3.5","voyageai":"^0.0.1-5"},"devDependencies":{"@actions/core":"^1.11.1","@actions/github":"^6.0.0","@commitlint/cli":"19.3.0","@commitlint/config-conventional":"19.2.2","@cspell/dict-node":"5.0.1","@cspell/dict-software-terms":"3.4.6","@cspell/dict-typescript":"3.1.5","@eslint/js":"9.5.0","@jest/globals":"29.7.0","@types/jest":"^29.5.12","@types/node":"20.14.5","cspell":"8.9.0","eslint":"9.5.0","eslint-config-prettier":"9.1.0","eslint-plugin-check-file":"2.8.0","eslint-plugin-prettier":"5.1.3","eslint-plugin-sonarjs":"1.0.3","husky":"9.0.11","jest":"29.7.0","jest-junit":"16.0.0","jest-md-dashboard":"0.8.0","knip":"5.21.2","lint-staged":"15.2.7","npm-run-all":"4.1.5","prettier":"3.3.2","ts-jest":"29.1.5","tsx":"4.15.6","typescript":"5.4.5","typescript-eslint":"7.13.1","wrangler":"^3.81.0"},"languages":[["TypeScript",0.9235672829913418],["PLpgSQL",0.03861807956191261],["JavaScript",0.03622889642996839],["Shell",0.00158574101677714]]}'
}
]
languages: [
[ 'TypeScript', 0.9235672829913418 ],
[ 'PLpgSQL', 0.03861807956191261 ],
[ 'JavaScript', 0.03622889642996839 ],
[ 'Shell', 0.00158574101677714 ]
]
Ground Truths: [
'The repository is primarily written in TypeScript, with some PLpgSQL and JavaScript code.',
'The project uses Supabase for backend services.',
'Integration with GitHub APIs is handled via Octokit.',
"The application leverages OpenAI's API for AI functionalities.",
'Jest is used as the testing framework, configured for TypeScript.',
'ESLint and Prettier are employed for code linting and formatting.',
'GitHub Actions manage the CI/CD workflows.',
'Husky and lint-staged are set up for pre-commit hooks.',
'The project is deployed using Wrangler, indicating deployment to Cloudflare Workers.',
'Commit messages are enforced using Commitlint with conventional commit standards.'
]
Second run, same messages and input as above:
Ground Truths: [
'The repository is primarily written in TypeScript with minor use of JavaScript and PLpgSQL.',
'It integrates with Supabase for backend services.',
'The project leverages OpenAI for AI functionalities.',
'Environment variables are managed using dotenv.',
'Deployment is handled with Wrangler, indicating Cloudflare Workers usage.',
'The development setup includes Jest for testing and ESLint for linting.',
'GitHub Actions are employed for continuous integration and deployment workflows.',
'Commit messages are standardized using Commitlint and enforced with Husky hooks.',
'The project uses @octokit libraries for GitHub API interactions and webhooks.',
'TypeScript is utilized with typebox for schema validation and type safety.'
] |
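The QA output above implies three steps: turn repository data into the JSON user message, pair it with the system prompt, and parse the reply back into an array of strings. Below is a minimal sketch of those steps, assuming the language figures start as per-language byte counts (which is what GitHub's repository languages endpoint returns). All type and function names here are hypothetical and are not taken from this PR; only the payload shape mirrors the logged input.

```typescript
// Hypothetical types; only the payload shape mirrors the QA output above.
type LanguageShare = [name: string, share: number];

interface RepoInput {
  dependencies: Record<string, string>;
  devDependencies: Record<string, string>;
  languages: LanguageShare[];
}

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

const MAX_GROUND_TRUTHS = 10; // mirrors "adhering to a maximum of 10" in the prompt

// Convert per-language byte counts into fractions, largest share first.
function toLanguageShares(bytesByLanguage: Record<string, number>): LanguageShare[] {
  const total = Object.values(bytesByLanguage).reduce((sum, n) => sum + n, 0);
  if (total === 0) return [];
  return Object.entries(bytesByLanguage)
    .map(([name, bytes]): LanguageShare => [name, bytes / total])
    .sort((a, b) => b[1] - a[1]);
}

// System prompt first, JSON-encoded repository data as the user message.
function buildMessages(systemPrompt: string, input: RepoInput): ChatMessage[] {
  return [
    { role: "system", content: systemPrompt },
    { role: "user", content: JSON.stringify(input) },
  ];
}

// Parse the reply defensively: it must be a JSON array containing non-empty strings.
function parseGroundTruths(raw: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error("Model response is not valid JSON");
  }
  if (!Array.isArray(parsed)) {
    throw new Error("Model response is not an array");
  }
  const truths = parsed
    .filter((item): item is string => typeof item === "string")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
  if (truths.length === 0) {
    throw new Error("Model response contains no ground truths");
  }
  // Enforce the cap in code as well, since the model may ignore the prompt's limit.
  return truths.slice(0, MAX_GROUND_TRUTHS);
}
```

Validating the reply in code rather than trusting the prompt's "JSON parsable array" instruction seems worthwhile here, since the two runs above already show that the output varies between calls.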
@gentlementlegen @0x4007 @sshivaditya2019 @rndquu Why can't we request reviews in this org lmao? Anyway, this is ready for review, team. Thanks. |
future improvements:
with every usage of
|
@sshivaditya2019 you should review and decide when this pull is ready. I encourage QA for changes to prove they work and ideally you should also test as a reviewer |
lmk if there is anything holding back this PR and I'll push it forward |
I was working on setting this up in the repo, sorry for the delay. LGTM! |
Resolves #13