
Check overhead for comment evaluation #174

Open
gentlementlegen opened this issue Oct 28, 2024 · 6 comments · May be fixed by #225
Comments

@gentlementlegen
Member

> ```diff
> ! Failed to run comment evaluation. Error: 400 This model's maximum context length is 128000 tokens. However, your messages resulted in 148540 tokens. Please reduce the length of the messages.
> ```

<!--
https://github.com/ubiquity-os-marketplace/text-conversation-rewards/actions/runs/11463496789
{
  "status": 400,
  "headers": {
    "access-control-expose-headers": "X-Request-ID",
    "alt-svc": "h3=\":443\"; ma=86400",
    "cf-cache-status": "DYNAMIC",
    "cf-ray": "8d6a8635dd992009-IAD",
    "connection": "keep-alive",
    "content-length": "284",
    "content-type": "application/json",
    "date": "Tue, 22 Oct 2024 15:29:41 GMT",
    "openai-organization": "ubiquity-dao-8veapj",
    "openai-processing-ms": "375",
    "openai-version": "2020-10-01",
    "server": "cloudflare",
    "set-cookie": "__cf_bm=urRioyrKlQBCiRkxcgeZKjDpvmvjEQsjfq1o9zASCxs-1729610981-1.0.1.1-u3eEr.AKdcx2EGJuW2nauw6LA5zK0ZDXyOKJiCI01E_pfZOpnzWIJoxgLq_OlO8BDT_WFfSD_jFjjW6Fnmx_Mw; path=/; expires=Tue, 22-Oct-24 15:59:41 GMT; domain=.api.openai.com; HttpOnly; Secure; SameSite=None, _cfuvid=qIG5Ao6fOQ9MAWT6hlX2fjC8G.yTYmXl4vzXjH7Qqsg-1729610981415-0.0.1.1-604800000; path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None",
    "strict-transport-security": "max-age=31536000; includeSubDomains; preload",
    "x-content-type-options": "nosniff",
    "x-ratelimit-limit-requests": "5000",
    "x-ratelimit-limit-tokens": "450000",
    "x-ratelimit-remaining-requests": "4999",
    "x-ratelimit-remaining-tokens": "83951",
    "x-ratelimit-reset-requests": "12ms",
    "x-ratelimit-reset-tokens": "48.806s",
    "x-request-id": "req_bb581eb70b2276ea9a9c563b12f6343b"
  },
  "request_id": "req_bb581eb70b2276ea9a9c563b12f6343b",
  "error": {
    "message": "This model's maximum context length is 128000 tokens. However, your messages resulted in 148540 tokens. Please reduce the length of the messages.",
    "type": "invalid_request_error",
    "param": "messages",
    "code": "context_length_exceeded"
  },
  "code": "context_length_exceeded",
  "param": "messages",
  "type": "invalid_request_error",
  "caller": "/home/runner/work/text-conversation-rewards/text-conversation-rewards/dist/index.js:291:6136492"
}
-->

@gentlementlegen perhaps we have too much overhead with each pull? By that I mean headers and such, not the main content, because I don't imagine that each pull actually has that much "body" content. This can easily be optimized, as I see that some have barely any comments.

Originally posted by @0x4007 in ubiquity-os/ubiquity-os-kernel#80 (comment)
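One way to surface this failure earlier: estimate the prompt size locally before calling the API, so the 128k context limit is caught in the plugin instead of surfacing as an OpenAI 400 error. A minimal sketch with hypothetical helper names (not the plugin's actual code; exact counts would need a real tokenizer such as tiktoken, this uses the rough ~4-characters-per-token heuristic):

```typescript
// Assumed limit for the model in the error log above (128000 tokens).
const MODEL_CONTEXT_LIMIT = 128000;

// Rough heuristic: ~4 characters per token for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Check whether all messages plus a reserved reply budget fit in context.
function fitsInContext(messages: string[], reservedForReply = 4000): boolean {
  const total = messages.reduce((sum, m) => sum + estimateTokens(m), 0);
  return total + reservedForReply <= MODEL_CONTEXT_LIMIT;
}
```

With a guard like this, the plugin could decide up front to truncate, filter, or split the comment set instead of failing mid-run.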


@gentlementlegen
Member Author

/start


Warning! This task was created over 52 days ago. Please confirm that this issue specification is accurate before starting.
Deadline Thu, Dec 19, 7:35 AM UTC
Beneficiary 0x0fC1b909ba9265A846b82CF4CE352fc3e7EeB2ED

Tip

  • Use /wallet 0x0000...0000 if you want to update your registered payment wallet address.
  • Be sure to open a draft pull request as soon as possible to communicate updates on your progress.
  • Be sure to provide timely updates to us when requested, or you will be automatically unassigned from the task.

@gentlementlegen
Member Author

Was thinking about this, and there would be a few available approaches:

  • summarization of comments: probably the most accurate, but also very expensive because each comment would need to be summarized
  • truncation: simply cut each comment short, but this means losing precision and context
  • filter highly relevant comments first, with something like TF-IDF: not ideal, but most likely the best compromise that does not require extra API credits
  • take only a sample of the comments: easy to implement, but loses precision and context as well
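The TF-IDF option above could look something like this sketch (hypothetical function names, not the plugin's code): score each comment against the issue body and keep only the top matches, so the prompt shrinks without spending any API credits.

```typescript
// Split text into lowercase alphanumeric tokens.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Sum TF-IDF weights of a comment's terms that also appear in the query.
function tfIdfScore(comment: string, query: string, corpus: string[]): number {
  const docs = corpus.map(tokenize);
  const queryTerms = new Set(tokenize(query));
  const terms = tokenize(comment);
  let score = 0;
  for (const term of new Set(terms)) {
    if (!queryTerms.has(term)) continue;
    const tf = terms.filter((t) => t === term).length / terms.length;
    const df = docs.filter((d) => d.includes(term)).length;
    const idf = Math.log((corpus.length + 1) / (df + 1)) + 1; // smoothed IDF
    score += tf * idf;
  }
  return score;
}

// Keep only the `keep` comments most relevant to the issue body.
function topComments(comments: string[], issueBody: string, keep: number): string[] {
  return [...comments]
    .sort((a, b) => tfIdfScore(b, issueBody, comments) - tfIdfScore(a, issueBody, comments))
    .slice(0, keep);
}
```

The trade-off is exactly the one noted above: comments that are relevant to the reward evaluation but share little vocabulary with the issue body would be dropped.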

Any ideas? @sshivaditya2019 RFC

@0x4007
Member

0x4007 commented Dec 20, 2024

I think high accuracy is the best choice from your selection. I think costs continue to decline with these LLMs as well.

@gentlementlegen
Member Author

Let me test results with TF-IDF first and see how accurate it gets, since it would most likely be much simpler to implement than summarizing all the comments. I will run some tests and post them here.
