/ask Priority - Token Optimization #807
The token limit caps GPT's output: whatever we set as the limit is the maximum number of tokens GPT will respond with, but the model's context window also has to fit the input, and since we determine the input we can't really fix that value in advance. The Python package tiktoken is the best tokenization package, and there is a TS wrapper for it; otherwise it'll be a case of using LangChain, creating our own textSplitters, and basing our input token count on that, which will be a rough but close estimate.
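For reference, a minimal token-counting sketch, assuming js-tiktoken as the TS wrapper mentioned above (the package name, API, and budget numbers are illustrative, not pinned project dependencies):

```ts
// Count prompt tokens so we know how much room is left for the completion.
import { encodingForModel } from "js-tiktoken";

export function countTokens(text: string, model: "gpt-3.5-turbo" | "gpt-4" = "gpt-3.5-turbo"): number {
  const enc = encodingForModel(model);
  return enc.encode(text).length; // encode() returns an array of token ids
}

// Usage: estimate the remaining budget for the model's response.
const contextWindow = 4096; // model-dependent
const promptTokens = countTokens("system prompt + linked issue context");
const maxCompletionTokens = Math.max(0, contextWindow - promptTokens);
```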
This issue is a non-starter really my friend, as it was user error this time around, but I'll still take the bounty lmao ;))
I'll wait until we get some real-world use cases functional before we optimize.
A crude workaround: if the response from GPT is an error message stating the token count and how much we are over by, we can make an educated guess as to how many characters to strip from the context in order to meet the token limit. Another option is to use LangChain to interact with OpenAI, which allows for:

```ts
this.llm = new OpenAI({
  openAIApiKey: this.apiKey,
  modelName: "gpt-3.5-turbo-16k",
  maxTokens: -1,
});
```
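A rough sketch of that "strip and retry" workaround, assuming the OpenAI error text exposes both the context limit and the token count used (the regex and the ~4 characters-per-token heuristic below are estimates, not exact):

```ts
// Parse the token counts out of the error message and strip an estimated
// number of characters from the end of the lowest-priority context.
const CHARS_PER_TOKEN = 4; // rough average for English text

export function trimContextFromError(context: string, errorMessage: string): string | null {
  // e.g. "This model's maximum context length is 4097 tokens. However,
  //       your messages resulted in 5230 tokens. ..."
  const match = /maximum context length is (\d+) tokens.*?(\d+) tokens/s.exec(errorMessage);
  if (!match) return null; // unrecognized error format, give up

  const [, limit, used] = match.map(Number);
  const overBy = used - limit;
  if (overBy <= 0) return context;

  // Strip the estimated excess plus a small safety margin.
  const charsToStrip = Math.ceil(overBy * CHARS_PER_TOKEN * 1.1);
  return context.slice(0, Math.max(0, context.length - charsToStrip));
}
```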
@Keyrxng time for compression/prioritization? Not a great first real world attempt lol.
Prioritization order:
We should use a tokenization estimator to know how much we should exclude.
Originally posted by @pavlovcik in #787 (comment)
It should also include a warning that it had to cut out some content, perhaps even including the exact token counts, similar to the information presented in the error message above, so the user can approximate how much was cut off.
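A hypothetical sketch of that flow, trimming linked-issue context to a token budget up front and surfacing a warning about what was excluded (`countTokens` is the js-tiktoken helper sketched earlier; the import path, chunk names, and budget are illustrative):

```ts
import { countTokens } from "./count-tokens"; // hypothetical path to the helper above

// Keep chunks in priority order until the budget is exhausted,
// and track roughly how many tokens had to be dropped.
export function fitToBudget(chunks: string[], budget: number): { kept: string; cutTokens: number } {
  const kept: string[] = [];
  let used = 0;
  let cutTokens = 0;

  for (const chunk of chunks) {
    const tokens = countTokens(chunk);
    if (used + tokens <= budget) {
      kept.push(chunk);
      used += tokens;
    } else {
      cutTokens += tokens;
    }
  }

  return { kept: kept.join("\n"), cutTokens };
}

// Append a warning to the bot's reply when content was excluded.
const { kept, cutTokens } = fitToBudget(["issue body", "linked issue", "older comments"], 3000);
const warning = cutTokens > 0
  ? `> Note: roughly ${cutTokens} tokens of linked context were excluded to fit the model's limit.`
  : "";
const reply = [kept, warning].filter(Boolean).join("\n\n");
```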