
result unit #5

Open
hahazei opened this issue Feb 23, 2024 · 2 comments

Comments


hahazei commented Feb 23, 2024

Could you tell me the units of the results? For example, "Time To First Token": is it in seconds or milliseconds?
=================================== Summary ====================================
Provider : openai
Model : /data/model/baichuan2-13b-chat/
Prompt Tokens : 39.0
Generation Tokens : 2048
Stream : True
Temperature : 1.0
Logprobs : None
Concurrency : QPS 50.0 constant
Time To First Token: 5.705300167132269
Latency Per Token : 135.50119360148753
Num Tokens : 258.92857142857144
Total Latency : 28838.560053018486
Num Requests : 112
Qps : 2.0004955480459414

zchenyu (Contributor) commented Feb 23, 2024

I believe it's milliseconds
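For reference, one common way such benchmarks report TTFT and per-token latency in milliseconds is by timing token arrivals on a streaming response. The sketch below is a hypothetical stand-in (a generic token iterator, not this repo's actual client or metric code), just to illustrate where the millisecond values would come from:

```python
import time

def measure_stream_metrics(stream):
    """Measure streaming metrics in milliseconds.

    `stream` is any iterable yielding tokens as they arrive; this is a
    hypothetical stand-in for a streaming completion response, not this
    repository's actual implementation.
    Returns (time_to_first_token_ms, latency_per_token_ms, total_latency_ms).
    """
    start = time.perf_counter()
    ttft_ms = None
    num_tokens = 0
    for _ in stream:
        now = time.perf_counter()
        if ttft_ms is None:
            # First token arrival marks time-to-first-token.
            ttft_ms = (now - start) * 1000.0
        num_tokens += 1
    total_ms = (time.perf_counter() - start) * 1000.0
    per_token_ms = total_ms / num_tokens if num_tokens else float("nan")
    return ttft_ms, per_token_ms, total_ms
```

Under this convention, a "Total Latency" of 28838.56 would read as roughly 28.8 seconds per request.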

hahazei (Author) commented Feb 26, 2024

> I believe it's milliseconds

OK, thanks.
