I have converted my fine-tuned Hugging Face model to .gguf format and triggered inference with ctransformers.
I am using a CUDA GPU machine.
However, I did not observe any inference speed improvement after switching to ctransformers. I am seeing the same latency with transformers-based inference and ctransformers-based inference.
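For context, one common cause of identical latency on a CUDA machine is that ctransformers keeps every layer on the CPU by default: the `gpu_layers` argument to `AutoModelForCausalLM.from_pretrained` is 0 unless set explicitly. Below is a minimal sketch of GPU-offloaded loading, assuming a LLaMA-architecture model; the model path, layer count, and helper name are hypothetical, not taken from the issue.

```python
def load_gguf_model(path, gpu_layers=50):
    """Load a .gguf model with ctransformers, offloading layers to the GPU.

    Returns None if ctransformers is unavailable or the model fails to load.
    """
    try:
        from ctransformers import AutoModelForCausalLM
        return AutoModelForCausalLM.from_pretrained(
            path,
            model_type="llama",     # match your model's architecture
            gpu_layers=gpu_layers,  # 0 (the default) disables CUDA offload
        )
    except Exception:
        # ctransformers not installed, or model file missing/incompatible
        return None

# Hypothetical path for illustration only
llm = load_gguf_model("path/to/finetuned-model.gguf")
if llm is not None:
    print(llm("Hello"))
```

With `gpu_layers=0`, generation runs entirely on the CPU regardless of the hardware, which would explain seeing no speedup over a CPU transformers baseline.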
pradeepdev-1995 changed the title from "is ctransformers boost the inference speed in llm inference?" to "Does ctransformers boost the inference speed in llm inference?" on Feb 15, 2024.