Replies: 1 comment
**TL;DR:** Overall, the APIs are very similar to the APIs we defined. The major differences are:

**APIs:** Llama Stack defines several APIs for model evaluation, including:
**Foreseeable Efforts:** Mapping the Llama Stack APIs to the LM-Eval APIs is fairly straightforward. The extra effort, outside the scope of LM-Eval, is bringing up the model and starting the inference service that LM-Eval calls to perform the evaluation, or locating the inference endpoint of the model under evaluation.

**Flow Diagram**

```mermaid
sequenceDiagram
    CLI ->>+ Llama Stack Eval: submit an evaluation job
    Llama Stack Eval ->>+ LM-Eval-aaS: get the model and forward the request
    LM-Eval-aaS ->>- Llama Stack Eval: return the job id
    Llama Stack Eval ->>- CLI: relay the job id
    LM-Eval-aaS ->>+ Model Inference: perform the evaluation process
    CLI ->>+ Llama Stack Eval: get evaluation results
    Llama Stack Eval ->>+ LM-Eval-aaS: forward the request
    LM-Eval-aaS ->>- Llama Stack Eval: return the evaluation results wrapped as artifacts
    Llama Stack Eval ->>- CLI: return the artifacts
```
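The flow above can be sketched as a thin adapter that relays requests between the CLI and the evaluation backend. This is a minimal in-process sketch, not real Llama Stack or LM-Eval-aaS code: every class, method, model name, and endpoint below is hypothetical and exists only to illustrate the submit/relay/fetch pattern in the diagram.

```python
import uuid

class LMEvalService:
    """Stand-in for LM-Eval-aaS: accepts evaluation jobs and serves results."""

    def __init__(self):
        self.jobs = {}

    def submit(self, model_endpoint, benchmark):
        # A real service would schedule the benchmark run against the
        # model's inference endpoint; here we record it and return a job id.
        job_id = str(uuid.uuid4())
        self.jobs[job_id] = {
            "model": model_endpoint,
            "benchmark": benchmark,
            "status": "completed",
        }
        return job_id

    def results(self, job_id):
        return self.jobs[job_id]

class LlamaStackEvalAdapter:
    """Stand-in for the Llama Stack eval provider: resolves the model's
    inference endpoint and forwards requests to the LM-Eval backend."""

    def __init__(self, backend, model_registry):
        self.backend = backend
        self.registry = model_registry  # model id -> inference endpoint

    def run_eval(self, model_id, benchmark):
        endpoint = self.registry[model_id]  # "get the model" step
        return self.backend.submit(endpoint, benchmark)  # forward, relay job id

    def job_result(self, job_id):
        # Fetch backend results and wrap them as artifacts for the CLI.
        return {"artifacts": self.backend.results(job_id)}

# Hypothetical usage, mirroring the CLI's two interactions in the diagram:
registry = {"llama-3-8b": "http://inference.local/v1"}
adapter = LlamaStackEvalAdapter(LMEvalService(), registry)
job_id = adapter.run_eval("llama-3-8b", "mmlu")
print(adapter.job_result(job_id)["artifacts"]["status"])
```

The point of the sketch is the division of labor: the adapter owns model/endpoint resolution, while the LM-Eval backend owns job scheduling and results, which matches the "extra effort" noted above.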
Llama Stack defines the building blocks needed to bring generative AI applications to market. One set of these APIs covers model evaluation, which aligns with the purpose of LM-Eval-aaS in this repository. Let's use this discussion thread to explore possible synergy between the evaluation APIs of Llama Stack and LM-Eval-aaS.