RAG Evaluations in LlamaIndex

Reproducing the LlamaIndex Blog Results

| Embedding Models | Without Reranker [hit_rate/mrr] | CohereRerank [hit_rate/mrr] | bge-reranker-base [hit_rate/mrr] | bge-reranker-large [hit_rate/mrr] | bge-reranker-v2-m3 [hit_rate/mrr] | bce-reranker-base_v1 [hit_rate/mrr] |
|---|---|---|---|---|---|---|
| OpenAI-ada-2 | 88.18/64.95 | 90.45/75.29 | 91.36/75.74 | 91.36/76.72 | 90.00/76.25 | 92.27/78.33 |
| OpenAI-embed-3-small | 88.64/64.45 | 89.09/72.99 | 90.91/74.72 | 90.45/76.07 | 89.55/75.35 | 91.36/77.50 |
| OpenAI-embed-3-large | 91.82/67.98 | 93.18/77.62 | 94.55/78.76 | 95.00/80.02 | 94.09/78.11 | 95.91/81.76 |
| bge-large-en | 81.36/59.84 | 86.36/71.61 | 87.27/73.93 | 86.82/75.23 | 85.91/72.83 | 88.18/77.36 |
| bge-base-en-v1.5 | 81.36/57.43 | 88.64/73.73 | 89.55/75.23 | 88.18/74.89 | 88.18/74.23 | 89.09/76.89 |
| bge-large-en-v1.5 | 83.18/64.34 | 92.27/76.45 | 93.18/78.57 | 92.73/79.59 | 92.27/77.81 | 94.09/81.74 |
| bge-m3-large | 89.09/69.34 | 92.27/77.36 | 93.64/78.47 | 91.82/78.79 | 91.82/77.77 | 94.09/81.14 |
| llm-embedder | 75.91/54.50 | 80.91/67.70 | 81.82/70.05 | 81.36/69.86 | 82.27/68.55 | 82.73/71.38 |
| CohereV2-en | 74.09/51.30 | 80.91/68.38 | 82.73/69.86 | 82.27/69.33 | 80.91/68.94 | 83.18/72.58 |
| CohereV3-en | 81.36/58.88 | 87.73/72.08 | 88.18/75.29 | 88.64/75.28 | 88.18/73.52 | 89.09/76.82 |
| JinaAI-v2-Small-en | 80.45/57.85 | 87.73/73.28 | 88.64/73.72 | 88.64/74.39 | 88.64/73.77 | 90.00/76.98 |
| JinaAI-v2-Base-en | 85.00/61.55 | 89.55/73.64 | 90.00/75.52 | 89.09/75.75 | 89.55/74.33 | 90.91/78.18 |
| gte-large-en | 82.27/60.28 | 90.00/73.77 | 90.00/75.94 | 90.00/76.80 | 89.55/76.02 | 91.36/78.42 |
| e5-large-v2-en | 88.64/63.80 | 90.91/75.63 | 91.36/76.64 | 91.82/76.90 | 90.91/76.53 | 92.73/79.32 |
| e5-base-multilingual | 87.73/64.21 | 90.45/75.42 | 93.18/77.20 | 91.82/78.39 | 92.73/77.11 | 93.64/80.53 |
| e5-large-multilingual | 87.27/64.28 | 90.00/75.33 | 90.00/75.94 | 90.00/76.17 | 89.55/75.51 | 91.36/78.64 |
| bce-embedding-base_v1 | 91.36/71.20 | 92.73/77.65 | 95.00/79.01 | 95.00/79.95 | 94.09/79.20 | 96.36/82.20 |
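
For readers unfamiliar with the two metrics in the column headers, the following is a minimal sketch of how hit rate and MRR (mean reciprocal rank) are commonly defined for retrieval evaluation. It is not taken from this repository or from LlamaIndex; the function names `hit_rate` and `mrr` and the toy document IDs are hypothetical, and it assumes each query has exactly one expected document. The scores in the table appear to be these quantities expressed as percentages.

```python
from typing import Sequence


def hit_rate(retrieved: Sequence[Sequence[str]], expected: Sequence[str]) -> float:
    """Fraction of queries whose expected document appears in the retrieved list."""
    hits = sum(1 for docs, gold in zip(retrieved, expected) if gold in docs)
    return hits / len(expected)


def mrr(retrieved: Sequence[Sequence[str]], expected: Sequence[str]) -> float:
    """Mean of 1/rank of the expected document; a miss contributes 0."""
    total = 0.0
    for docs, gold in zip(retrieved, expected):
        ranked = list(docs)
        if gold in ranked:
            total += 1.0 / (ranked.index(gold) + 1)  # ranks are 1-based
    return total / len(expected)


# Toy data (hypothetical): four queries, top-3 retrieved IDs for each.
retrieved_ids = [
    ["a", "b", "c"],  # expected "a" at rank 1
    ["d", "e", "f"],  # expected "e" at rank 2
    ["g", "h", "i"],  # expected "i" at rank 3
    ["x", "y", "z"],  # expected "q" is missing
]
expected_ids = ["a", "e", "i", "q"]

print(f"hit_rate = {hit_rate(retrieved_ids, expected_ids):.4f}")
print(f"mrr      = {mrr(retrieved_ids, expected_ids):.4f}")
```

Under these definitions a reranker cannot change the hit rate unless it also changes which documents fall inside the final cutoff, whereas MRR rewards moving the expected document closer to the top. This is consistent with the table, where the reranker columns improve MRR more than hit rate.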