Merge pull request #996 from vespa-engine/kkraune-patch-1
typos
kkraune authored Dec 19, 2024
2 parents bfa14bc + 2b32084 commit 9d8a797
Showing 1 changed file with 9 additions and 9 deletions.
@@ -14,13 +14,13 @@
"\n",
"First, let us recap what cross-encoders are and where they might fit in a Vespa application.\n",
"\n",
"In contrast to bi-encoders, it is important to know that cross-encoders do NOT produce an embedding. Instead a cross-encoder acts on _pairs_ of input sequences and produces a single scalar score between 0 and 1 indicating the similarity or relevance between the two sentences.\n",
"In contrast to bi-encoders, it is important to know that cross-encoders do NOT produce an embedding. Instead, a cross-encoder acts on _pairs_ of input sequences and produces a single scalar score between 0 and 1, indicating the similarity or relevance between the two sentences.\n",
"\n",
"> The cross-encoder model is a transformer based model with a classification head on top of the Transformer CLS token (classification token).\n",
"> The cross-encoder model is a transformer-based model with a classification head on top of the Transformer CLS token (classification token).\n",
">\n",
"> The model has been fine-tuned using the MS Marco passage training set and is a binary classifier which classifies if a query,document pair is relevant or not.\n",
"\n",
"The quote is from [this](https://blog.vespa.ai/pretrained-transformer-language-models-for-search-part-4/) blog post from 2021 that explains cross-encoders more in depth. Note that the reference to the MS Marco dataset is for the model used in the blog post, and not the model we will use in this notebook.\n",
"The quote is from [this](https://blog.vespa.ai/pretrained-transformer-language-models-for-search-part-4/) blog post from 2021 that explains cross-encoders more in-depth. Note that the reference to the MS Marco dataset is for the model used in the blog post, and not the model we will use in this notebook.\n",
"\n",
"## Properties of cross-encoders and where they fit in Vespa\n",
"\n",
@@ -30,7 +30,7 @@
"\n",
"However, this leaderboard does not evaluate a solution's latency, and for production systems, doing cross-encoder inference for all documents in a corpus become prohibitively expensive.\n",
"\n",
"With Vespa's phased ranking capabilites, doing cross-encoder inference for a subset of documents at a later stage in the ranking pipeline can be a good trade-off between ranking performance and latency.\n",
"With Vespa's phased ranking capabilities, doing cross-encoder inference for a subset of documents at a later stage in the ranking pipeline can be a good trade-off between ranking performance and latency.\n",
"For the remainder of this notebook, we will look at using a cross-encoder in _global-phase reranking_, introduced in [this](https://blog.vespa.ai/improving-llm-context-ranking-with-cross-encoders/) blog post.\n",
"\n",
"![improving-llm-context-ranking-with-cross-encoders](https://blog.vespa.ai/assets/2023-05-08-improving-llm-context-ranking-with-cross-encoders/image1.png)\n",
@@ -52,7 +52,7 @@
"\n",
"For this demo, we will use [mixedbread-ai/mxbai-rerank-xsmall-v1](https://huggingface.co/mixedbread-ai/mxbai-rerank-xsmall-v1), but you can experiment with the larger models, depending on how you want to balance speed, accuracy, and cost (if you want to use GPU).\n",
"\n",
"This model is really powerful despite its small size, and provide a good trade-off between speed and accuracy.\n",
"This model is really powerful despite its small size, and provides a good trade-off between speed and accuracy.\n",
"\n",
"Table of accuracy on a [BEIR](http://beir.ai) (11 datasets):\n",
"\n",
@@ -68,7 +68,7 @@
"\n",
"(Table from mixedbread.ai's introductory [blog post](https://www.mixedbread.ai/blog/mxbai-rerank-v1).)\n",
"\n",
"As we can see, the `mxbai-rerank-xsmall-v1` model is almost on par with much larger models, while being much faster and cheaper to run.\n"
"As we can see, the `mxbai-rerank-xsmall-v1` model is almost on par with much larger models while being much faster and cheaper to run.\n"
]
},
{
@@ -116,9 +116,9 @@
"\n",
"It is useful to inspect the expected inputs and outputs, along with their shapes, before integrating the model into Vespa.\n",
"\n",
"This can either be done by for instance by using the `sentence_transformers` and `onnxruntime` libraries.\n",
"This can either be done by, for instance, by using the `sentence_transformers` and `onnxruntime` libraries.\n",
"\n",
"One-off tasks like this are well suited for a Colab notebook, one example on how to do this in colab can be found here:\n",
"One-off tasks like this are well suited for a Colab notebook. One example of how to do this in Colab can be found here:\n",
"\n",
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1DfubYQNyBWzBpgyGUBPN5gaIuSJZ28wy)\n",
"\n",
@@ -648,4 +648,4 @@
},
"nbformat": 4,
"nbformat_minor": 4
}
