From 2b32084bec4fd6d47e04677d8d69a71908c20c71 Mon Sep 17 00:00:00 2001 From: Kristian Aune Date: Thu, 19 Dec 2024 13:13:33 +0100 Subject: [PATCH] typos --- .../cross-encoders-for-global-reranking.ipynb | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/docs/sphinx/source/examples/cross-encoders-for-global-reranking.ipynb b/docs/sphinx/source/examples/cross-encoders-for-global-reranking.ipynb index 1ffcc036..6bbf0383 100644 --- a/docs/sphinx/source/examples/cross-encoders-for-global-reranking.ipynb +++ b/docs/sphinx/source/examples/cross-encoders-for-global-reranking.ipynb @@ -14,13 +14,13 @@ "\n", "First, let us recap what cross-encoders are and where they might fit in a Vespa application.\n", "\n", - "In contrast to bi-encoders, it is important to know that cross-encoders do NOT produce an embedding. Instead a cross-encoder acts on _pairs_ of input sequences and produces a single scalar score between 0 and 1 indicating the similarity or relevance between the two sentences.\n", + "In contrast to bi-encoders, it is important to know that cross-encoders do NOT produce an embedding. Instead, a cross-encoder acts on _pairs_ of input sequences and produces a single scalar score between 0 and 1, indicating the similarity or relevance between the two sentences.\n", "\n", - "> The cross-encoder model is a transformer based model with a classification head on top of the Transformer CLS token (classification token).\n", + "> The cross-encoder model is a transformer-based model with a classification head on top of the Transformer CLS token (classification token).\n", ">\n", "> The model has been fine-tuned using the MS Marco passage training set and is a binary classifier which classifies if a query,document pair is relevant or not.\n", "\n", - "The quote is from [this](https://blog.vespa.ai/pretrained-transformer-language-models-for-search-part-4/) blog post from 2021 that explains cross-encoders more in depth. 
Note that the reference to the MS Marco dataset is for the model used in the blog post, and not the model we will use in this notebook.\n", +    "The quote is from [this](https://blog.vespa.ai/pretrained-transformer-language-models-for-search-part-4/) blog post from 2021 that explains cross-encoders more in depth. Note that the reference to the MS Marco dataset is for the model used in the blog post, and not the model we will use in this notebook.\n", "\n", "## Properties of cross-encoders and where they fit in Vespa\n", "\n", @@ -30,7 +30,7 @@ "\n", "However, this leaderboard does not evaluate a solution's latency, and for production systems, doing cross-encoder inference for all documents in a corpus become prohibitively expensive.\n", "\n", -    "With Vespa's phased ranking capabilites, doing cross-encoder inference for a subset of documents at a later stage in the ranking pipeline can be a good trade-off between ranking performance and latency.\n", +    "With Vespa's phased ranking capabilities, doing cross-encoder inference for a subset of documents at a later stage in the ranking pipeline can be a good trade-off between ranking performance and latency.\n", "For the remainder of this notebook, we will look at using a cross-encoder in _global-phase reranking_, introduced in [this](https://blog.vespa.ai/improving-llm-context-ranking-with-cross-encoders/) blog post.\n", "\n", "![improving-llm-context-ranking-with-cross-encoders](https://blog.vespa.ai/assets/2023-05-08-improving-llm-context-ranking-with-cross-encoders/image1.png)\n", "\n", @@ -52,7 +52,7 @@ "\n", "For this demo, we will use [mixedbread-ai/mxbai-rerank-xsmall-v1](https://huggingface.co/mixedbread-ai/mxbai-rerank-xsmall-v1), but you can experiment with the larger models, depending on how you want to balance speed, accuracy, and cost (if you want to use GPU).\n", "\n", -    "This model is really powerful despite its small size, and provide a good trade-off between speed and accuracy.\n", +    "This model is really 
powerful despite its small size, and provides a good trade-off between speed and accuracy.\n", "\n", "Table of accuracy on a [BEIR](http://beir.ai) (11 datasets):\n", "\n", @@ -68,7 +68,7 @@ "\n", "(Table from mixedbread.ai's introductory [blog post](https://www.mixedbread.ai/blog/mxbai-rerank-v1).)\n", "\n", -    "As we can see, the `mxbai-rerank-xsmall-v1` model is almost on par with much larger models, while being much faster and cheaper to run.\n" +    "As we can see, the `mxbai-rerank-xsmall-v1` model is almost on par with much larger models while being much faster and cheaper to run.\n" ] }, { @@ -116,9 +116,9 @@ "\n", "It is useful to inspect the expected inputs and outputs, along with their shapes, before integrating the model into Vespa.\n", "\n", -    "This can either be done by for instance by using the `sentence_transformers` and `onnxruntime` libraries.\n", +    "This can be done by, for instance, using the `sentence_transformers` and `onnxruntime` libraries.\n", "\n", -    "One-off tasks like this are well suited for a Colab notebook, one example on how to do this in colab can be found here:\n", +    "One-off tasks like this are well suited for a Colab notebook. One example of how to do this in Colab can be found here:\n", "\n", "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1DfubYQNyBWzBpgyGUBPN5gaIuSJZ28wy)\n", "\n", @@ -648,4 +648,4 @@ }, "nbformat": 4, "nbformat_minor": 4 -} \ No newline at end of file +}