
components Retrieval Augmented Generation documentation

github-actions[bot] edited this page Sep 27, 2023 · 12 revisions

Retrieval Augmented Generation

Components in this category


LLMs have token limits for the prompts passed to them. This is a limiting factor at embedding time and even more limiting at prompt completion time as only so much context ca...
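As a rough illustration of why token limits force documents to be split, the sketch below breaks text into chunks that each stay under a token budget. It approximates tokens by whitespace-separated words, which is an assumption; a real tokenizer will count differently. The name `chunk_by_tokens` and the `max_tokens` parameter are hypothetical and do not belong to any component in this category.

```python
def chunk_by_tokens(text: str, max_tokens: int) -> list[str]:
    """Split text into chunks of at most max_tokens whitespace-separated words.

    Words stand in for tokens here (an assumption); model tokenizers
    usually produce more tokens than there are words.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    # Step through the word list in windows of max_tokens words.
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


chunks = chunk_by_tokens("one two three four five six seven", max_tokens=3)
```

Each resulting chunk can then be embedded or placed in a prompt on its own, so no single request exceeds the budget.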


  • llm_rag_create_faiss_index

    Creates a FAISS index from embeddings. The index will be saved to the output folder. The index will be registered as a Data Asset named asset_name if register_output is set to True.

  • llm_rag_create_promptflow

    This component is used to create a RAG flow based on your mlindex data and best prompts. The flow will look into your indexed data and give answers based on your own data context. The flow also provides the capability to bulk test with any built-in or custom evaluation flows.

  • llm_rag_data_import_acs

    Collects documents from Azure Cognitive Search Index, extracts their contents, saves them to a uri folder, and creates an MLIndex yaml file to represent the search index.

    Documents collected can then be used in other components without having to query the ACS index again, allowing for a consiste...

chunks_source is expected to contain CSV files with two columns:

  • "Chunk" - Chunk of text to be embedded
  • "Metadata" - JSON object containing metadata for the chunk
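The two-column layout described above can be sketched with the Python standard library. The column names "Chunk" and "Metadata" come from the description; the helper names `write_chunks_csv` and `read_chunks_csv`, the sample text, and the `source` metadata key are hypothetical and only serve the illustration.

```python
import csv
import io
import json


def write_chunks_csv(rows, fileobj):
    """Write (chunk_text, metadata_dict) pairs as a two-column CSV."""
    writer = csv.DictWriter(fileobj, fieldnames=["Chunk", "Metadata"])
    writer.writeheader()
    for text, metadata in rows:
        # Metadata is stored as a JSON object serialized into one CSV cell.
        writer.writerow({"Chunk": text, "Metadata": json.dumps(metadata)})


def read_chunks_csv(fileobj):
    """Read the CSV back into (chunk_text, metadata_dict) pairs."""
    reader = csv.DictReader(fileobj)
    return [(row["Chunk"], json.loads(row["Metadata"])) for row in reader]


# Round trip through an in-memory buffer instead of a real file.
buffer = io.StringIO()
write_chunks_csv(
    [("FAISS is a similarity search library.", {"source": "example.md"})],
    buffer,
)
buffer.seek(0)
round_tripped = read_chunks_csv(buffer)
```

Serializing the metadata with `json.dumps` lets the CSV writer handle quoting, so commas and quotes inside the JSON do not break the column layout.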

If embeddings_container is supplied, input c...

chunks_source is expected to contain CSV files with two columns:

  • "Chunk" - Chunk of text to be embedded
  • "Metadata" - JSON object containing metadata for the chunk

If previous_embeddings is supplied, input ch...

A chunk of text is read from each input document and sent to the specified LLM with a prompt to create a question and answer based on that text. These question, answer, and context sets are saved as either a CSV or j...
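A minimal sketch of that loop follows, assuming the model call can be represented by any callable that takes a prompt and returns a question and answer pair. The names `generate_qa_rows` and `fake_llm`, the prompt wording, and the output column names are all hypothetical; the component's actual prompt and schema are not shown in this page.

```python
import csv
import io
from typing import Callable, Iterable


def generate_qa_rows(
    chunks: Iterable[str],
    ask_llm: Callable[[str], tuple[str, str]],
) -> list[dict]:
    """Build one question/answer/context record per text chunk."""
    rows = []
    for context in chunks:
        # Hypothetical prompt wording; the chunk is placed on the last line.
        prompt = (
            "Write one question and its answer based only on this text:\n"
            + context
        )
        question, answer = ask_llm(prompt)
        rows.append({"question": question, "answer": answer, "context": context})
    return rows


def fake_llm(prompt: str) -> tuple[str, str]:
    """Stand-in for a real model call: echoes the chunk as the answer."""
    return ("What is this text about?", prompt.splitlines()[-1])


rows = generate_qa_rows(["Chunk A"], fake_llm)

# Save the records as CSV (written to memory here rather than to disk).
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["question", "answer", "context"])
writer.writeheader()
writer.writerows(rows)
```

Passing the model as a callable keeps the loop testable without a deployed endpoint; a real client would replace `fake_llm`.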

The Index will have the following fields populated:

  • "id", String, key=True

  • "content", String,

  • "content_vector_(open_ai|hugging_face)", Collection(Single)

  • "c...

  • llm_rag_validate_deployments

    Validates that the completion model, embedding model, and Azure Cognitive Search resource deployments are successful and that the connections work. For default AOAI, it attempts to create the deployments if they are not valid or not present. This validation is done only if the customer is using Azure OpenAI models or creatin...
