From dec64269ee16696ebd0903fa4f03e1c2cd596057 Mon Sep 17 00:00:00 2001
From: Michael Hunger
Date: Mon, 15 Jul 2024 11:51:15 +0200
Subject: [PATCH] Updates for features (graph-viz) and deployment (fixes code listing)

---
 .../pages/llm-graph-builder-deployment.adoc | 56 ++++++++++++-------
 .../pages/llm-graph-builder-features.adoc   | 38 +++++++++++--
 2 files changed, 68 insertions(+), 26 deletions(-)

diff --git a/modules/genai-ecosystem/pages/llm-graph-builder-deployment.adoc b/modules/genai-ecosystem/pages/llm-graph-builder-deployment.adoc
index dea4a5e..7553f07 100644
--- a/modules/genai-ecosystem/pages/llm-graph-builder-deployment.adoc
+++ b/modules/genai-ecosystem/pages/llm-graph-builder-deployment.adoc
@@ -22,27 +22,35 @@ If want to use https://neo4j.com/product/developer-tools/#desktop[Neo4j Desktop^
 By default only OpenAI and Diffbot are enabled since Gemini requires extra GCP configurations.
 
 In your root folder, create a .env file with your OPENAI and DIFFBOT keys (if you want to use both):
-```env
+
+[source,env]
+----
 OPENAI_API_KEY="your-openai-key"
 DIFFBOT_API_KEY="your-diffbot-key"
-```
+----
 
 if you only want OpenAI:
-```env
+
+[source,env]
+----
 LLM_MODELS="gpt-3.5,gpt-4o"
 OPENAI_API_KEY="your-openai-key"
-```
+----
 
 if you only want Diffbot:
-```env
+
+[source,env]
+----
 LLM_MODELS="diffbot"
 DIFFBOT_API_KEY="your-diffbot-key"
-```
+----
 
 You can then run Docker Compose to build and start all components:
-```bash
+
+[source,bash]
+----
 docker-compose up --build
-```
+----
 
 === Configuring LLM Models
 
@@ -61,7 +69,7 @@ To achieve that you need to set a number of environment variables:
 In your `.env` file, add the following lines. You can of course also add other model configurations from these providers or any OpenAI API compatible provider.
 
 [source,env]
-====
+----
 LLM_MODEL_CONFIG_azure_ai_gpt_35="gpt-35,https://.openai.azure.com/,,"
 LLM_MODEL_CONFIG_anthropic_claude_35_sonnet="claude-3-5-sonnet-20240620,"
 LLM_MODEL_CONFIG_fireworks_llama_v3_70b="accounts/fireworks/models/llama-v3-70b-instruct,"
@@ -71,12 +79,12 @@ LLM_MODEL_CONFIG_fireworks_qwen_72b="accounts/fireworks/models/qwen2-72b-instruc
 
 # Optional Frontend config
 LLM_MODELS="diffbot,gpt-3.5,gpt-4o,azure_ai_gpt_35,azure_ai_gpt_4o,groq_llama3_70b,anthropic_claude_35_sonnet,fireworks_llama_v3_70b,bedrock_claude_35_sonnet,ollama_llama3,fireworks_qwen_72b"
-====
+----
 
 In your `docker-compose.yml` you need to pass the variables through:
 
 [source,yaml]
-====
+----
 - LLM_MODEL_CONFIG_anthropic_claude_35_sonnet=${LLM_MODEL_CONFIG_anthropic_claude_35_sonnet-}
 - LLM_MODEL_CONFIG_fireworks_llama_v3_70b=${LLM_MODEL_CONFIG_fireworks_llama_v3_70b-}
@@ -85,21 +93,25 @@ In your `docker-compose.yml` you need to pass the variables through:
 - LLM_MODEL_CONFIG_azure_ai_gpt_4o=${LLM_MODEL_CONFIG_azure_ai_gpt_4o-}
 - LLM_MODEL_CONFIG_groq_llama3_70b=${LLM_MODEL_CONFIG_groq_llama3_70b-}
 - LLM_MODEL_CONFIG_azure_ai_gpt_35=${LLM_MODEL_CONFIG_azure_ai_gpt_35-}
 - LLM_MODEL_CONFIG_bedrock_claude_3_5_sonnet=${LLM_MODEL_CONFIG_bedrock_claude_3_5_sonnet-}
 - LLM_MODEL_CONFIG_fireworks_qwen_72b=${LLM_MODEL_CONFIG_fireworks_qwen_72b-}
 - LLM_MODEL_CONFIG_ollama_llama3=${LLM_MODEL_CONFIG_ollama_llama3-}
-====
+----
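+
+For example, the `LLM_MODEL_CONFIG_ollama_llama3` entry passed through above could point to a locally running Ollama instance. This is only a sketch that assumes the same comma-separated value pattern as the entries above; the model name and endpoint depend on your local Ollama setup:
+
+[source,env]
+----
+# hypothetical values, adjust model name and URL to your Ollama installation
+LLM_MODEL_CONFIG_ollama_llama3="llama3,http://localhost:11434"
+----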
 
 === Additional configs
 
 By default, the input sources will be: Local files, Youtube, Wikipedia and AWS S3.
 This is the default config applied if you do not overwrite it in your .env file:
-```env
+
+[source,env]
+----
 REACT_APP_SOURCES="local,youtube,wiki,s3"
-```
+----
 
 If however you want the Google GCS integration, add `gcs` and your Google client ID:
-```env
+
+[source,env]
+----
 REACT_APP_SOURCES="local,youtube,wiki,s3,gcs"
 GOOGLE_CLIENT_ID="xxxx"
-```
+----
 
 The `REACT_APP_SOURCES` should be a comma-separated list of the sources you want to enable.
 You can of course combine all (local, youtube, wikipedia, s3 and gcs) or remove any you don't want or need.
@@ -113,23 +125,27 @@ Alternatively, you can run the backend and frontend separately:
 - For the frontend:
 1. Create the frontend/.env file by copy/pasting the frontend/example.env.
 2. Change values as needed
 3. Run:
-```bash
+
+[source,bash]
+----
 cd frontend
 yarn
 yarn run dev
-```
+----
 
 - For the backend:
 1. Create the backend/.env file by copy/pasting the backend/example.env.
 2. Change values as needed
 3. Run:
-```bash
+
+[source,bash]
+----
 cd backend
 python -m venv envName
 source envName/bin/activate
 pip install -r requirements.txt
 uvicorn score:app --reload
-```
+----
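+
+Once uvicorn is running, a quick sanity check is to fetch the interactive API docs (assuming `score:app` is a FastAPI app, it serves them at `/docs` by default). This sketch also assumes uvicorn's default bind of port 8000 on localhost:
+
+[source,bash]
+----
+# assumes the default uvicorn address; adjust host/port if you changed them
+curl http://localhost:8000/docs
+----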
 
 == ENV
 
diff --git a/modules/genai-ecosystem/pages/llm-graph-builder-features.adoc b/modules/genai-ecosystem/pages/llm-graph-builder-features.adoc
index 644bebc..3ab4188 100644
--- a/modules/genai-ecosystem/pages/llm-graph-builder-features.adoc
+++ b/modules/genai-ecosystem/pages/llm-graph-builder-features.adoc
@@ -71,11 +71,11 @@ image::llm-graph-builder-taxonomy.png[width=600, align=center]
 
 If you want to use a pre-defined or your own graph schema, you can do so in the Graph Enhancements popup.
 This is also shown the first time you construct a graph and the state of the model configuration is listed below the connection information.
 
- You can either:
- * select a pre-defined schema from the dropdown on top,
- * use your own by entering the node labels and relationships,
- * fetch the existing schema from an existing Neo4j database (`Use Existing Schema`),
- * or copy/paste a text or schema description (also works with RDF ontologies or Cypher/GQL schema) and ask the LLM to analyze it and come up with a suggested schema (`Get Schema From Text`).
+You can either:
+* select a pre-defined schema from the dropdown on top,
+* use your own by entering the node labels and relationships,
+* fetch the existing schema from an existing Neo4j database (`Use Existing Schema`),
+* or copy/paste a text or schema description (also works with RDF ontologies or Cypher/GQL schema) and ask the LLM to analyze it and come up with a suggested schema (`Get Schema From Text`).
 
 === Delete Disconnected Nodes
@@ -94,6 +94,31 @@ Here we use a mixture of entity embedding, edit distance and substring containme
 You can select which sets of entities should be merged and exclude certain entities from the merge.
 ////
 
+== Visualization
+
+You can visualize the _lexical_ graph, the _entity_ graph, or the full _knowledge_ graph of your extracted documents.
+
+=== Graph Visualization
+
+You have two options: either per document, with the magnifying glass icon at the end of the table, or for all selected documents, with the "Preview Graph" button.
+
+The graph visualization for the relevant files is shown in a pop-up, where you can filter for the type of graph you want to see:
+
+- Lexical Graph - the document and chunk nodes and their relationships
+- Entity Graph - the entity nodes and their relationships
+- Full Graph - all nodes and relationships
+
+=== Explore in Neo4j Bloom
+
+With the "Explore in Neo4j Bloom" button you can open the constructed knowledge graph in Neo4j Workspace for further visual exploration, querying, and analysis.
+
+In Bloom/Explore you can run low-code, pattern-based queries (or use the co-pilot) to fetch data from the graph and successfully expand it. If you are running against a GDS-enabled instance, you can also run graph algorithms and visualize the results.
+You can also interactively edit and add to the graph.
+
+In Neo4j Data Importer you can additionally import structured data from CSV files and connect it to your extracted knowledge graph.
+
+In Neo4j Query you can write Cypher queries (or use the co-pilot) to pull tabular and graph data from your database; a hypothetical command-line sketch of such a query follows at the end of this page.
+
 == Chatbot
 
 === How it works
@@ -105,9 +130,10 @@ We also summarize the chat history and use it as an element to enrich the contex
 
 === Features
 
 - *Select RAG Mode* you can select vector-only or GraphRAG (vector+graph) modes
+- *Chat with selected documents:* Will use only the selected documents for RAG, using pre-filtering to achieve that.
+- *Details:* Will open a Retrieval information pop-up showing details on how the RAG agent collected and used sources (documents), chunks, and entities. Also provides information on the model used and the token consumption.
 - *Clear chat:* Will delete the current session's chat history.
 - *Expand view:* Will open the chatbot interface in a fullscreen mode.
-- *Details:* Will open a Retrieval information pop-up showing details on how the RAG agent collected and used sources (documents), chunks, and entities. Also provides information on the model used and the token consumption.
 - *Copy:* Will copy the content of the response to the clipboard.
 - *Text-To-Speech:* Will read out loud the content of the response.
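+
+As a hypothetical sketch of the Cypher queries mentioned in the Neo4j Query section above, you can also inspect the extracted graph from the command line with `cypher-shell`. The connection values and the `Document` label are illustrative assumptions; adjust them to your instance and the schema you configured:
+
+[source,bash]
+----
+# hypothetical connection values, replace address, user, and password with your own
+cypher-shell -a neo4j://localhost:7687 -u neo4j -p "your-password" \
+  "MATCH (d:Document) RETURN count(d) AS documents"
+----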