You can access the full project documentation at Gitbook Link!
- Access the Dataset
- Creating Traditional Vector Embeddings
- Embeddings Visualization in 3D
- Generating Knowledge Graphs
- PyKeen Knowledge Graph Embedding Training
- Storing Embeddings in FAISS index
- Running the KG visualiser web-app
- RAG_VLM
.
├── 1_Traditional_Vector_Embeddings # Traditional text and image embeddings using Word2Vec and CLIP
├── 2_Knowledge_Graphs # Code and resources for generating Knowledge Graphs and extracting triplets
├── 3_KG_Embeddings # Knowledge Graph Embeddings (KGE) training using PyKeen and dimensionality reduction
├── 4_Deployment_dev # Scripts for deploying and testing embedding models
├── 6_FAISS_embeddings # FAISS-based search for efficient embedding retrieval and comparisons
└── README.md # Project documentation
Follow the directories below to find the source code and the assets for the image and text datasets.
📂 /src: Core code for training Knowledge Graph Embeddings (KGE) using PyKeen, including scripts, configs, and data utilities.
📂 /assets: Contains embedding results, visualizations, and key outputs from the models.
📑 /notebooks: Jupyter notebooks for visualizing and comparing traditional and Knowledge Graph Embeddings (KGE).
- The 1k-sample reduced subset of the COYO700M dataset can be found Here
Four methods were used to create text embeddings, and one CLIP notebook is provided for image embeddings.
Step | Description |
---|---|
1 | Open the eg.CLIP_Embeddings.ipynb notebook. |
2 | Run all the cells to load the CLIP model and generate embeddings. |
3 | Follow the instructions in the notebook to input your data and obtain embeddings. |
- Python 3.x
- Required libraries (listed in requirements.txt)
- Clone the repository.
git clone https://github.com/dsgiitr/kge-clip.git
cd kge-clip/1_Traditional_Vector_Embeddings
- Install the required libraries using pip install -r requirements.txt.
- Open the Jupyter notebooks and follow the instructions.
To visualize text and image embeddings, use the following notebooks:
Each embedding and cluster will be saved in metadata.tsv.
To launch TensorBoard, use:
%tensorboard --logdir /path/to/logs/embedding
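As a rough illustration of the metadata file mentioned above, the sketch below writes embeddings and their labels as tab-separated files, the layout the TensorBoard Embedding Projector can load. The function name `write_projector_files`, the `vectors.tsv` file name, the column names, and the toy vectors are assumptions for illustration, not the notebooks' actual code.

```python
import csv
import os
import tempfile


def write_projector_files(vectors, labels, clusters, out_dir):
    """Write vectors.tsv and metadata.tsv as tab-separated files."""
    os.makedirs(out_dir, exist_ok=True)
    vec_path = os.path.join(out_dir, "vectors.tsv")
    meta_path = os.path.join(out_dir, "metadata.tsv")

    # One embedding per row, no header
    with open(vec_path, "w", newline="") as f:
        csv.writer(f, delimiter="\t").writerows(vectors)

    # A header row is needed when the metadata has more than one column
    with open(meta_path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["label", "cluster"])
        writer.writerows(zip(labels, clusters))

    return vec_path, meta_path


# Toy data (made up): two 3D embeddings with a caption and a cluster id each
out_dir = tempfile.mkdtemp()
vec_path, meta_path = write_projector_files(
    vectors=[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    labels=["a dog on a beach", "a red car"],
    clusters=[0, 1],
    out_dir=out_dir,
)
```

Row i of the metadata file describes row i of the vectors file, so the two must be written in the same order.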
Knowledge graphs for both {text:image} pairs were generated using the following steps:
- Triplet Extraction
  Run the Rebel_extraction.ipynb notebook to extract triplets using the Babelscape REBEL-large model. You can find the notebook here.
- Knowledge Graph Generation and Visualization
  Use the KG.ipynb notebook to generate knowledge graphs and visualize them using Neo4J, NetworkX, and Plotly. Access the notebook here.
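To make the triplet extraction step more concrete, here is a minimal sketch of turning REBEL's decoded text into head/type/tail dictionaries. It assumes the decoded output follows the linearized form `<triplet> head <subj> tail <obj> relation`, as described on the REBEL model card; the function name `extract_triplets` and the sample string are illustrative and may differ from what the notebook does.

```python
def extract_triplets(text):
    """Parse REBEL-style linearized output into head/type/tail dicts."""
    triplets = []
    head, tail, rel = [], [], []
    current = None  # the token list currently being filled

    def flush():
        # Store a triplet only when all three parts are present
        if head and tail and rel:
            triplets.append(
                {"head": " ".join(head), "type": " ".join(rel), "tail": " ".join(tail)}
            )

    # Drop the sequence markers added by the tokenizer
    for marker in ("<s>", "<pad>", "</s>"):
        text = text.replace(marker, " ")

    for token in text.split():
        if token == "<triplet>":   # a new head entity starts
            flush()
            head, tail, rel = [], [], []
            current = head
        elif token == "<subj>":    # a tail entity for the current head starts
            flush()
            tail, rel = [], []
            current = tail
        elif token == "<obj>":     # the relation label starts
            rel = []
            current = rel
        elif current is not None:
            current.append(token)

    flush()
    return triplets


# Made-up sample in the assumed output format
sample = "<s><triplet> Punta Cana <subj> Dominican Republic <obj> country</s>"
parsed = extract_triplets(sample)
```

The resulting dictionaries use the same `head`, `type`, and `tail` keys that the Neo4J loading snippet in this README reads from each row.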
To run a local Neo4J instance and visualize the knowledge graph:
- Install Neo4J
  Download and install Neo4J from the official site.
- Start Neo4J
  After setting up an account, run the following code snippet to connect to your Neo4J database and load the triplets.
from neo4j import GraphDatabase

# Connect to Neo4j. Do not commit real credentials to the repository.
uri = "neo4j+s://<your-instance-id>.databases.neo4j.io"  # Replace with your Neo4j instance URI
username = "neo4j"
password = "<your-password>"  # Replace with your Neo4j password
driver = GraphDatabase.driver(uri, auth=(username, password))

def create_nodes_and_relationships(tx, head, type_, tail):
    # MERGE creates each node and relationship only if it does not exist yet
    query = (
        "MERGE (a:head {name: $head}) "
        "MERGE (b:tail {name: $tail}) "
        "MERGE (a)-[r:Relation {type: $type}]->(b)"
    )
    tx.run(query, head=head, type=type_, tail=tail)

# triplets must be a list of dicts with 'head', 'type', and 'tail' keys
# df_rebel_text = df_rebel['triplet'].tolist()
# Open a session and add data
with driver.session() as session:
    for row in triplets:
        # write_transaction is deprecated in favor of execute_write in driver 5.x
        session.write_transaction(create_nodes_and_relationships, row['head'], row['type'], row['tail'])
print("Knowledge graph created successfully!")
driver.close()
- Run the following Cypher query on the Neo4J database instance:
MATCH (n)-[r]->(m)
RETURN n, r, m
The PyKeen model is trained on text and image KG triplets extracted using Babelscape REBEL-large.
- Access the text KGE notebook: pykeen_KGE_text.ipynb
- Access the image KGE notebook: pykeen_KGE_Image.ipynb
from pykeen.pipeline import pipeline

# training_triples_factory and testing_triples_factory are PyKeen TriplesFactory
# objects that must be built from the extracted triplets before this call
result = pipeline(
    model='TransE',  # Choose a graph embedding technique
    loss="softplus",
    training=training_triples_factory,
    testing=testing_triples_factory,
    model_kwargs=dict(embedding_dim=3),  # Set embedding dimensions
    optimizer_kwargs=dict(lr=0.1),  # Set learning rate
    training_kwargs=dict(num_epochs=100, use_tqdm_batch=False),  # Set number of epochs
)
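For intuition about what the TransE model selected above learns, the sketch below scores a triplet as the negative distance between `head + relation` and `tail`, so plausible triplets score closer to zero. It is a hand-written illustration rather than PyKeen's implementation: the L2 norm is an assumption (the norm is configurable in practice), and the 3D vectors are hand-picked toy values.

```python
import math


def transe_score(head, relation, tail):
    """Return the negative L2 distance between (head + relation) and tail."""
    diff = [h + r - t for h, r, t in zip(head, relation, tail)]
    return -math.sqrt(sum(d * d for d in diff))


# Toy 3D embeddings (made up), matching embedding_dim=3
paris = [1.0, 0.0, 0.0]
berlin = [0.0, 0.0, 2.0]
capital_of = [0.0, 1.0, 0.0]
france = [1.0, 1.0, 0.0]

good = transe_score(paris, capital_of, france)   # head + relation lands on tail
bad = transe_score(berlin, capital_of, france)   # head + relation misses tail
```

Training adjusts the entity and relation vectors so that triplets from the knowledge graph score higher than corrupted ones.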
The trained KGEs for both text and image are further reduced to 3D space using PCA, UMAP, and t-SNE.
The resulting embeddings and media can be found in the assets folder here
A FAISS index was used to store the {text:image} vector and Knowledge Graph embeddings for later use with RAG-LLMs.
Access the FAISS index notebook here. Set the dimension to match the embedding size that the downstream LLM expects.
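import faiss

dimension = 512  # must match the size of each embedding vector
index = faiss.IndexFlatL2(dimension)  # exact search using L2 distance
# FAISS expects float32 arrays of shape (n, dimension)
index.add(embeddings_img_array)  # add the image embeddings
index.add(embeddings_text_array)  # add the text embeddings; their ids continue after the image ids
faiss.write_index(index, 'faiss_traditional_vector_embedding.index')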
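To show what a flat L2 index computes, here is a pure-Python sketch of exact nearest-neighbour search over squared L2 distances, closest first. It is not FAISS code; the function name `l2_search` and the 2D vectors are made up. It also illustrates that when image and text vectors share one index, the text ids start where the image ids end.

```python
def l2_search(stored, query, k):
    """Return the k nearest stored vectors as (id, squared L2 distance), closest first."""
    scored = [
        (idx, sum((a - b) ** 2 for a, b in zip(vec, query)))
        for idx, vec in enumerate(stored)
    ]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Toy stand-ins for the real 512-dimensional embeddings
image_vectors = [[0.0, 0.0], [1.0, 1.0]]   # ids 0 and 1
text_vectors = [[0.0, 1.0]]                # id 2 = len(image_vectors) + 0
stored = image_vectors + text_vectors

hits = l2_search(stored, query=[0.0, 0.9], k=2)
nearest_id = hits[0][0]
is_text = nearest_id >= len(image_vectors)  # map the id back to its modality
```

Keeping separate indices per modality avoids the id bookkeeping, at the cost of running two searches.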
This repository contains a Flask-based web app that supports:
- Text-Based Knowledge Graph Generation
- Image-Based Knowledge Graph Generation
- Text & Image Vector Embedding and Knowledge Graph Embedding with TensorBoard
The app utilizes Python libraries, the REBEL model, and Graphviz for advanced graph visualization.
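As a small illustration of how extracted triplets can be handed to Graphviz, the sketch below turns head/type/tail dictionaries into DOT source text. The function name `triplets_to_dot` and the sample triplet are hypothetical; the app's own rendering code may differ.

```python
def triplets_to_dot(triplets, graph_name="KG"):
    """Build Graphviz DOT source with one labelled edge per triplet."""

    def quote(text):
        # Escape backslashes and double quotes so the DOT string stays valid
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    lines = [f"digraph {graph_name} {{"]
    for t in triplets:
        lines.append(
            f"  {quote(t['head'])} -> {quote(t['tail'])} [label={quote(t['type'])}];"
        )
    lines.append("}")
    return "\n".join(lines)


# Made-up triplet in the head/type/tail format used throughout this README
dot_source = triplets_to_dot(
    [{"head": "Punta Cana", "type": "country", "tail": "Dominican Republic"}]
)
```

The returned text can be saved to a `.dot` file and rendered with the Graphviz `dot` command-line tool.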
Follow these steps to set up and run the web app.
Prerequisites
Ensure your environment meets the following requirements:
- Python 3.7 or higher
- pip (Python package installer)
- Graphviz for advanced graph visualization
Installation
- Clone the Repository
Fork the project and clone it to your local machine:
git clone https://github.com/dsgiitr/kge-clip.git
cd kge-clip/4_Deployment_dev
Set Up and Run the Flask App
Create and activate a virtual environment to manage dependencies:
- On Windows:
python -m venv venv
venv\Scripts\activate
- On macOS/Linux:
python3 -m venv venv
source venv/bin/activate
Install Dependencies
Install the required Python packages:
pip install flask transformers torch pandas networkx matplotlib plotly graphviz
Running the Flask App
Activate the virtual environment and tell Flask which file to run:
- On Windows:
venv\Scripts\activate
set FLASK_APP=app.py
- On macOS/Linux:
source venv/bin/activate
export FLASK_APP=app.py
Run the Flask app with:
flask run
Open your web browser and navigate to http://127.0.0.1:5000/ to start using the app.
This module demonstrates how FAISS-based Knowledge Graph Embeddings (KGE) and Traditional Vector Embeddings (TVE) are utilized in conjunction with a Vision-Language Model (VLM) for image inference. The VLM (LLaVA) leverages CLIP embeddings for processing the test image.
- CLIP Embeddings: CLIP provides a shared latent space for images and text, enabling multimodal embeddings that are used for cross-modal retrieval.
- FAISS Index: Both the KGE (Knowledge Graph Embeddings) and TVE (Traditional Vector Embeddings) are stored in FAISS, facilitating fast similarity searches.
- VLM (LLaVA): This model was utilized to generate text descriptions from images, and the embeddings generated by the CLIP processor are used for retrieving the most similar images from FAISS indices.
- Image Captioning with VLM (LLaVA):
  - The VLM generated the following caption for the test image: ['A young girl is smiling and showing her teeth', 'She is wearing a colorful shirt and a brown scarf'].
- CLIP Embeddings Generation:
  - The CLIP processor was used to create image embeddings for the test image.
- FAISS Index Loading:
  - Loaded the FAISS KGE (Knowledge Graph Embeddings) and TVE (Traditional Vector Embeddings) indices, trained on PyKeen with REBEL triplets and image embeddings.
- Similarity Search:
  - A similarity search was performed on the test image embedding across both FAISS indices (KGE and TVE).
- Ranking of Similar Images:
  - The top-ranked images were retrieved based on the highest similarity scores in both FAISS indices.
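image_path = ["/content/RAG_test_image.jpeg"]
image_search_embedding = get_features_from_image_path(image_path)
# Retrieve the 2 nearest neighbours of the test image from the TVE index
distances, indices = index_tve.search(image_search_embedding.reshape(1, -1), 2)
distances = distances[0]
indices = indices[0]
indices_distances = list(zip(indices, distances))
# Sort by score, largest first. Note: if the index uses an L2 metric (such as
# IndexFlatL2), a smaller distance means a closer match.
indices_distances.sort(key=lambda x: x[1], reverse=True)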
- Results:
  - TVE Similarity: [(73, 81.27001), (149, 77.19481)]
  - KGE Similarity: [(2406, 121.6897), (163, 121.454765)]
- Image Relevance:
  - The retrieved images from both FAISS indices were visually compared for relevance to the original test image.
- Dependency of KGE FAISS:
  - More fine-tuned triplet extraction
  - PyKeen training methods for embedding generation
  - Combining entity and relation embeddings
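How the retrieved pairs should be ordered depends on the metric of the index. The sketch below is a hypothetical helper (`rank_hits` is not part of the repository) that ranks (id, score) pairs closest first for an L2 index and highest first for an inner-product index. Applying the L2 branch to the TVE numbers reported above assumes that index was built with IndexFlatL2, as in the FAISS snippet earlier in this README.

```python
def rank_hits(indices, distances, metric="l2"):
    """Order (id, score) pairs best match first for the given metric."""
    if metric not in ("l2", "ip"):
        raise ValueError("metric must be 'l2' or 'ip'")
    pairs = list(zip(indices, distances))
    # L2: smaller distance is better. Inner product: larger score is better.
    pairs.sort(key=lambda pair: pair[1], reverse=(metric == "ip"))
    return pairs


# TVE scores reported above, treated as L2 distances (assumption)
tve_ranked = rank_hits([73, 149], [81.27001, 77.19481], metric="l2")
```

Under that assumption, image 149 is the closer match to the test image, since its distance is smaller.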
Note
Detailed results and descriptions are explained in the DSG Gitbook.
The results are divided into:
- Traditional vector embeddings: 3D reduced visualisation using TensorBoard. 📂 Results Folder
- Similarity scores of reduced embeddings from different text encoders. 📂 Results Folder
- Comparison of image and text vector embedding disparity and contextual drawbacks. 📂 Results Folder
- Scene Graph Generation of {text:image} pairs using VLM & Relationformer. 📂 Results Folder
- KG visualisation with Neo4j, NetworkX, Plotly and Graphviz. 📂 Results Folder
- KG and traditional vector embeddings as .csv files. 📂 Results Folder
The core contributors to this repository are (listed alphabetically):
We welcome contributions to improve this project! To contribute:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Commit your changes with clear and descriptive messages.
- Push the changes to your fork and submit a pull request.
Important
Please ensure your contributions align with the project's coding standards and include relevant documentation or tests. For major changes, consider opening an issue to discuss your approach first.