The main file to look at in this repo is `tlm_demo_new.ipynb`.
News! I added a new data enrichment and LLM reliability demo. Details:
- Demo showing how the Trustworthy Language Model (TLM) adds reliability scores to LLM outputs, covering 4 use cases across 4 verticals.
- Expect typos and imperfections. For better results and more details, visit https://help.cleanlab.ai
Hacked this together in a couple of hours. It shows how Cleanlab TLM can be used to improve fine-tuning of LLMs, improve the accuracy of LLM outputs, and enable smart routing for RAG and agents.
Dataset used for this example: here.
Note: these results were run with the fastest version of the TLM (`quality_preset="low"`) for speed reasons (it's a hackathon demo). For improved results, use `quality_preset="best"`.
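For reference, here is a minimal sketch of calling the TLM via the Cleanlab Studio Python client (see the tutorial linked below); the API key and prompt text are placeholders, and exact package/method names may differ across client versions:

```python
# Minimal sketch, assuming the Cleanlab Studio client (https://help.cleanlab.ai/tutorials/tlm/).
# "<YOUR_API_KEY>" and the prompt below are placeholders, not values from this repo.
from cleanlab_studio import Studio

studio = Studio("<YOUR_API_KEY>")
tlm = studio.TLM(quality_preset="low")  # this demo used "low"; use "best" for improved results

output = tlm.prompt("Classify the sentiment of this review: 'The product arrived broken.'")
print(output["response"])               # the LLM answer
print(output["trustworthiness_score"])  # reliability/confidence score in [0, 1]
```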
- Base accuracy (OpenAI GPT-3.5): ~65%
- TLM accuracy: 65.5%
- TLM accuracy (TLM confidence > 0.3): 66.2%
- TLM accuracy (TLM confidence > 0.5): 69.9%
- TLM accuracy (TLM confidence > 0.8): 74.0%
- Base (OpenAI GPT-3.5) accuracy (TLM confidence < 0.5): 55.1%
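The thresholded numbers above are just accuracy computed over the subset of examples whose TLM confidence exceeds the given cutoff. A rough sketch of that filtering (file and column names are hypothetical, not the notebook's actual ones):

```python
import pandas as pd

# Hypothetical columns: "tlm_answer", "true_label", "tlm_confidence".
df = pd.read_csv("results.csv")

def accuracy_above(df: pd.DataFrame, threshold: float) -> float:
    """Accuracy restricted to examples where the TLM confidence exceeds `threshold`."""
    subset = df[df["tlm_confidence"] > threshold]
    return (subset["tlm_answer"] == subset["true_label"]).mean()

for t in (0.3, 0.5, 0.8):
    print(f"TLM accuracy (confidence > {t}): {accuracy_above(df, t):.1%}")
```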
If an expert reviews/corrects the 100 samples with the lowest TLM confidence scores (selection sketched below):
- the resulting accuracy is 79%
- compared to the original base accuracy of 65%
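A rough sketch of how such an expert-review queue could be built by sorting on the TLM confidence score (same hypothetical file/column names as above); the 79% figure comes from the notebook, not from this snippet:

```python
import pandas as pd

df = pd.read_csv("results.csv")  # hypothetical file/columns, as in the sketch above

# Pick the 100 examples the TLM is least confident about and send them for expert review.
review_queue = df.nsmallest(100, "tlm_confidence")
review_queue.to_csv("expert_review_queue.csv", index=False)

# After experts correct those labels, accuracy is re-computed over the full dataset,
# which is where the 79% figure above comes from.
```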
- How to use the TLM: https://help.cleanlab.ai/tutorials/tlm/
There's also a reduced-functionality demo version running on free servers here: https://cleanlab.ai/tlm