Release v0.2.0 · NVIDIA/GenerativeAIExamples

This release builds on the feedback received and brings many improvements, bug fixes, and new features. It is the first release to support Nvidia AI Foundational models and quantized LLM models. Detailed changes are listed below:

What's Added

Support for using Nvidia AI Foundational LLM models
Support for using Nvidia AI Foundational embedding models
Support for deploying and using quantized LLM models
Support for evaluating the RAG pipeline

What's Changed

Repository restructuring to allow better open source contributions
Upgraded dependencies for chain server container
Upgraded NeMo Inference Framework container version; a separate sign-up is no longer needed for access.
Main README now provides more details.
Documentation improvements.
Better error handling and reporting mechanism for corner cases.
Renamed triton-inference-server container and service to llm-inference-server

What's Fixed

#13: pipeline not able to answer questions unrelated to the knowledge base
#12: type checking while uploading PDF files
