#

trustworthy-ai

Here are 128 public repositories matching this topic...

Trusted-AI / adversarial-robustness-toolbox

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams

python machine-learning privacy ai attack extraction inference artificial-intelligence evasion red-team poisoning adversarial-machine-learning blue-team adversarial-examples adversarial-attacks trusted-ai trustworthy-ai

Updated Nov 15, 2024
Python

giskard

Giskard-AI / giskard

🐢 Open-Source Evaluation & Testing for ML & LLM systems

Updated Nov 15, 2024
Python

zjunlp / EasyEdit

[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.

Updated Nov 15, 2024
Jupyter Notebook

JohnSnowLabs / langtest

Deliver safe & effective language models

nlp artificial-intelligence benchmarks benchmark-framework model-assessment ai-safety mlops responsible-ai ml-safety trustworthy-ai ethics-in-ai ml-testing large-language-models llm ai-testing llm-test llm-evaluation-toolkit llm-as-evaluator llm-testing

Updated Nov 12, 2024
Python

HowieHwong / TrustLLM

[ICML 2024] TrustLLM: Trustworthiness in Large Language Models

nlp benchmark natural-language-processing ai toolkit evaluation dataset pypi-package trustworthy-machine-learning trustworthy-ai large-language-models llm

Updated Sep 29, 2024
Python

THUYimingLi / BackdoorBox

The open-sourced Python toolbox for backdoor attacks and defenses.

backdoor-attacks trustworthy-machine-learning backdoor-learning trustworthy-ai backdoor-defenses

Updated Jul 30, 2024
Python

aiverify-foundation / moonshot

Moonshot - A simple and modular tool to evaluate and red-team any LLM application.

benchmarking evaluation-framework red-teaming trustworthy-ai llm

Updated Nov 14, 2024
Python

liuzuxin / FSRL

🚀 A fast safe reinforcement learning library in PyTorch

library reinforcement-learning robotics decision-making pytorch sac safety-critical trpo ppo cpo safe-rl trustworthy-ai cvpo

Updated Sep 30, 2024
Python

yunqing-me / AttackVLM

[NeurIPS-2023] Annual Conference on Neural Information Processing Systems

deep-generative-model adversarial-attack trustworthy-ai foundation-models large-language-models text-to-image-generation generative-ai vision-language-model image-to-text-generation

Updated Oct 30, 2023
Python

ffhibnese / Model-Inversion-Attack-ToolBox

A comprehensive toolbox for model inversion attacks and defenses, which is easy to get started.

machine-learning privacy toolbox benchmarks model-inversion model-inversion-attacks trustworthy-ai

Updated Oct 18, 2024
Python

yunqing-me / WatermarkDM

Code of the paper: A Recipe for Watermarking Diffusion Models

text-to-image watermark generative-models diffusion-models trustworthy-ai

Updated Nov 13, 2024
Jupyter Notebook

aiverify-foundation / aiverify

AI Verify

Updated Nov 15, 2024
Python

verivital / nnv

Neural Network Verification Software Tool

neural-network verification reachability formal-methods hybrid-systems formal-verification cyber-physical autonomy cyber-physical-systems reachability-analysis robustness-verification trustworthy-machine-learning neural-network-verification trustworthy-ai safe-ai safe-autonomy neural-network-certification assured-autonomy

Updated Oct 9, 2024
MATLAB

thu-ml / MMTrustEval

A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Track Datasets and Benchmarks)

benchmark privacy toolbox safety multi-modal fairness robustness claude gpt-4 trustworthy-ai truthfulness mllm

Updated Nov 5, 2024
Python

Machine-Learning-for-High-Risk-Applications-Book

ml-for-high-risk-apps-book / Machine-Learning-for-High-Risk-Applications-Book

Official code repo for the O'Reilly Book - Machine Learning for High-Risk Applications

security machine-learning deep-learning oreilly explainable-ai interpretable-machine-learning oreilly-books responsible-ai trustworthy-ai

Updated May 23, 2023
Jupyter Notebook

IBM / ai-privacy-toolkit

A toolkit for tools and techniques related to the privacy and compliance of AI models.

python machine-learning privacy ai ml artificial-intelligence gdpr anonymization mlops ai-models trustworthy-ai

Updated Jul 3, 2024
Python

sleeepeer / PoisonedRAG

[USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models

security machine-learning ai rag trustworthy-ai retrieval-augmented-generation

Updated Oct 10, 2024
Python

dlmacedo / entropic-out-of-distribution-detection

A project to add scalable state-of-the-art out-of-distribution detection (open set recognition) support by changing two lines of code! Perform efficient inferences (i.e., do not increase inference time) and detection without classification accuracy drop, hyperparameter tuning, or collecting additional data.

machine-learning deep-learning pytorch ood osr ai-safety open-set anomaly-detection novelty-detection robust-machine-learning open-set-recognition out-of-distribution out-of-distribution-detection ood-detection trustworthy-machine-learning trustworthy-ai

Updated Sep 22, 2022
Python

qitianwu / GraphOOD-GNNSafe

The official implementation for ICLR23 paper "GNNSafe: Energy-based Out-of-Distribution Detection for Graph Neural Networks"

deep-learning pytorch artificial-intelligence outlier-detection label-propagation geometric-deep-learning node-classification graph-neural-networks anamoly-detection pytorch-geometric out-of-distribution-detection large-graph trustworthy-ai distribution-shift out-of-distribution-generalization

Updated Jul 27, 2023
Python

ai4ce / FLAT

[ICCV2021 Oral] Fooling LiDAR by Attacking GPS Trajectory

deep-learning robotics point-cloud lidar gnss autonomous-driving ai-safety adversarial-attacks 3d-object-detection 3d-perception trustworthy-machine-learning trustworthy-ai

Updated Jul 5, 2022
Python

Improve this page

Add a description, image, and links to the trustworthy-ai topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the trustworthy-ai topic, visit your repo's landing page and select "manage topics."