Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
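As a quick taste of the library, the sketch below wraps a throwaway PyTorch model in ART's PyTorchClassifier and runs the FastGradientMethod evasion attack; the tiny linear model and random inputs are placeholders for illustration, not part of ART.

```python
import numpy as np
import torch
import torch.nn as nn
from art.estimators.classification import PyTorchClassifier
from art.attacks.evasion import FastGradientMethod

# Placeholder model: a single linear layer over 28x28 grayscale inputs.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
classifier = PyTorchClassifier(
    model=model,
    loss=nn.CrossEntropyLoss(),
    optimizer=torch.optim.Adam(model.parameters(), lr=1e-3),
    input_shape=(1, 28, 28),
    nb_classes=10,
    clip_values=(0.0, 1.0),
)

# Craft adversarial examples with FGSM, a classic evasion attack.
x = np.random.rand(8, 1, 28, 28).astype(np.float32)
attack = FastGradientMethod(estimator=classifier, eps=0.1)
x_adv = attack.generate(x=x)
```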
🐢 Open-Source Evaluation & Testing for AI & LLM systems
[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
Deliver safe & effective language models
[ICML 2024] TrustLLM: Trustworthiness in Large Language Models
An open-source Python toolbox for backdoor attacks and defenses.
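For context on what such a toolbox automates, the sketch below shows the simplest form of backdoor poisoning (BadNets-style) in plain PyTorch: stamp a fixed trigger patch onto a fraction of training images and relabel them to an attacker-chosen target class. This is a generic illustration, not the toolbox's API.

```python
import torch

def poison(images, labels, rate=0.1, target=0, patch=3):
    """BadNets-style poisoning: stamp a white square trigger in the
    bottom-right corner of a random subset and flip their labels."""
    images, labels = images.clone(), labels.clone()
    n = int(rate * len(images))
    idx = torch.randperm(len(images))[:n]
    images[idx, :, -patch:, -patch:] = 1.0  # trigger patch
    labels[idx] = target                    # attacker-chosen class
    return images, labels

# Example: 100 fake 1x28x28 images with 10 classes.
x = torch.rand(100, 1, 28, 28)
y = torch.randint(0, 10, (100,))
x_poisoned, y_poisoned = poison(x, y)
```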
Moonshot - A simple and modular tool to evaluate and red-team any LLM application.
[NeurIPS-2023] Annual Conference on Neural Information Processing Systems
🚀 A fast safe reinforcement learning library in PyTorch
[NeurIPS'24] "Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration"
A comprehensive and easy-to-use toolbox for model inversion attacks and defenses.
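For orientation, the classic gradient-based model inversion attack optimizes an input so that a trained model assigns it high confidence for a target class, recovering a class-representative image. A minimal sketch with a placeholder victim model, not the toolbox's API:

```python
import torch
import torch.nn as nn

# Placeholder victim model; in a real attack this is the trained target.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
model.eval()

target_class = 3
x = torch.rand(1, 1, 28, 28).requires_grad_(True)
opt = torch.optim.Adam([x], lr=0.1)

for _ in range(200):
    opt.zero_grad()
    # Maximize the target logit, i.e. minimize its negative.
    loss = -model(x)[0, target_class]
    loss.backward()
    opt.step()
    with torch.no_grad():
        x.clamp_(0.0, 1.0)  # keep the reconstruction in valid pixel range
```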
Code for the paper "A Recipe for Watermarking Diffusion Models".
A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Track Datasets and Benchmarks)
Neural Network Verification Software Tool
[USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models
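To illustrate the attack surface (not the paper's actual method), the toy retriever below ranks passages by cosine similarity of embeddings; injecting a passage whose embedding sits near the query embedding guarantees it is retrieved and handed to the generator. All embeddings here are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def top_k(query, corpus, k=2):
    """Cosine-similarity retrieval over passage embeddings."""
    sims = corpus @ query / (np.linalg.norm(corpus, axis=1) * np.linalg.norm(query))
    return np.argsort(-sims)[:k]

corpus = rng.normal(size=(100, 64))  # benign passage embeddings
query = rng.normal(size=64)

# Attacker injects a passage embedded near the query, so it ranks first.
poisoned = np.vstack([corpus, query + 0.01 * rng.normal(size=64)])
print(top_k(query, poisoned))  # the injected passage (index 100) ranks first
```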
Official code repo for the O'Reilly Book - Machine Learning for High-Risk Applications
A toolkit of privacy and compliance tools and techniques for AI models.
The official implementation for ICLR23 paper "GNNSafe: Energy-based Out-of-Distribution Detection for Graph Neural Networks"
A project that adds scalable, state-of-the-art out-of-distribution detection (open-set recognition) support by changing two lines of code. Inference stays efficient (no added inference time), and detection comes without a drop in classification accuracy, hyperparameter tuning, or collecting additional data.
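The two OOD entries above both build on post-hoc scores computed from a trained classifier's logits. A representative choice is the energy score of Liu et al. (2020), which GNNSafe's title refers to: E(x) = -T * logsumexp(logits / T), with higher energy indicating out-of-distribution inputs. A minimal sketch, illustrative rather than either repo's API:

```python
import torch

def energy_score(logits, temperature=1.0):
    """Energy-based OOD score (Liu et al., 2020): higher energy
    suggests an out-of-distribution input. logits: (batch, classes)."""
    return -temperature * torch.logsumexp(logits / temperature, dim=-1)

logits = torch.randn(4, 10)  # placeholder classifier outputs
scores = energy_score(logits)
is_ood = scores > 0.0        # threshold chosen on held-out validation data
```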