Companion code for the paper:
Nicolò Felicioni, Lucas Maystre, Sina Ghiassian, Kamil Ciosek. On the Importance of Uncertainty in Decision-Making with Large Language Models. TMLR, 2024.
This repository contains a reference implementation of the algorithms presented in the paper.
The paper investigates the role of uncertainty in decision-making problems where the input is natural language. It focuses on the contextual bandit framework, with the context given by text. The paper compares the greedy policy against LLM bandits that make active use of uncertainty estimates by integrating them into a Thompson Sampling policy, and it considers several uncertainty-estimation techniques: last-layer Laplace approximation, diagonal Laplace approximation, dropout, and epinets.
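To make the distinction concrete, here is a minimal, self-contained sketch (not the code in this repository) contrasting the greedy policy with a Thompson Sampling policy that samples last-layer weights from a Gaussian posterior, of the kind a last-layer Laplace approximation provides. The features, posterior mean, and covariance below are placeholder values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)


def greedy_action(phi, w_mean):
    """Greedy policy: act on the posterior-mean reward estimates."""
    return int(np.argmax(phi @ w_mean))


def thompson_action(phi, w_mean, w_cov):
    """Thompson Sampling: draw one plausible set of last-layer weights from
    the Gaussian posterior, then act greedily with respect to that draw."""
    w_sample = rng.multivariate_normal(w_mean, w_cov)
    return int(np.argmax(phi @ w_sample))


# Toy stand-ins: `phi` plays the role of per-action features obtained from an
# LLM embedding of the text context; `w_mean` and `w_cov` stand in for the
# posterior mean and covariance over the last-layer weights.
n_actions, dim = 4, 8
phi = rng.normal(size=(n_actions, dim))
w_mean = rng.normal(size=dim)
w_cov = 0.1 * np.eye(dim)

print("greedy action:  ", greedy_action(phi, w_mean))
print("thompson action:", thompson_action(phi, w_mean, w_cov))
```

The greedy policy always exploits the posterior mean, whereas Thompson Sampling explores actions in proportion to the remaining posterior uncertainty.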
To get started, follow these steps:
- Clone the repo locally: `git clone llm-bandit`
- Move to the repository: `cd llm-bandit`
- Install the dependencies: `pip install -r requirements.txt`
- Create the datasets: `python prepare_data.py --data DATASET_NAME`, where `DATASET_NAME` can be `hate`, `imdb`, `toxic`, or `offensive`.
- Edit the configuration file `bandit_config.py`.
- Run the main script: `python main.py --ts TS_VARIANT`, where `TS_VARIANT` can be `last_la`, `la`, `dropout`, or `epinet`. An example run is shown below.
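For example, after installing the dependencies, a single end-to-end run on the IMDB dataset with the last-layer Laplace variant would look like this (the dataset and variant are just one choice among the options listed above):

```
python prepare_data.py --data imdb
python main.py --ts last_la
```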
Our codebase was tested with Python 3.10 and CUDA 11.8.
If you have questions or run into problems, please create a new issue on GitHub.
We feel that a welcoming community is important and we ask that you follow Spotify's Open Source Code of Conduct in all interactions with the community.
Follow @SpotifyResearch on Twitter for updates.
Copyright 2024 Spotify, Inc.
Licensed under the Apache License, Version 2.0: https://www.apache.org/licenses/LICENSE-2.0
Please report sensitive security issues via Spotify's bug-bounty program (https://hackerone.com/spotify) rather than GitHub.