```bash
git clone https://github.com/FreedomIntelligence/DotaGPT.git
pip install -r requirements.txt
```
Download the datasets from Hugging Face.
Data Format
For `DotaBench`, the data is structured as follows. Each entry is a JSON object representing a series of interaction turns with a reference answer:
```json
{
    "id": 0,
    "turn_1_question": "example question 1",
    "turn_1_answer": "[model-generated answer for turn 1]",
    "turn_2_question": "example question 2",
    "turn_2_answer": "[model-generated answer for turn 2]",
    "turn_3_question": "example question 3",
    "turn_3_answer": "[model-generated answer for turn 3]",
    "reference": "example reference"
}
```
Complete the fields `turn_1_answer`, `turn_2_answer`, and `turn_3_answer`.
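As a rough sketch of how these fields might be filled, the loop below walks through the three turns while keeping the running chat history; the input file name and the `generate()` helper are placeholders for your own data path and model call, not part of this repository.

```python
import json

def generate(messages):
    """Placeholder: call your own model with the chat history and return its reply."""
    raise NotImplementedError

# Load the DotaBench entries (file name is an assumption; point this at your local copy).
with open("DotaBench.json", encoding="utf-8") as f:
    entries = json.load(f)

for entry in entries:
    messages = []
    for turn in (1, 2, 3):
        messages.append({"role": "user", "content": entry[f"turn_{turn}_question"]})
        answer = generate(messages)  # answer conditioned on the full history so far
        entry[f"turn_{turn}_answer"] = answer
        messages.append({"role": "assistant", "content": answer})
```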
For `DoctorFLAN`, the data format is as follows, with each entry representing a single-turn interaction:
```json
{
    "id": 0,
    "input": "example input",
    "output": "[model-generated output]",
    "reference": "example reference answer"
}
```
Complete the field `output`.
Store the generated model responses at `data/{eval_set}/{model_name}.jsonl` and ensure that all required fields are correctly filled.
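For instance, a DoctorFLAN run could be written out as follows; the input file name and the `generate()` helper are again placeholders, and only the output path follows the convention above.

```python
import json
import os

def generate(prompt):
    """Placeholder: call your own model with the prompt and return its reply."""
    raise NotImplementedError

eval_set, model_name = "DoctorFLAN", "Baichuan-13B-Chat"

# Load the DoctorFLAN entries (file name is an assumption; point this at your local copy).
with open("DoctorFLAN.json", encoding="utf-8") as f:
    entries = json.load(f)

os.makedirs(f"data/{eval_set}", exist_ok=True)
with open(f"data/{eval_set}/{model_name}.jsonl", "w", encoding="utf-8") as out:
    for entry in entries:
        entry["output"] = generate(entry["input"])  # fill the required field
        out.write(json.dumps(entry, ensure_ascii=False) + "\n")
```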
Prepare a YAML configuration file specifying model details, API keys, etc. Example (`configs/eval.yaml`):
```yaml
api_key: "your-openai-api-key"
base_url: "https://api.openai.com"
gpt_version: "gpt-4"
```
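These fields map directly onto an OpenAI-compatible client. The snippet below is a minimal sketch assuming the `pyyaml` and `openai` packages; it is only an illustration of how the fields could be consumed, not the repository's actual loading code.

```python
import yaml
from openai import OpenAI

# Read the evaluation config (field names as in the example above).
with open("configs/eval.yaml", encoding="utf-8") as f:
    cfg = yaml.safe_load(f)

# Note: the openai-python client expects the versioned endpoint,
# e.g. "https://api.openai.com/v1", as its base_url.
client = OpenAI(api_key=cfg["api_key"], base_url=cfg["base_url"])
reply = client.chat.completions.create(
    model=cfg["gpt_version"],
    messages=[{"role": "user", "content": "ping"}],
)
print(reply.choices[0].message.content)
```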
Execute the evaluator with the script `script/run.sh`, modifying parameters as necessary. Example command:
```bash
python eval_code/reviewer.py \
    --config configs/eval.yaml \
    --model_name Baichuan-13B-Chat \
    --eval_set DotaBench \
    --turn_type multi \
    --n_processes 2 \
    --n_repeat 2 \
    --turn_num 2
```
Parameter Explanation
- `--config`: Path to the configuration file.
- `--model_name`: Name of the model being evaluated.
- `--eval_set`: Evaluation dataset being used. Choose either `DoctorFLAN` or `DotaBench`.
- `--turn_type`: Type of interaction (single- or multi-turn).
- `--n_processes`: Number of processes for parallel processing.
- `--n_repeat`: Number of repetitions for each sample.
- `--turn_num`: Number of turns for multi-turn evaluations.
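Before launching a run, it can help to sanity-check that the response file is complete for the chosen `--eval_set`. The snippet below is an illustrative check, not part of the repository.

```python
import json

eval_set, model_name = "DotaBench", "Baichuan-13B-Chat"
required = (["turn_1_answer", "turn_2_answer", "turn_3_answer"]
            if eval_set == "DotaBench" else ["output"])

with open(f"data/{eval_set}/{model_name}.jsonl", encoding="utf-8") as f:
    for line_no, line in enumerate(f, 1):
        entry = json.loads(line)
        missing = [k for k in required if not entry.get(k)]
        if missing:
            print(f"line {line_no} (id={entry.get('id')}): missing {missing}")
```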
Contributions are welcome! Feel free to submit issues or pull requests on GitHub to help improve this project.
The code in this repository is mostly developed for or derived from the paper below.
```bibtex
@article{xie2024llms,
    title={LLMs for Doctors: Leveraging Medical LLMs to Assist Doctors, Not Replace Them},
    author={Xie, Wenya and Xiao, Qingying and Zheng, Yu and Wang, Xidong and Chen, Junying and Ji, Ke and Gao, Anningzhe and Wan, Xiang and Jiang, Feng and Wang, Benyou},
    journal={arXiv preprint arXiv:2406.18034},
    year={2024}
}
```
This project is licensed under the MIT License.
For inquiries, please create an issue in this repository or email the authors: [email protected]