paper-source-trace-chinesegpt-rank4

Prerequisites

Linux
Python 3.10.12
PyTorch 2.0.0
CUDA 12.0
NVIDIA RTX 4090

Getting Started

Installation

Clone this repo.

git clone 
cd paper-source-trace

Install the dependencies with:

pip install -r requirements.txt

PST Dataset

The dataset can be downloaded from BaiduPan (password: bft3), Aliyun, or DropBox. The paper XML files were generated from the paper PDFs using the Grobid API.
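Since the XML files come from Grobid, they should follow the TEI format, where bibliography entries are `biblStruct` elements inside a `listBibl`. The following is a minimal sketch of reading reference titles from such a file using only the standard library. The function name `extract_reference_titles` and the sample document are hypothetical and are not taken from this repository.

```python
import xml.etree.ElementTree as ET

# TEI namespace used by Grobid output; xml:id lives in the built-in XML namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def extract_reference_titles(tei_xml: str) -> dict[str, str]:
    """Map each bibliography entry id (e.g. "b0") to its title, if one exists."""
    root = ET.fromstring(tei_xml)
    titles = {}
    for bibl in root.iterfind(".//tei:listBibl/tei:biblStruct", TEI_NS):
        ref_id = bibl.get(XML_ID, "")
        # Prefer the article-level title, fall back to the monograph title.
        title = bibl.find("tei:analytic/tei:title", TEI_NS)
        if title is None:
            title = bibl.find("tei:monogr/tei:title", TEI_NS)
        if ref_id and title is not None:
            titles[ref_id] = "".join(title.itertext()).strip()
    return titles


# Hypothetical sample, shaped like a Grobid TEI bibliography.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><back><div><listBibl>
    <biblStruct xml:id="b0">
      <analytic><title level="a" type="main">A hypothetical cited paper</title></analytic>
      <monogr><title level="j">Some Journal</title></monogr>
    </biblStruct>
    <biblStruct xml:id="b1">
      <monogr><title level="m">A hypothetical book</title></monogr>
    </biblStruct>
  </listBibl></div></back></text>
</TEI>"""

if __name__ == "__main__":
    print(extract_reference_titles(SAMPLE))
```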

Directory structure

--paper-source-trace-main
	--scibert_scivocab_uncased
	--out
		--kddcup
			--scibert_eval1-434
			--scibert_eval1-429
			--scibert_eval1-423
	--data
		--pst
			--paper-xml (place the competition dataset here)
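Before training, it can help to confirm that the layout above is in place. The check below is a small sketch; the helper name `missing_dirs` is hypothetical, and the directory list is copied from the tree above (the `scibert_eval1-*` folders are omitted on the assumption that the training scripts create them).

```python
from pathlib import Path

# Directories taken from the tree above, relative to the repository root.
EXPECTED_DIRS = [
    "scibert_scivocab_uncased",
    "out/kddcup",
    "data/pst/paper-xml",
]


def missing_dirs(root: str) -> list[str]:
    """Return the expected directories that do not exist under root."""
    base = Path(root)
    return [d for d in EXPECTED_DIRS if not (base / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Directory layout looks complete.")
```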

Train&Inference

# Three models were trained with different parameters
python bert-eval-434.py
python bert-eval-429.py
python bert-eval-423.py
# Outputs (model weights and results) are written to out/kddcup/

# Inference
# Comment out the training functions, set the weight and test-file paths, then run gen_kddcup_valid_submission_bert:
if __name__ == "__main__":
    seed = 2023
    setup_seed(seed)
    # prepare_bert_input()
    # train(model_name="scibert")
    gen_kddcup_valid_submission_bert(model_name="scibert")

# Fusion of model results
python rong.py
# Output at out/kddcup/scibert_rong/
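The contents of rong.py are not described here, so the following is only an illustrative sketch of one common way to fuse model results: averaging the scores of several runs. It assumes each run maps a paper id to a list of per-reference scores, which is an assumption about the submission format. The function name `average_predictions` is hypothetical.

```python
def average_predictions(runs: list[dict[str, list[float]]]) -> dict[str, list[float]]:
    """Average per-reference scores across runs (assumed format: paper id -> scores)."""
    if not runs:
        raise ValueError("at least one run is required")
    fused = {}
    for paper_id, scores in runs[0].items():
        # Collect the score list for this paper from every run.
        columns = [run[paper_id] for run in runs]
        if any(len(col) != len(scores) for col in columns):
            raise ValueError(f"score length mismatch for paper {paper_id}")
        # Element-wise mean over the runs.
        fused[paper_id] = [sum(vals) / len(vals) for vals in zip(*columns)]
    return fused


if __name__ == "__main__":
    # Made-up scores for illustration only.
    runs = [
        {"p1": [0.25, 0.75]},
        {"p1": [0.50, 0.50]},
        {"p1": [0.75, 0.25]},
    ]
    print(average_predictions(runs))
```

Simple averaging treats every model equally; a weighted mean would be a natural variation if one model is known to be stronger.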

Model weight

The three model weights and the pretrained weights are available here: 2024-kddcup-pst-rank5-chinesegpt https://pan.baidu.com/s/1gIt6ZzZGOTRW6VeFcDRu6w (password: eyla). The weights archive also includes the inference results.

Method

We ran further experiments on top of the baseline code. The main changes are in training-data processing and parameter tuning, including adding the title of the cited paper to the training data; see the prepare_bert_input function in bert-eval-434.py.

Results on Test Set

Method	MAP
model1	0.434
model2	0.429
model3	0.423
ensemble	0.449
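For readers unfamiliar with the metric, MAP (mean average precision) can be sketched as follows, assuming each paper has a binary label and a predicted score for every reference. This is a generic illustration and not necessarily identical to the evaluation in eval.py; both function names are hypothetical.

```python
def average_precision(labels: list[int], scores: list[float]) -> float:
    """Average precision for one paper: mean precision at each relevant reference."""
    # Rank references from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(examples: list[tuple[list[int], list[float]]]) -> float:
    """Mean of the per-paper average precision values."""
    aps = [average_precision(labels, scores) for labels, scores in examples]
    return sum(aps) / len(aps) if aps else 0.0


if __name__ == "__main__":
    # Made-up labels and scores for illustration only.
    examples = [
        ([1, 0, 1], [0.9, 0.8, 0.7]),
        ([0, 1], [0.9, 0.1]),
    ]
    print(mean_average_precision(examples))
```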

If you have any questions, please contact me. Email: [email protected]



xgl0626/paper-source-trace-chinesegpt-rank4
