Multi-Label-Classification

This repository captures experimentation with the scikit-multilearn package and its uses in multi-label classification.

The repository is organised into the following notebooks, intended to be read in order:

101_data_exploration.ipynb: This code explores the dataset [PubMed Multi Label Text Classification Dataset.csv].
102_data_preprocessing.ipynb: This code preprocesses the raw data to prepare it for the later steps. The code can be run from this Jupyter Notebook onwards.
103_data_vectorisation_and_modelling.ipynb: This code takes the preprocessed data and vectorises text data for modelling.
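
As a rough illustration of the vectorise-then-model step described above, the sketch below turns a handful of texts into TF-IDF features and fits one binary classifier per label. It is not the code from the notebooks: the texts and label names are made-up toy data, and scikit-learn's OneVsRestClassifier is used here as a stand-in for a binary relevance approach rather than the scikit-multilearn API.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical toy data standing in for the preprocessed abstracts and their labels.
texts = [
    "protein binding in human cells",
    "clinical trial of a new drug in patients",
    "gene expression in human tissue",
    "drug dosage and patient outcomes",
    "human protein structure analysis",
    "patients treated with the drug in a trial",
]
labels = [
    ["anatomy", "chemicals"],
    ["chemicals", "health_care"],
    ["anatomy"],
    ["chemicals", "health_care"],
    ["anatomy", "chemicals"],
    ["chemicals", "health_care"],
]

# Encode each label set as a row of 0/1 indicators, one column per label.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

# Convert the raw text into a sparse TF-IDF feature matrix.
vectoriser = TfidfVectorizer()
X = vectoriser.fit_transform(texts)

# Fit one independent logistic regression per label column.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X, Y)

# Predictions come back in the same indicator layout as Y.
pred = clf.predict(X)
print(mlb.classes_)
print(pred)
```

Each row of `pred` is an indicator vector over `mlb.classes_`, so a single document can receive several labels at once, which is what distinguishes multi-label from multi-class classification.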

Other files in the repository:

modelling_notes_and_caveats.md: This document covers the modelling features and caveats encountered in this project.
data: The data folder includes the raw data [PubMed Multi Label Text Classification Dataset.csv], the extracted features and labels, and the vectorised data used for modelling.
