rohitinu6/Lung_Cancer_Prediction_Using_Machine_Learning

Lung_Cancer_Prediction_Using_Machine_Learning

Aim:

The purpose of this project is to compare classification algorithms applied to a lung cancer dataset.

Dataset:

The lung cancer dataset used in this project was collected from data.world:

https://data.world/sta427ceyin/survey-lung-cancer
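Before any classifier can be trained, the survey table has to be turned into numeric features and a binary target. The following is a minimal sketch of that step, not the project's actual preprocessing. The column names (`GENDER`, `AGE`, `SMOKING`, `LUNG_CANCER`), the `YES`/`NO` target labels, and the helper name `encode_survey` are assumptions made for illustration, and the sample rows are made up.

```python
import pandas as pd


def encode_survey(df: pd.DataFrame, target: str = "LUNG_CANCER"):
    """Split a survey table into numeric features X and a 0/1 target y.

    Assumes the target column holds the strings "YES" and "NO".
    """
    df = df.copy()
    # Guard against stray spaces in the CSV header
    df.columns = [c.strip() for c in df.columns]
    y = df[target].map({"YES": 1, "NO": 0})
    X = df.drop(columns=[target])
    # One-hot encode text columns (e.g. an assumed GENDER column with M/F)
    X = pd.get_dummies(X, drop_first=True)
    return X, y


# Tiny made-up sample for illustration only (not real survey rows)
sample = pd.DataFrame({
    "GENDER": ["M", "F", "M"],
    "AGE": [69, 59, 74],
    "SMOKING": [1, 2, 2],
    "LUNG_CANCER": ["YES", "NO", "YES"],
})
X, y = encode_survey(sample)
print(X.columns.tolist(), y.tolist())
```

With the real file, the same helper could be applied to the result of `pd.read_csv(...)` on the CSV downloaded from the link above.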

Working:

The following 10 classification algorithms are used in this project:

  1. Logistic Regression
  2. K-Nearest Neighbors (KNN)
  3. Decision Tree
  4. Support Vector Machines (SVM)
  5. Naive Bayes
  6. Random Forest
  7. Gradient Boosting
  8. Neural Networks
  9. AdaBoost
  10. XGBoost
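One way to fit and compare the algorithms listed above is to loop over a dictionary of scikit-learn estimators. The sketch below is an illustration under stated assumptions, not the project's code: it uses synthetic data from `make_classification` as a stand-in for the survey dataset, default hyperparameters, and a 70/30 split chosen arbitrarily. XGBoost is left out because it lives in the separate `xgboost` package.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: NOT the lung cancer survey)
X, y = make_classification(n_samples=300, n_features=15, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Support Vector Machine": SVC(),
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Neural Network": MLPClassifier(max_iter=1000, random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
}

# Fit every model on the same split and record its test accuracy
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

# Print the models from most to least accurate
for name, acc in sorted(accuracies.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.2%}")
```

Using one shared train/test split keeps the comparison fair, since every model is scored on exactly the same held-out rows.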

We then build a model for each of the algorithms listed above and compare them using the following evaluation metrics:

  1. Accuracy
  2. Precision
  3. F1 Score
  4. Recall Score
  5. Confusion Matrix
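The metrics listed above are all available in `sklearn.metrics`. The sketch below shows how they could be computed for one model; the label lists are made up for illustration, and whether the project used these exact functions is an assumption.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Made-up labels for illustration: 1 = positive class, 0 = negative class
y_true = [1, 1, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)    # share of correct predictions
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall
cm = confusion_matrix(y_true, y_pred)        # rows = true class, columns = predicted class

print(accuracy, precision, recall, f1)
print(cm)
```

For binary labels, scikit-learn lays out the confusion matrix as `[[TN, FP], [FN, TP]]`.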

These are the accuracies of the algorithms:

  1. Logistic Regression: 90.29%
  2. K-Nearest Neighbors (KNN): 87.37%
  3. Decision Tree: 87.37%
  4. Support Vector Machines (SVM): 84.46%
  5. Naive Bayes: 86.4%
  6. Random Forest: 89.32%
  7. Gradient Boosting: 89.32%
  8. Neural Networks: 84.46%
  9. AdaBoost: 84.46%
  10. XGBoost: 84.46%

Results:

Of all the algorithms implemented, Logistic Regression performed best, with the highest accuracy (90.29%). Its evaluation metrics are as follows:

Accuracy: 0.9029126213592233

Precision: 0.9052631578947369

Recall: 0.9885057471264368

F1 score: 0.945054945054945

Confusion Matrix:

[Image: confusion matrix for Logistic Regression]
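As a consistency check on the Logistic Regression scores reported above, the sketch below recomputes them from a set of confusion-matrix counts. The counts (TP = 86, FP = 9, FN = 1, TN = 7, for 103 test samples) are inferred from the reported scores, not read from the original confusion-matrix image, so treat them as an assumption.

```python
# Counts inferred from the reported scores (assumption, not taken from the figure)
tp, fp, fn, tn = 86, 9, 1, 7
total = tp + fp + fn + tn

accuracy = (tp + tn) / total   # correct predictions over all predictions
precision = tp / (tp + fp)     # predicted positives that are truly positive
recall = tp / (tp + fn)        # true positives that were found
f1 = 2 * precision * recall / (precision + recall)

print(f"samples:   {total}")
print(f"accuracy:  {accuracy:.4f}")
print(f"precision: {precision:.4f}")
print(f"recall:    {recall:.4f}")
print(f"f1 score:  {f1:.4f}")
```

Under these inferred counts, the four values agree with the reported accuracy, precision, recall, and F1 score.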