
This project predicts whether a flight will be delayed or on time using various Machine Learning algorithms. We explore flight delay data, perform data preprocessing, train multiple classification models, and evaluate them using accuracy, ROC curves, and feature importance.


NirmalanSK/Airlines_Delay_Flight_Prediction_using_ML_Algorithms

✈️ Airlines Delay Flight Prediction using Machine Learning

Project Overview

This project predicts whether a flight will be delayed or on time using various Machine Learning algorithms.
We explore flight delay data, perform data preprocessing, train multiple classification models, and evaluate them using accuracy, ROC curves, and feature importance.


Requirements

pandas, numpy, matplotlib, seaborn, scikit-learn, xgboost


Dataset

  • Source: Airline on-time performance dataset (Kaggle: https://www.kaggle.com/datasets/ulrikthygepedersen/airlines-delay)

  • Features:

    • Airline – Airline carrier code
    • Flight – Flight number
    • Origin – Departure airport
    • Destination – Arrival airport
    • Distance – Distance between airports
    • Scheduled_Departure – Scheduled departure time
  • Target:

    • Delayed – 1 if delayed, 0 if on-time
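A minimal sketch of loading the data and checking the class distribution, assuming the CSV uses exactly the column names listed above (the downloaded file may name them differently). The `load_flights` helper and the four sample rows are hypothetical and stand in for the real file.

```python
import io

import pandas as pd

# Column names copied from the feature list above; adjust if the real CSV differs.
FEATURES = ["Airline", "Flight", "Origin", "Destination", "Distance", "Scheduled_Departure"]
TARGET = "Delayed"


def load_flights(source):
    """Read a CSV (path or file-like object) and verify the expected columns exist."""
    df = pd.read_csv(source)
    missing = [col for col in FEATURES + [TARGET] if col not in df.columns]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    return df[FEATURES + [TARGET]]


# Made-up rows standing in for the downloaded file (values are illustrative only).
SAMPLE_CSV = io.StringIO(
    "Airline,Flight,Origin,Destination,Distance,Scheduled_Departure,Delayed\n"
    "AA,101,JFK,LAX,2475,830,1\n"
    "DL,202,ATL,ORD,606,1015,0\n"
    "UA,303,SFO,SEA,679,1740,0\n"
    "AA,404,LAX,JFK,2475,2200,1\n"
)

flights = load_flights(SAMPLE_CSV)

# Share of each class; a strongly skewed split would make plain accuracy misleading.
class_share = flights[TARGET].value_counts(normalize=True)
print(flights.dtypes)
print(class_share)
```

To use it on the real data, pass the path of the downloaded CSV to `load_flights` instead of `SAMPLE_CSV`.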

Workflow

  1. Import Libraries
  2. Exploratory Data Analysis (EDA)
    • Summary statistics
    • Class distribution check
    • Outlier detection
    • Correlation heatmaps
  3. Data Preprocessing
    • Handling missing values
    • Encoding categorical features
    • Normalization/Scaling
  4. Model Training
    • Decision Tree Classifier
    • Random Forest Classifier
    • Logistic Regression
    • XGBoost Classifier
    • AdaBoost Classifier
  5. Model Evaluation
    • Accuracy Score
    • ROC Curve Visualization
    • Feature Importance Plot
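The preprocessing, training, and accuracy steps above can be sketched with scikit-learn as follows. This is not the notebook's code: the data is synthetic, the delay rule is invented so the models have something to learn, `Scheduled_Departure` is assumed to be minutes after midnight, and the `Flight` column is left out for brevity. XGBoost is omitted to keep the sketch dependent on scikit-learn only; an `xgboost.XGBClassifier` could be added to the `models` dictionary in the same way.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

CATEGORICAL = ["Airline", "Origin", "Destination"]
NUMERIC = ["Distance", "Scheduled_Departure"]

# Synthetic stand-in for the flight data (assumption: not the real dataset).
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "Airline": rng.choice(["AA", "DL", "UA"], size=n),
    "Origin": rng.choice(["JFK", "ATL", "SFO", "LAX"], size=n),
    "Destination": rng.choice(["ORD", "SEA", "LAX", "JFK"], size=n),
    "Distance": rng.integers(200, 3000, size=n),
    "Scheduled_Departure": rng.integers(0, 1440, size=n),  # assumed minutes after midnight
})
# Invented rule: later departures are delayed more often.
delay_prob = np.where(df["Scheduled_Departure"] > 900, 0.7, 0.3)
df["Delayed"] = (rng.random(n) < delay_prob).astype(int)

# Handling missing values: dropping incomplete rows is the simplest option.
df = df.dropna()

X = df[CATEGORICAL + NUMERIC]
y = df["Delayed"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Encode categorical columns and scale numeric ones.
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
    ("num", StandardScaler(), NUMERIC),
])

models = {
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    # Each model gets its own copy of the preprocessing step, fitted on training data only.
    pipe = Pipeline([("preprocess", clone(preprocess)), ("model", model)])
    pipe.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, pipe.predict(X_test))
    print(f"{name}: {scores[name]:.2f}")
```

Wrapping the encoder and scaler in a `Pipeline` keeps test-set statistics out of the fitted preprocessing, which is one common way to avoid leakage. The scores printed here describe the synthetic data only and are not comparable to the project's results.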

Results

| Model | Accuracy |
|---|---|
| Decision Tree | 68% |
| Random Forest | 70% |
| Logistic Regression | 58% |
| XGBoost | 66% |
| AdaBoost | 62% |
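Accuracy is the share of test examples classified correctly, so how much a given figure means depends on the class balance, which this README does not state. As a hedged illustration with made-up labels (not the project's predictions), the sketch below computes accuracy next to a majority-class baseline; the helper names are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_true):
    """Label of the most common class and the accuracy of always predicting it."""
    label, count = Counter(y_true).most_common(1)[0]
    return label, count / len(y_true)


# Toy labels for illustration only: 1 = delayed, 0 = on time.
y_true = [0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
y_pred = [0, 0, 1, 1, 0, 0, 1, 0, 0, 0]

model_accuracy = accuracy(y_true, y_pred)
baseline_label, baseline_accuracy = majority_baseline(y_true)
print(f"model accuracy: {model_accuracy:.2f}")
print(f"always predicting {baseline_label}: {baseline_accuracy:.2f}")
```

A model is only informative to the extent that it beats the baseline on the same test set.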

Key Visualizations

  • Class Distribution [2 screenshots]
  • Correlation Heatmap [screenshot]
  • ROC Curves [screenshot]
  • Feature Importance (Random Forest) [screenshot]
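The numbers behind a ROC curve and a Random Forest feature-importance plot can be computed as in the sketch below. It assumes synthetic data from `make_classification` and placeholder feature names, so it shows the technique rather than reproducing the figures above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (assumption: stands in for the flight data).
X, y = make_classification(n_samples=500, n_features=6, n_informative=3, random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # placeholder names

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# A ROC curve needs scores, not hard labels: use the predicted probability of class 1.
proba = forest.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, proba)
auc = roc_auc_score(y_test, proba)
print(f"ROC AUC: {auc:.3f}")

# Impurity-based importances; they are normalized to sum to 1 across features.
importances = forest.feature_importances_
ranking = sorted(zip(feature_names, importances), key=lambda pair: pair[1], reverse=True)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

Passing `fpr` and `tpr` to `matplotlib.pyplot.plot` draws the curve, and the sorted `ranking` can be fed to a bar chart for the importance plot.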

Run Jupyter Notebook

jupyter notebook Airlines_Delay_Flight_Prediction_using_ML.ipynb
