MLOps Level 1 - Continuous Training Pipeline

This project showcases the automation of an ML workflow from scratch.

ETL pipeline from API to Master tables.
Feature Engineering Pipeline.
Automated ML model training pipeline.
Continuous Training: Transitions the staging model to production, then triggers a GitHub Actions workflow that deploys the model to an AWS SageMaker endpoint.

MLOps Level 1 - System Architecture

Technologies Used

Apache Airflow: Workflow orchestration
MLflow: Model tracking and model lifecycle management
AWS SageMaker: Model deployment
Docker: Containerization
GitHub Actions: CI/CD pipeline

Tutorial

Watch the step-by-step tutorial on YouTube: MLOps Level 1 - Continuous Training Pipeline Tutorial

Prerequisites

Python 3.12+
Docker and Docker Compose
AWS Account with SageMaker access
API credentials for weather data
Git

Project Structure

.
├── dags/              # Airflow DAGs for workflow orchestration
├── mlflow/            # MLflow tracking and model management
├── config/            # Configuration files
├── plugins/           # Custom Airflow plugins
├── images/            # Documentation images
└── requirements.txt   # Python dependencies

Pipeline Components

1. ETL Pipeline

Fetches weather data from API
Transforms and loads data into master tables
Handles data validation and error checking
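As a minimal sketch of the validation and error-checking step, the function below checks the response header and the shape of each item before anything is loaded. The function name `extract_items` and the set of required keys are assumptions based on the sample response in the Reference section, not the project's actual code.

```python
from __future__ import annotations

from typing import Any

# Assumption: these are the fields each item must carry, taken from the
# sample response shown in this README.
REQUIRED_ITEM_KEYS = {"baseDate", "baseTime", "category", "nx", "ny", "obsrValue"}


def extract_items(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the observation items from a response, or raise ValueError."""
    response = payload.get("response", {})
    header = response.get("header", {})
    # "00" is the result code that accompanies NORMAL_SERVICE in the sample.
    if header.get("resultCode") != "00":
        raise ValueError(f"API error: {header.get('resultMsg', 'unknown')}")

    items = response.get("body", {}).get("items", {}).get("item", [])
    if not items:
        raise ValueError("response contained no items")

    for item in items:
        missing = REQUIRED_ITEM_KEYS - set(item)
        if missing:
            raise ValueError(f"item is missing keys: {sorted(missing)}")
        # obsrValue arrives as a string; float() raises ValueError if it
        # is not numeric.
        float(item["obsrValue"])
    return items
```

Raising on the first bad record lets the orchestrator mark the task as failed instead of loading partial data into the master tables.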

2. Feature Engineering

Processes raw data into model features
Handles missing values and outliers
Creates time-based features
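Two of these steps can be sketched with the standard library alone. The helpers below derive time-based features from the `baseDate` and `baseTime` strings and forward-fill missing values. The function names, the chosen features, and the forward-fill strategy are illustrative assumptions; the project's own feature set may differ.

```python
from __future__ import annotations

import math
from datetime import datetime


def time_features(base_date: str, base_time: str) -> dict[str, float]:
    """Derive calendar and cyclical features from baseDate/baseTime strings."""
    ts = datetime.strptime(base_date + base_time, "%Y%m%d%H%M")
    hour_angle = 2 * math.pi * ts.hour / 24
    return {
        "hour": ts.hour,
        "day_of_week": ts.weekday(),  # Monday = 0
        "month": ts.month,
        # Cyclical encoding keeps 23:00 and 00:00 close together.
        "hour_sin": math.sin(hour_angle),
        "hour_cos": math.cos(hour_angle),
    }


def fill_missing(values: list[float | None]) -> list[float | None]:
    """Forward-fill gaps with the last seen value; leading gaps stay None."""
    filled: list[float | None] = []
    last = None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled
```

For the sample observation, `time_features("20250313", "0000")` yields hour 0 on a Thursday (`day_of_week` 3) in month 3.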

3. Model Training

Automated training pipeline
Model versioning with MLflow
Performance tracking and comparison
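The comparison step can be reduced to a small decision function. The sketch below assumes a lower-is-better metric such as RMSE and a hypothetical `min_improvement` threshold; in practice the two metric values would be read from the runs tracked in MLflow.

```python
from __future__ import annotations


def should_promote(
    candidate_rmse: float,
    production_rmse: float | None,
    min_improvement: float = 0.0,
) -> bool:
    """Return True when the candidate beats production by more than min_improvement."""
    if production_rmse is None:
        # Assumption: with no production model yet, the first candidate is promoted.
        return True
    return production_rmse - candidate_rmse > min_improvement
```

A non-zero `min_improvement` avoids redeploying for gains that are within noise.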

4. Model Deployment

GitHub Actions workflow for automated deployment: deploy-model-sagemaker
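One way the training pipeline could trigger this workflow is through the GitHub REST API's workflow dispatch endpoint. The sketch below only builds the request. The workflow file name `deploy-model-sagemaker.yml`, the `model_version` input, and the owner and repository placeholders are assumptions, and the workflow would need a `workflow_dispatch` trigger that declares any inputs passed to it.

```python
from __future__ import annotations

import json
import urllib.request


def build_dispatch_request(
    owner: str,
    repo: str,
    workflow_file: str,
    token: str,
    ref: str = "main",
    inputs: dict[str, str] | None = None,
) -> urllib.request.Request:
    """Build a POST request for GitHub's workflow dispatch endpoint."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    body = {"ref": ref, "inputs": inputs or {}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical usage; nothing is sent until urlopen is called.
request = build_dispatch_request(
    "some-owner", "some-repo", "deploy-model-sagemaker.yml",
    token="<github-token>", inputs={"model_version": "3"},
)
# urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the trigger easy to unit test without network access.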

Monitoring and Maintenance

Monitor pipeline health through Airflow UI
Track model performance in MLflow
Set up alerts for pipeline failures
Retrain the model regularly on a date-based schedule
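The date-based retraining rule can be expressed as a small check. The seven-day interval and the function name below are assumptions for illustration; in Airflow the same effect is usually achieved through the DAG's schedule.

```python
from datetime import date, timedelta


def retraining_due(last_trained: date, today: date, interval_days: int = 7) -> bool:
    """Return True once at least interval_days have passed since the last training run."""
    return today - last_trained >= timedelta(days=interval_days)
```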

Troubleshooting

Common issues and solutions:

Pipeline failures
  - Check Airflow logs
  - Verify API connectivity
  - Ensure AWS credentials are valid

Model deployment issues
  - Verify SageMaker endpoint status
  - Check model artifacts in MLflow
  - Review deployment logs

Reference

Input Data Example (Weather API Response)

{
    "response": {
        "header": {
            "resultCode": "00",
            "resultMsg": "NORMAL_SERVICE"
        },
        "body": {
            "dataType": "JSON",
            "items": {
                "item": [
                    {
                        "baseDate": "20250313",
                        "baseTime": "0000",
                        "category": "PTY",
                        "nx": 86,
                        "ny": 106,
                        "obsrValue": "0"
                    },
                    {
                        "baseDate": "20250313",
                        "baseTime": "0000",
                        "category": "RN1",
                        "nx": 86,
                        "ny": 106,
                        "obsrValue": "0"
                    },
                    {
                        "baseDate": "20250313",
                        "baseTime": "0000",
                        "category": "T1H",
                        "nx": 86,
                        "ny": 106,
                        "obsrValue": "6.4"
                    }
                ]
            },
            "pageNo": 1,
            "numOfRows": 10,
            "totalCount": 8
        }
    }
}
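Each item in this response carries a single measurement category, so a loader typically has to pivot the items into one row per observation time and grid point. The sketch below shows one way to do that. The function name `pivot_items` and the output layout are assumptions, not the schema of the project's master tables.

```python
from __future__ import annotations

from collections import defaultdict


def pivot_items(items: list[dict]) -> list[dict]:
    """Group items by (baseDate, baseTime, nx, ny) and spread categories into columns."""
    rows: dict[tuple, dict] = defaultdict(dict)
    for item in items:
        key = (item["baseDate"], item["baseTime"], item["nx"], item["ny"])
        # obsrValue is a string in the response; store it as a float.
        rows[key][item["category"]] = float(item["obsrValue"])
    return [
        {"baseDate": d, "baseTime": t, "nx": nx, "ny": ny, **values}
        for (d, t, nx, ny), values in rows.items()
    ]


# The three items from the sample response above.
sample = [
    {"baseDate": "20250313", "baseTime": "0000", "category": "PTY", "nx": 86, "ny": 106, "obsrValue": "0"},
    {"baseDate": "20250313", "baseTime": "0000", "category": "RN1", "nx": 86, "ny": 106, "obsrValue": "0"},
    {"baseDate": "20250313", "baseTime": "0000", "category": "T1H", "nx": 86, "ny": 106, "obsrValue": "6.4"},
]
print(pivot_items(sample))
# [{'baseDate': '20250313', 'baseTime': '0000', 'nx': 86, 'ny': 106,
#   'PTY': 0.0, 'RN1': 0.0, 'T1H': 6.4}]
```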
