This project showcases the automation of an ML workflow from scratch.
- ETL pipeline from the weather API to master tables.
- Feature Engineering Pipeline.
- Automated ML model training pipeline.
- Continuous Training: transitions the staging model to production, then triggers a GitHub Action that deploys the model to an AWS SageMaker endpoint.
- Apache Airflow: Workflow orchestration
- MLflow: Model tracking and model lifecycle management
- AWS SageMaker: Model deployment
- Docker: Containerization
- GitHub Actions: CI/CD pipeline
Watch the step-by-step tutorial on YouTube: MLOps Level 1 - Continuous Training Pipeline Tutorial
- Python 3.12+
- Docker and Docker Compose
- AWS Account with SageMaker access
- API credentials for weather data
- Git
.
├── dags/ # Airflow DAGs for workflow orchestration
├── mlflow/ # MLflow tracking and model management
├── config/ # Configuration files
├── plugins/ # Custom Airflow plugins
├── images/ # Documentation images
└── requirements.txt # Python dependencies
- Fetches weather data from API
- Transforms and loads data into master tables
- Handles data validation and error checking
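
The validation step above can be sketched as follows. This is a minimal illustration rather than the project's actual DAG code: the names `WeatherApiError`, `REQUIRED_ITEM_KEYS`, and `extract_items` are hypothetical, and the assumption that `"00"` is the only success code is taken from the sample response in this README.

```python
from typing import Any


class WeatherApiError(Exception):
    """Raised when the API payload is malformed or reports a failure."""


# Keys every observation item is expected to carry (taken from the sample response).
REQUIRED_ITEM_KEYS = {"baseDate", "baseTime", "category", "nx", "ny", "obsrValue"}


def extract_items(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Validate a decoded API response and return its observation items."""
    try:
        response = payload["response"]
        header = response["header"]
        items = response["body"]["items"]["item"]
    except (KeyError, TypeError) as exc:
        raise WeatherApiError(f"Malformed payload: {exc!r}") from exc

    # Assumption: "00" is the only success code, as in the sample response.
    if header.get("resultCode") != "00":
        raise WeatherApiError(
            f"API error {header.get('resultCode')}: {header.get('resultMsg')}"
        )

    # Reject items that lack any of the expected keys before loading them.
    for item in items:
        missing = REQUIRED_ITEM_KEYS - item.keys()
        if missing:
            raise WeatherApiError(f"Item missing keys: {sorted(missing)}")
    return items
```

Raising a dedicated exception lets the orchestrator mark the task as failed instead of loading partial data into the master tables.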
- Processes raw data into model features
- Handles missing values and outliers
- Creates time-based features
- Automated training pipeline
- Model versioning with MLflow
- Performance tracking and comparison
- GitHub Actions workflow for automated deployment: deploy-model-sagemaker
- Monitor pipeline health through Airflow UI
- Track model performance in MLflow
- Set up alerts for pipeline failures
- Retrain the model regularly on a date-based schedule
Common issues and solutions:
- Pipeline failures
  - Check Airflow logs
  - Verify API connectivity
  - Ensure AWS credentials are valid
- Model deployment issues
  - Verify SageMaker endpoint status
  - Check model artifacts in MLflow
  - Review deployment logs
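
Checking the endpoint status can be scripted. The sketch below takes a SageMaker client as a parameter and relies on the boto3 `describe_endpoint` call, whose response includes an `EndpointStatus` field. The helper names and the endpoint name in the usage comment are hypothetical.

```python
def endpoint_status(client, endpoint_name: str) -> str:
    """Return the status string SageMaker reports for the endpoint."""
    response = client.describe_endpoint(EndpointName=endpoint_name)
    return response["EndpointStatus"]


def is_endpoint_in_service(client, endpoint_name: str) -> bool:
    """True only when the endpoint is fully deployed and serving traffic."""
    return endpoint_status(client, endpoint_name) == "InService"


# Usage (needs boto3 and valid AWS credentials; the endpoint name is made up):
#   import boto3
#   client = boto3.client("sagemaker")
#   print(endpoint_status(client, "weather-model-endpoint"))
```

A status such as `Creating` or `Updating` usually means the deployment is still in progress, while `Failed` points to the deployment logs as the next place to look.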
{
  "response": {
    "header": {
      "resultCode": "00",
      "resultMsg": "NORMAL_SERVICE"
    },
    "body": {
      "dataType": "JSON",
      "items": {
        "item": [
          {
            "baseDate": "20250313",
            "baseTime": "0000",
            "category": "PTY",
            "nx": 86,
            "ny": 106,
            "obsrValue": "0"
          },
          {
            "baseDate": "20250313",
            "baseTime": "0000",
            "category": "RN1",
            "nx": 86,
            "ny": 106,
            "obsrValue": "0"
          },
          {
            "baseDate": "20250313",
            "baseTime": "0000",
            "category": "T1H",
            "nx": 86,
            "ny": 106,
            "obsrValue": "6.4"
          }
        ]
      },
      "pageNo": 1,
      "numOfRows": 10,
      "totalCount": 8
    }
  }
}
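
The response above carries one item per category, so loading it into a master table typically means pivoting the items into one row per observation time and grid cell. The following is a minimal sketch under two assumptions: that every `obsrValue` is a numeric string, as in the three items shown, and that a wide row keyed by `category` is the desired table shape. `pivot_items` is a hypothetical name.

```python
from typing import Any


def pivot_items(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Collapse per-category items into one row per (baseDate, baseTime, nx, ny)."""
    rows: dict[tuple, dict[str, Any]] = {}
    for item in items:
        key = (item["baseDate"], item["baseTime"], item["nx"], item["ny"])
        row = rows.setdefault(
            key,
            {"baseDate": key[0], "baseTime": key[1], "nx": key[2], "ny": key[3]},
        )
        # Assumption: obsrValue is always a numeric string such as "6.4".
        row[item["category"]] = float(item["obsrValue"])
    return list(rows.values())
```

Applied to the three items shown, this yields a single row with `PTY`, `RN1`, and `T1H` columns alongside the date, time, and grid coordinates.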

