This repo contains the code, images, and documentation to run a basic MLOps pipeline.
To make the pipeline easier to understand, I propose the following component diagram, which depicts the components involved in the continuous integration (CI) and continuous training (CT) pipelines.
The components and component packages depicted in the previous image are the ones to be covered by unit and integration tests before any commit is merged into the master branch. This helps us avoid merging untested changes that could crash the workflow. Specifically, I intend to test the following points of the workflow depicted in the image:
- The data collection step creates its versioned artifacts.
- The data preprocessing step creates its versioned artifacts.
- The model training doesn't overfit and returns the expected artifacts (e.g., confusion matrix, scores, and metrics).
- The workflow logic works properly.
- The model registry creates the new model version artifacts when a new version is required.
- The model publishing creates the model artifacts in the production environment.
- Data drift and concept drift trigger the CT pipeline when they are detected.
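
As a rough illustration of what two of these checks could look like as unit tests, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for illustration: the helper names `overfits` and `drift_detected`, the `max_gap` and `threshold` defaults, and the mean-shift heuristic for drift are hypothetical and are not taken from this repo's code.

```python
from statistics import mean, stdev
from typing import Sequence


def overfits(train_score: float, val_score: float, max_gap: float = 0.05) -> bool:
    """Hypothetical check: flag overfitting when the training score
    exceeds the validation score by more than `max_gap`."""
    return (train_score - val_score) > max_gap


def drift_detected(
    reference: Sequence[float],
    current: Sequence[float],
    threshold: float = 3.0,
) -> bool:
    """Hypothetical check: flag drift when the mean of `current` is more than
    `threshold` reference standard deviations away from the reference mean.
    This is a simple heuristic, not a full statistical drift test."""
    ref_std = stdev(reference)
    if ref_std == 0:
        # Constant reference data: any change in the mean counts as drift.
        return mean(current) != mean(reference)
    return abs(mean(current) - mean(reference)) / ref_std > threshold


def test_training_does_not_overfit():
    assert not overfits(train_score=0.91, val_score=0.89)
    assert overfits(train_score=0.99, val_score=0.80)


def test_drift_triggers_ct():
    reference = [1.0, 1.1, 0.9, 1.0, 1.05]
    # Values close to the reference distribution should not trigger CT.
    assert not drift_detected(reference, [1.0, 0.95, 1.02])
    # A clearly shifted batch should trigger CT.
    assert drift_detected(reference, [5.0, 5.2, 4.9])
```

Tests written this way can be collected by a runner such as pytest in the CI pipeline, so a commit that breaks either check would fail before it reaches the master branch.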
