Commit a4d6276

Refactoring to sync all MLFlow models and deploy using Helm (#6)
* Refactoring
* split k8s resource manifests, wrap into helm, deploy with makefile
* fix env pass
* fix env pass (2)
* fix linter
* add more logging, fix deployment sync, adjust resources
* lower model name
* fix logging, sleep
* await _deploy_model
* switch cluster, add comments with assumptions, fix sync logic
* fix wrong try\except wrapper
* add secret creation
* change login process
* return neuro-extras dependency
* return neuro-extras dependency(2)
* move login into bash script
* fix init container
* switch cluster within init container
* decrease logging level
* minor changes - add delete command, handle k8s signals in api
* fix function name
* a bit more fixes in termination handler
* lint
* raise instead of exit in signal handler
* python3.9 works
* minor fix
* add readme
* bump installation commit hash
* handle mlflow server unreachability / sso enabled
* mention possibility to use another docker image for deployment
* add dockerdeploy, ci/cd, precommit hooks and update deps
* fix mypy
* do not release pypi package, simplify cd
* address artemlops review comments
1 parent fdde69b commit a4d6276

28 files changed, +777 -519 lines changed

.github/workflows/cd.yaml

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
+name: CI
+
+on:
+  release:
+    types: [published]
+
+jobs:
+
+  info:
+    name: Prepare release
+    runs-on: ubuntu-latest
+    if: github.event_name == 'release'
+    timeout-minutes: 5
+    outputs:
+      version: ${{ steps.version.outputs.version }}
+      name: ${{ steps.version.outputs.name }}
+    steps:
+    - name: Checkout commit
+      uses: actions/checkout@v2
+    - name: Setup Python 3.9
+      uses: actions/setup-python@v2
+      with:
+        python-version: '3.9'
+    - name: Cache PyPI
+      uses: actions/[email protected]
+      with:
+        key: pip-lint-${{ hashFiles('requirements/*.txt') }}
+        path: ~/.cache/pip
+        restore-keys: |
+          pip-lint-
+    - name: Install dependencies
+      run: |
+        python -m pip install pip
+        make setup
+    - name: Save the package version
+      id: version
+      run: |
+        echo "::set-output name=version::$(python setup.py --version)"
+        echo "::set-output name=name::$(python setup.py --name)"
+    - name: Show version
+      run: |
+        echo ${{ steps.version.outputs.name }}==${{ steps.version.outputs.version }}
+
+  dockerhub_deploy:
+    name: Release image if needed
+    runs-on: ubuntu-latest
+    needs: [info]
+    if: github.event_name == 'release'
+    env:
+      DOCKER_SERVER: docker.io
+      DOCKER_USERNAME: ${{ secrets.DOCKER_USERNAME }}
+      DOCKER_PASSWORD: ${{ secrets.DOCKER_PASSWORD }}
+    steps:
+    - name: Checkout commit
+      uses: actions/checkout@v2
+    - name: Install python
+      uses: actions/setup-python@v2
+      with:
+        python-version: '3.9'
+    - name: Build release image
+      run: |
+        export PACKAGE="${{ needs.info.outputs.name }}==${{ needs.info.outputs.version }}"
+        docker build -t neuromation/${{ needs.info.outputs.name }}:latest .
+    - name: Push release image
+      id: push
+      run: |
+        export IMAGE=neuromation/${{ needs.info.outputs.name }}
+        export TAG=${{ needs.info.outputs.version }}
+
+        echo "::set-output name=image::$IMAGE"
+        echo "::set-output name=tag::$TAG"
+
+        docker login $DOCKER_SERVER --username $DOCKER_USERNAME --password $DOCKER_PASSWORD
+        docker tag $IMAGE:latest $IMAGE:$TAG
+        docker push $IMAGE:$TAG
+        # Push latest if it's not an alpha release:
+        #   not alpha: TAG="20.9.29"
+        #   alpha: TAG="20.9.29a1"
+        if [[ ! "$TAG" =~ ^.*a[0-9]*$ ]]; then
+          docker push $IMAGE:latest
+        fi
+    - name: Wait until image becomes available on DockerHub
+      timeout-minutes: 10
+      run: |
+        export IMAGE=${{ steps.push.outputs.image }}
+        export TAG=${{ steps.push.outputs.tag }}
+
+        docker_tag_exists () {
+          CT="Content-Type: application/json"
+          U=$DOCKER_USERNAME
+          P=$DOCKER_PASSWORD
+          URL=https://hub.docker.com
+          TOKEN=$(curl -s -H "$CT" -X POST -d '{"username": "'$U'", "password": "'$P'"}' $URL/v2/users/login/ | jq -r .token)
+          curl --silent -f --head -lL $URL/v2/repositories/$1/tags/$2/ > /dev/null
+        }
+
+        until docker_tag_exists $IMAGE $TAG
+        do
+          echo "Waiting for the dockerhub image $IMAGE:$TAG ..."
+          sleep 1
+        done
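
Review note on the "Push latest if it's not an alpha release" step: the bash test `^.*a[0-9]*$` can be checked offline with the same pattern in Python. This is only a sketch for reasoning about the regex; the function name `should_push_latest` is hypothetical and not part of the commit.

```python
import re

# Same pattern as the workflow's bash test:
# anything, then "a", then zero or more digits up to the end of the tag.
ALPHA_TAG = re.compile(r"^.*a[0-9]*$")


def should_push_latest(tag: str) -> bool:
    """Return True when the tag does not look like an alpha release."""
    return ALPHA_TAG.match(tag) is None


# Print the decision for a stable tag, an alpha tag, and a tag ending in a bare "a".
for tag in ("20.9.29", "20.9.29a1", "20.9.29a"):
    print(tag, should_push_latest(tag))
```

Because `[0-9]*` also accepts zero digits, any tag that merely ends in the letter "a" is treated as alpha and will not update `latest`.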

.github/workflows/ci.yaml

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+name: "CI"
+
+on:
+  push:
+    branches: [master]
+  pull_request:
+    branches: [master]
+
+jobs:
+
+  lint:
+    name: Linter
+    runs-on: ubuntu-latest
+    timeout-minutes: 5
+    steps:
+    - name: Checkout commit
+      uses: actions/checkout@v2
+    - name: Setup Python 3.9
+      uses: actions/setup-python@v2
+      with:
+        python-version: '3.9'
+    - name: Cache PyPI
+      uses: actions/[email protected]
+      with:
+        key: pip-lint-${{ hashFiles('requirements/*.txt') }}
+        path: ~/.cache/pip
+        restore-keys: |
+          pip-lint-
+    - name: Install dependencies
+      run: |
+        python -m pip install pip
+        make setup
+    - name: Run linters
+      run: |
+        make lint

.pre-commit-config.yaml

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+repos:
+- repo: https://github.com/pre-commit/pre-commit-hooks
+  rev: 'v3.4.0'
+  hooks:
+  - id: check-merge-conflict
+    exclude: "rst$"
+- repo: https://github.com/asottile/yesqa
+  rev: v1.2.3
+  hooks:
+  - id: yesqa
+- repo: https://github.com/pre-commit/mirrors-isort
+  rev: 'v5.8.0'
+  hooks:
+  - id: isort
+- repo: https://github.com/psf/black
+  rev: '21.4b1'
+  hooks:
+  - id: black
+    language_version: python3 # Should be a command that runs python3.6+
+- repo: https://github.com/pre-commit/pre-commit-hooks
+  rev: 'v3.4.0'
+  hooks:
+  - id: check-case-conflict
+  - id: check-json
+  - id: check-xml
+  - id: debug-statements
+  - id: check-added-large-files
+  - id: end-of-file-fixer
+  - id: requirements-txt-fixer
+  - id: trailing-whitespace
+  - id: check-symlinks
+  - id: debug-statements
+# Another entry is required to apply file-contents-sorter to another file
+- repo: https://github.com/pre-commit/pre-commit-hooks
+  rev: 'v3.4.0'
+  hooks:
+  - id: file-contents-sorter
+    files: |
+      docs/spelling_wordlist.txt|
+      .gitignore
+# - repo: https://github.com/asottile/pyupgrade
+#   rev: 'v2.15.0'
+#   hooks:
+#   - id: pyupgrade
+#     args: ['--py36-plus']
+- repo: https://gitlab.com/pycqa/flake8
+  rev: '3.9.1'
+  hooks:
+  - id: flake8

Dockerfile

Lines changed: 11 additions & 24 deletions
@@ -1,24 +1,11 @@
-#FROM python:3.7
-#
-## Install code
-#COPY platform_integrations /opt/platform_integrations
-#RUN pip install -U pip \
-# && pip install -e /opt/platform_integrations
-#
-## Install kubectl
-#RUN cd /tmp \
-# && curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.20.1/bin/linux/amd64/kubectl \
-# && chmod +x ./kubectl \
-# && mv ./kubectl /usr/local/bin/kubectl \
-# && kubectl version --client
-#
-### Install neuro clis
-##RUN pip install -U \
-## neuro-cli==20.12.16 \
-## neuro-extras==20.12.16
-#
-#
-#ENV NP_INTEGRATIONS_API_PORT=8080
-#EXPOSE $NP_INTEGRATIONS_API_PORT
-#
-#CMD platform-integrations
+FROM python:3.9
+
+RUN curl --output /usr/local/bin/kubectl -L \
+    https://storage.googleapis.com/kubernetes-release/release/v1.20.1/bin/linux/amd64/kubectl && \
+    chmod +x /usr/local/bin/kubectl;
+
+# package version is to be overloaded with exact version
+WORKDIR /app
+COPY . /app
+
+RUN pip install -e .

Makefile

Lines changed: 36 additions & 9 deletions
@@ -1,18 +1,45 @@
-IMAGE = platformintegrations:latest
-IMAGE_ALIAS ?= image:/onprem-poc/artemyushkovskiy/$(IMAGE)
-# IMAGE_REF ?= registry.onprem-poc.org.neu.ro/artemyushkovskiy/$(IMAGE)
+HELM_CHART = mlflow2seldon
+
+SVC_DEPLOYMENT_NAMESPACE = mlops-integrations
+
+M2S_MLFLOW_NEURO_TOKEN ?= $(shell neuro config show-token)
+M2S_MLFLOW_HOST ?= $(shell read -p "MLFlow server hostname: " x; echo $$x)
+M2S_MLFLOW_STORAGE_ROOT ?= $(shell read -p "MLFlow artifact root path on storage: " x; echo $$x)
+M2S_SELDON_NEURO_DEF_IMAGE ?= $(shell read -p "Default Seldon deployment image name: " x; echo $$x)
+M2S_SRC_NEURO_CLUSTER ?= $(shell read -p "Neuro cluster where MLFlow is running: " x; echo $$x)
+
 
 setup:
-	pip install -U pip
-	pip install -e .
 	pip install -r requirements/syntax.txt
+	pre-commit install
 
 lint: format
-	mypy platform_integrations tests setup.py
+	mypy mlflow2seldon setup.py
 
 format:
 	pre-commit run --all-files --show-diff-on-failure
 
-neuro_build:
-	pip install -U neuro-extras
-	neuro-extras image build -f Dockerfile . $(IMAGE_ALIAS)
+_helm_fetch:
+	rm -rf temp_deploy
+	mkdir -p temp_deploy/$(HELM_CHART)
+	cp -Rf deploy/$(HELM_CHART) temp_deploy/
+	find temp_deploy/$(HELM_CHART) -type f -name 'values*' -delete
+
+_helm_expand_vars:
+	export M2S_MLFLOW_NEURO_TOKEN=$(M2S_MLFLOW_NEURO_TOKEN); \
+	export M2S_MLFLOW_HOST=$(M2S_MLFLOW_HOST); \
+	export M2S_MLFLOW_STORAGE_ROOT=$(M2S_MLFLOW_STORAGE_ROOT); \
+	export M2S_SELDON_NEURO_DEF_IMAGE=$(M2S_SELDON_NEURO_DEF_IMAGE); \
+	export M2S_SRC_NEURO_CLUSTER=$(M2S_SRC_NEURO_CLUSTER); \
+	neuro config switch-cluster $${M2S_SRC_NEURO_CLUSTER}; \
+	cat deploy/$(HELM_CHART)/values-make.yaml | envsubst > temp_deploy/$(HELM_CHART)/values-make.yaml
+	cp deploy/$(HELM_CHART)/values.yaml temp_deploy/$(HELM_CHART)/values.yaml
+	helm lint temp_deploy/$(HELM_CHART)
+
+helm_deploy: _helm_fetch _helm_expand_vars
+	helm upgrade $(HELM_CHART) \
+		temp_deploy/$(HELM_CHART) -f temp_deploy/$(HELM_CHART)/values.yaml -f temp_deploy/$(HELM_CHART)/values-make.yaml \
+		--create-namespace --namespace $(SVC_DEPLOYMENT_NAMESPACE) --install --wait --timeout 600s
+
+helm_delete:
+	helm uninstall --namespace $(SVC_DEPLOYMENT_NAMESPACE) $(HELM_CHART)
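
Review note on `_helm_expand_vars`: the target renders `values-make.yaml` by piping it through `envsubst`, which replaces `$VAR` and `${VAR}` references with values from the environment. A rough Python approximation of that step is sketched below for illustration. The template keys (`mlflowHost`, `srcCluster`) are hypothetical, since the contents of `values-make.yaml` are not shown in this diff, and `string.Template` differs from `envsubst` in edge cases (for example, it treats `$$` as an escape and rejects a lone `$`).

```python
from collections import defaultdict
from string import Template
from typing import Mapping


def expand_vars(template_text: str, env: Mapping[str, str]) -> str:
    """Substitute $VAR / ${VAR} references, roughly like `envsubst`."""
    # Unset variables become empty strings, as envsubst does by default.
    values = defaultdict(str, env)
    return Template(template_text).substitute(values)


# Hypothetical values template; the real file's keys are not in the diff.
template = "mlflowHost: ${M2S_MLFLOW_HOST}\nsrcCluster: $M2S_SRC_NEURO_CLUSTER\n"
env = {
    "M2S_MLFLOW_HOST": "https://mlflow--user.jobs.cluster.org.neu.ro",
    "M2S_SRC_NEURO_CLUSTER": "demo_cluster",
}
print(expand_vars(template, env))
```

In production code one would normally pass `os.environ` as `env`; an explicit mapping is used here to keep the sketch deterministic.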

README.md

Lines changed: 42 additions & 3 deletions
@@ -1,5 +1,44 @@
-# mlops-kube-gateway
+# Neu.ro MLFlow2Seldon deployer
 
-Web service for integration of third-party tools that require direct access to k8s (like Seldon, Algorithmia, MLFlow) with Neu.ro platform.
+An integration service that deploys [MLFlow registered models](https://www.mlflow.org/docs/latest/model-registry.html) as REST/GRPC APIs to a Kubernetes cluster using Seldon-core.
 
-Maintained by Neu.ro MLOps team.
+# Usage
+This service runs inside the Kubernetes cluster where Seldon-core is deployed.
+By continuously fetching the registered models from the MLFlow server (running as a platform job) via the MLFlow Python SDK, it synchronizes the MLFlow state to Seldon-core within the Kubernetes cluster.
+
+For instance, if an MLFlow registered model version is assigned to the Staging/Production stage, the corresponding model binary is deployed from MLFlow into the K8s cluster as a SeldonDeployment (exposing REST/GRPC APIs).
+If the stage assignment is removed or updated, the corresponding SeldonDeployment is changed accordingly.
+
+Given that, all interaction with the service happens implicitly via the MLFlow server state. There is no need to execute particular commands/workloads against this service directly.
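
Review note: the synchronization described in the Usage section can be pictured as a set difference between the model versions that should be deployed and those that already are. The sketch below is an illustration under assumed data shapes, not the service's actual code; `plan_sync`, `DEPLOYABLE_STAGES`, and the `(model_name, version)` keys are hypothetical.

```python
from typing import Dict, Set, Tuple

ModelKey = Tuple[str, str]  # (model_name, version); a hypothetical shape

# Stages that, per the README, should result in a SeldonDeployment.
DEPLOYABLE_STAGES = {"Staging", "Production"}


def plan_sync(
    registry: Dict[ModelKey, str],
    deployed: Set[ModelKey],
) -> Tuple[Set[ModelKey], Set[ModelKey]]:
    """Return (to_deploy, to_delete) for one synchronization pass.

    registry maps each model version to its current MLFlow stage;
    deployed holds the versions that currently have a SeldonDeployment.
    """
    desired = {key for key, stage in registry.items() if stage in DEPLOYABLE_STAGES}
    return desired - deployed, deployed - desired


registry = {
    ("churn", "1"): "Production",
    ("churn", "2"): "None",
    ("fraud", "3"): "Staging",
}
deployed = {("churn", "2"), ("fraud", "3")}
to_deploy, to_delete = plan_sync(registry, deployed)
print("deploy:", sorted(to_deploy))
print("delete:", sorted(to_delete))
```

Running one such pass on a timer gives the "constantly fetching" behaviour the README describes, with MLFlow acting as the single source of truth.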
+
+## Prerequisites and usage assumptions
+- MLFlow
+    - is up and running as a [platform job](https://github.com/neuro-actions/mlflow)
+    - platform SSO is disabled;
+    - the artifact store is a platform storage, mounted as a local path;
+    - mlflow server version is at least `1.11.0`;
+- Seldon
+    - the SeldonDeployment container image ([model wrapper](https://docs.seldon.io/projects/seldon-core/en/stable/python/python_wrapping_docker.html)) should be stored in the platform registry, on the same cluster where MLFlow is running;
+    - at the time of this service deployment, the `kubectl` tool should be authenticated to communicate with the Kubernetes cluster where Seldon is deployed;
+    - seldon-core-operator version is at least `1.5.0`;
+
+## Deployment
+- `make helm_deploy` - will ask several questions (e.g. what is the MLFlow URL, which Neu.ro cluster should be considered, etc.). Alternatively, one might also set the following env vars:
+    - `M2S_MLFLOW_HOST` - MLFlow server host name (example: _https://mlflow--user.jobs.cluster.org.neu.ro_);
+    - `M2S_MLFLOW_STORAGE_ROOT` - artifact root path in the platform storage (_storage:myproject/mlruns_);
+    - `M2S_SELDON_NEURO_DEF_IMAGE` - docker image, stored in a platform registry, which will be used to deploy the model (_image:myproject/seldon:v1_). Alternatively, one might configure the service to use another platform image for deployment by tagging the respective registered model (not a model version (!)) with the tag named after the `M2S_MLFLOW_DEPLOY_IMG_TAG` chart parameter value (for instance, with a tag named "_deployment-image_" and the value "_image:myproject/seldon:v2_");
+    - `M2S_SRC_NEURO_CLUSTER` - Neu.ro cluster where the deployment image, MLFlow artifacts and MLFlow itself are hosted (_demo_cluster_);
+- Direct use of the helm chart is possible, though less convenient: all the information requested by the makefile should be passed as chart values.
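
Review note: the per-model image override described for `M2S_SELDON_NEURO_DEF_IMAGE` amounts to "use the registered model's tag if present, otherwise the default". A minimal sketch of that rule follows; the function name and the plain-dict representation of model tags are assumptions for illustration, not the service's actual API.

```python
from typing import Mapping


def select_deployment_image(
    model_tags: Mapping[str, str],
    default_image: str,
    tag_name: str = "deployment-image",  # stands in for the M2S_MLFLOW_DEPLOY_IMG_TAG value
) -> str:
    """Prefer the image named in the registered model's tag, else the default."""
    override = model_tags.get(tag_name, "").strip()
    return override or default_image


default = "image:myproject/seldon:v1"
print(select_deployment_image({}, default))
print(select_deployment_image({"deployment-image": "image:myproject/seldon:v2"}, default))
```

An empty or whitespace-only tag value falls back to the default here; whether the real service does the same is not stated in the README.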
+
+## Cleanup
+- `make helm_delete` - will delete:
+    - all resources created by this helm chart that the service requires, and the service itself;
+    - the Kubernetes namespace (and, as a result, all the resources within it) where SeldonDeployments were created (`M2S_SELDON_DEPLOYMENT_NS`);
+
+
+# Got questions or suggestions?
+
+
+Feel free to contact us via [:email:](mailto:[email protected]) or @ [slack](https://neuro-community.slack.com/).
+
+Maintained by [Neu.ro](https://neu.ro) MLOps team with :heart:

demo.sh

Lines changed: 0 additions & 14 deletions
This file was deleted.

deploy.sh

Lines changed: 0 additions & 39 deletions
This file was deleted.

deploy/mlflow2seldon/Chart.yaml

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+apiVersion: v2
+name: mlflow2seldon
+description: A Helm chart for mlflow2seldon Kubernetes deployment
+type: application
+version: 0.0.1
+appVersion: 0.0.1
