|
| 1 | +import ReactPlayer from 'react-player' |
| 2 | + |
| 3 | +# Scheduling Metaflow Flows with Kubeflow |
| 4 | + |
| 5 | +:::info Note |
| 6 | +The Kubeflow integration is new and currently released as a Metaflow extension |
| 7 | +which you need to install separately as instructed below. The feature set is |
| 8 | +still evolving and hence subject to change. Reach out to us on [Metaflow Slack](http://slack.outerbounds.co) if you have questions about the integration! |
| 9 | +::: |
| 10 | + |
| 11 | +Metaflow’s Kubeflow integration lets you develop workflows using Metaflow’s full, developer-friendly feature set, and then deploy those flows to your existing Kubeflow infrastructure, where they run seamlessly alongside your current Kubeflow pipelines. |
| 12 | + |
| 13 | +The deployed pipelines are **both valid Metaflow flows and Kubeflow pipelines**, |
| 14 | +allowing you to observe them in real time in the Kubeflow UI and in the Metaflow UI, |
| 15 | +and to access results through Metaflow’s [Client API](/metaflow/client) as usual. |
| 16 | + |
| 17 | +You can see this in action in this short screencast (no sound): |
| 18 | + |
| 19 | +<ReactPlayer controls url="https://youtu.be/ALg0A9SzRG8" /> |
| 20 | +<br/> |
| 21 | + |
| 22 | +## Why use Metaflow with Kubeflow |
| 23 | + |
| 24 | +The video highlights the main benefits of the integration: |
| 25 | + |
| 26 | +- Metaflow provides [**a top-notch developer |
| 27 | + experience**](https://netflixtechblog.com/supercharging-the-ml-and-ai-development-experience-at-netflix-b2d5b95c63eb), |
| 28 | + sparing the developer from Kubernetes-specific technical details. |
| 29 | + |
| 30 | +- You can [test flows locally at arbitrary scale](/scaling/introduction) |
| 31 | + using **the same Kubernetes infrastructure** that you use with Kubeflow. |
| 32 | + |
| 33 | +- Most Metaflow features, such as |
| 34 | + [resuming](/metaflow/debugging#how-to-use-the-resume-command), |
| 35 | + [observability with cards](/metaflow/visualizing-results), |
| 36 | + [dependency management](/scaling/dependencies), |
| 37 | + [configuration management](/metaflow/configuring-flows/introduction), |
| 38 | + [namespaces](/scaling/tagging), |
| 39 | + [artifacts and Client API](/metaflow/client), |
| 40 | + work out of the box, **greatly enhancing the functionality available |
| 41 | + in Kubeflow Pipelines**. |
| 42 | + |
| 43 | +- **Deploy to Kubeflow Pipelines with a single command**, no changes |
| 44 | + in the Metaflow code required. |
| 45 | + |
| 46 | +Note that Kubeflow Pipelines is built on top of Argo Workflows, so its scalability and |
| 47 | +high-availability characteristics closely mirror those of Metaflow’s native |
| 48 | +[Argo Workflows integration](/production/scheduling-metaflow-flows/scheduling-with-argo-workflows), with potential additional overhead |
| 49 | +introduced by the Kubeflow components. |
| 50 | + |
| 51 | +## Setting up the Kubeflow-Metaflow integration |
| 52 | + |
| 53 | +Currently, the integration is provided as a Metaflow Extension which you can |
| 54 | +install as follows: |
| 55 | + |
| 56 | +``` |
| 57 | +pip install metaflow-kubeflow |
| 58 | +``` |
| 59 | + |
| 60 | +Note that you have to install the extension only on the laptop, server, or |
| 61 | +CI/CD worker where you deploy workflows. It doesn't need to be present on |
| 62 | +containers executing tasks. |
| 63 | + |
| 64 | +:::info Note |
| 65 | +To deploy to Kubeflow, you need network access to your Kubeflow Pipelines |
| 66 | +endpoint. If you don't have connectivity set up yet, [see this README |
| 67 | +for suggestions](https://github.com/outerbounds/metaflow-kubeflow). |
| 68 | +::: |
| 69 | + |
| 70 | +You can specify the Kubeflow endpoint address every time you deploy, or you can |
| 71 | +add it in your Metaflow config (typically at `~/.metaflowconfig/config.json`): |
| 72 | +``` |
| 73 | + "METAFLOW_KUBEFLOW_PIPELINES_URL": "http://my-kubeflow", |
| 74 | +``` |
| 75 | +Replace `http://my-kubeflow` with the actual address. |
| 76 | + |
| 77 | +## Pushing a flow to Kubeflow Pipelines |
| 78 | + |
| 79 | +Let's use [the flow from the section about |
| 80 | +parameters](../../metaflow/basics#how-to-define-parameters-for-flows) as an example: |
| 81 | + |
| 82 | +```python |
| 83 | +from metaflow import FlowSpec, Parameter, step |
| 84 | + |
| 85 | +class ParameterFlow(FlowSpec): |
| 86 | + alpha = Parameter('alpha', |
| 87 | + help='Learning rate', |
| 88 | + default=0.01) |
| 89 | + |
| 90 | + @step |
| 91 | + def start(self): |
| 92 | + print('alpha is %f' % self.alpha) |
| 93 | + self.next(self.end) |
| 94 | + |
| 95 | + @step |
| 96 | + def end(self): |
| 97 | + print('alpha is still %f' % self.alpha) |
| 98 | + |
| 99 | +if __name__ == '__main__': |
| 100 | + ParameterFlow() |
| 101 | +``` |
| 102 | + |
| 103 | +Save the flow in a file `parameter_flow.py`. To deploy a version to Kubeflow Pipelines, |
| 104 | +type: |
| 105 | +``` |
| 106 | +python parameter_flow.py kubeflow-pipelines create |
| 107 | +``` |
| 108 | +Or, if you haven't specified `METAFLOW_KUBEFLOW_PIPELINES_URL` in your Metaflow |
| 109 | +config, specify a Kubeflow URL on the command line: |
| 110 | +``` |
| 111 | +python parameter_flow.py kubeflow-pipelines create --url http://my-kubeflow |
| 112 | +``` |
| 113 | + |
| 114 | +This command will [package the flow and its execution |
| 115 | +environment](/scaling/dependencies), convert it to a Kubeflow Pipeline, and deploy |
| 116 | +it to the specified server. |
| 117 | + |
| 118 | +The pipeline name matches the flow name. Each deployment created with |
| 119 | +`kubeflow-pipelines create` produces a new pipeline version, which by default is |
| 120 | +named using the current timestamp, unless you explicitly override the version name |
| 121 | +with the `--version-name` option. |
| 122 | + |
| 123 | +:::info Note |
| 124 | +[Conditional and recursive steps](/metaflow/basics#conditionals), |
| 125 | +introduced in Metaflow 2.18, are not yet supported |
| 126 | +on Kubeflow deployments. Contact [the Metaflow Slack](http://slack.outerbounds.co) if |
| 127 | +you have a use case for this feature. |
| 128 | +::: |
| 129 | + |
| 130 | +## Triggering a Kubeflow run |
| 131 | + |
| 132 | +You can trigger a deployed flow to run on Kubeflow Pipelines with the `trigger` |
| 133 | +command: |
| 134 | +``` |
| 135 | +python parameter_flow.py kubeflow-pipelines trigger |
| 136 | +``` |
| 137 | +Specify `--url` as above, unless the URL is specified in the config. |
| 138 | + |
| 139 | +You can pass parameters to a run as usual: |
| 140 | +``` |
| 141 | +python parameter_flow.py kubeflow-pipelines trigger --alpha 0.5 |
| 142 | +``` |
| 143 | + |
| 144 | +You can also trigger a run as part of a specific [Kubeflow |
| 145 | +Experiment](https://www.kubeflow.org/docs/components/pipelines/concepts/experiment/): |
| 146 | +``` |
| 147 | +python parameter_flow.py kubeflow-pipelines trigger --alpha 0.5 --experiment new_model |
| 148 | +``` |
| 149 | +By default, the latest version of the flow is triggered. You can trigger an |
| 150 | +older version by specifying `--version-name`: |
| 151 | +``` |
| 152 | +python parameter_flow.py kubeflow-pipelines trigger --alpha 0.5 --version-name 20251216021104161376 |
| 153 | +``` |
| 154 | + |
| 155 | +:::info Note |
| 156 | +Currently only manual triggering with `trigger` is supported. If you are interested |
| 157 | +in [scheduled](/api/flow-decorators/schedule) or |
| 158 | +[event-triggered runs](/production/event-triggering), |
| 159 | +reach out to us on [Metaflow Slack](http://slack.outerbounds.co). |
| 160 | +::: |
| 161 | + |
| 162 | +## Inspecting the results of a Kubeflow run |
| 163 | + |
| 164 | +Every Kubeflow run is a valid Metaflow run which you can inspect using the Metaflow UI |
| 165 | +and [the Client API](/metaflow/client) as usual. |
| 166 | + |
| 167 | +Note that [Metaflow's namespaces](/scaling/tagging#production-namespaces) apply on |
| 168 | +Kubeflow too, so to access Kubeflow results, you can switch to [the global |
| 169 | +namespace](/scaling/tagging#global-namespace), like here: |
| 170 | + |
| 171 | +```python |
| 172 | +from metaflow import Flow, namespace |
| 173 | +namespace(None) |
| 174 | +run = Flow('ParameterFlow')['kfp-066dcf8a-61dd-4f61-b652-e161124bc3b3'] |
| 175 | +print(f'Alpha is {run.data.alpha}') |
| 176 | +``` |
| 177 | + |
| 178 | +:::tip Run IDs match |
| 179 | +Metaflow run IDs corresponding to Kubeflow runs match their Kubeflow |
| 180 | +run IDs, prefixed with `kfp-` (e.g. `kfp-066dcf8a-61dd-4f61-b652-e161124bc3b3`), so you can easily identify runs and track lineage between Metaflow and Kubeflow. |
| 181 | +::: |
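The ID mapping described in the tip above can be sketched as a pair of small helpers. This is a minimal illustration, assuming only the `kfp-` prefix convention stated in the tip. The function names `to_metaflow_run_id` and `to_kubeflow_run_id` are hypothetical and not part of Metaflow or the `metaflow-kubeflow` extension.

```python
# Hypothetical helpers; only the `kfp-` prefix convention comes from the docs.
KFP_PREFIX = "kfp-"


def to_metaflow_run_id(kubeflow_run_id: str) -> str:
    """Return the Metaflow run ID for a given Kubeflow run ID."""
    if kubeflow_run_id.startswith(KFP_PREFIX):
        # Already in Metaflow form; return it unchanged.
        return kubeflow_run_id
    return KFP_PREFIX + kubeflow_run_id


def to_kubeflow_run_id(metaflow_run_id: str) -> str:
    """Return the Kubeflow run ID for a Metaflow run ID that starts with `kfp-`."""
    if not metaflow_run_id.startswith(KFP_PREFIX):
        raise ValueError(f"not a Kubeflow-originated run ID: {metaflow_run_id!r}")
    return metaflow_run_id[len(KFP_PREFIX):]


if __name__ == "__main__":
    print(to_metaflow_run_id("066dcf8a-61dd-4f61-b652-e161124bc3b3"))
```

The value returned by `to_metaflow_run_id` is the kind of ID used in the Client API example above, for instance `Flow('ParameterFlow')[to_metaflow_run_id(kubeflow_run_id)]` after calling `namespace(None)`.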