docs/5. Refining/5.0. Design Patterns.md
Lines changed: 39 additions & 37 deletions
@@ -6,55 +6,54 @@ description: Explore common software design patterns for MLOps, including Strate
6
6
7
7
## What is a software design pattern?
8
8
9
-
[Software design patterns](https://en.wikipedia.org/wiki/Software_design_pattern) are proven solutions to common problems encountered during software development. These patterns provide a template for how to solve a problem in a way that has been validated by other developers over time. The concept originated from architecture and was adapted to computer science to help developers design more efficient, maintainable, and reliable code. In essence, design patterns serve as blueprints for solving specific software design issues.
9
+
A [software design pattern](https://en.wikipedia.org/wiki/Software_design_pattern) is a reusable, well-tested solution to a common problem within a given context in software design. Think of it as a blueprint or a template for solving a specific kind of problem, not a finished piece of code. By using established patterns, you can build on the collective experience of other developers to create more efficient, maintainable, and robust applications.
10
10
11
-
## Why do you need software design patterns?
11
+
## Why are design patterns essential for MLOps?
12
12
13
-
- **Freedom of choice**: In the realm of AI/ML, flexibility and adaptability are paramount. Design patterns enable solutions to remain versatile, allowing for the integration of various options and methodologies without locking into a single approach.
14
-
- **Code Robustness**: Python's dynamic nature demands discipline from developers to ensure code robustness. Design patterns provide a structured approach to coding that enhances stability and reliability.
15
-
- **Developer productivity**: Employing the right design patterns can significantly boost developer productivity. These patterns facilitate the exploration of diverse solutions, enabling developers to achieve more in less time and enhance the overall value of their projects.
13
+
In AI/ML engineering, you constantly face challenges related to complexity, change, and scale. Design patterns provide a structured way to manage these challenges.
16
14
17
-
While Python's flexibility is one of its strengths, it can also lead to challenges in maintaining code robustness and reliability. Design patterns help to mitigate these challenges by improving code quality and leveraging proven strategies to refine and enhance the codebase.
15
+
- **Enhance Flexibility**: The AI/ML landscape is always evolving. A new model, data source, or framework might become available tomorrow. Patterns like the Strategy pattern allow you to design systems where components can be swapped out easily without rewriting the entire application.
16
+
- **Improve Code Quality**: Python's dynamic nature offers great flexibility, but it requires discipline to write stable and reliable code. Design patterns enforce structure and best practices, leading to a higher-quality codebase that is easier to debug and maintain.
17
+
- **Boost Productivity**: Instead of reinventing the wheel for common problems like object creation or component integration, you can use a proven pattern. This accelerates development, allowing you to focus on the unique, value-driving aspects of your project.
18
18
19
-
## What are the top design patterns to know?
19
+
## What are the most important design patterns in MLOps?
20
20
21
-
[Design patterns are typically categorized into three types](https://en.wikipedia.org/wiki/Software_design_pattern#Examples):
21
+
[Design patterns are typically categorized into three types](https://en.wikipedia.org/wiki/Software_design_pattern#Examples). For MLOps, a few patterns from each category are particularly vital.
The [Strategy pattern](https://en.wikipedia.org/wiki/Strategy_pattern) is crucial in MLOps for decoupling the objectives (what to do) from the methodologies (how to do it). For example, it allows for the interchange of different algorithms or frameworks (such as TensorFlow, XGBoost, or PyTorch) for model training without altering the underlying code structure. This pattern upholds the Open/Closed Principle, providing the flexibility needed to adapt to changing requirements, such as switching models or data sources based on runtime conditions.
25
+
The Strategy pattern is fundamental for MLOps. It enables you to define a family of algorithms, encapsulate each one, and make them interchangeable. This decouples *what* you want to do (e.g., train a model) from *how* you do it (e.g., using TensorFlow, PyTorch, or XGBoost). This pattern adheres to the Open/Closed Principle, allowing you to add new strategies (like a new modeling framework) without modifying the client code that uses them.
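
As a rough illustration of the idea, here is a minimal sketch of the Strategy pattern using only the standard library. The names `TrainingStrategy`, `TensorFlowStrategy`, `XGBoostStrategy`, and `TrainingJob` are hypothetical, and the `train` bodies are placeholders that return a string instead of calling a real framework.

```python
from abc import ABC, abstractmethod


class TrainingStrategy(ABC):
    """Hypothetical interface shared by every training strategy."""

    @abstractmethod
    def train(self, features: list[float]) -> str:
        """Train on the given features and return a short summary."""


class TensorFlowStrategy(TrainingStrategy):
    def train(self, features: list[float]) -> str:
        # Placeholder: a real strategy would call the framework here.
        return f"tensorflow model trained on {len(features)} rows"


class XGBoostStrategy(TrainingStrategy):
    def train(self, features: list[float]) -> str:
        # Placeholder: a real strategy would call the framework here.
        return f"xgboost model trained on {len(features)} rows"


class TrainingJob:
    """Client code: depends on the interface, never on a concrete framework."""

    def __init__(self, strategy: TrainingStrategy) -> None:
        self.strategy = strategy

    def run(self, features: list[float]) -> str:
        return self.strategy.train(features)


# Swapping the strategy changes the "how" without touching TrainingJob.
job = TrainingJob(strategy=XGBoostStrategy())
summary = job.run([1.0, 2.0, 3.0])
print(summary)
```

Adding another framework in this sketch means writing one more `TrainingStrategy` subclass; `TrainingJob` stays unchanged, which is the Open/Closed Principle in practice.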
26
26
27
27

After establishing common interfaces, the [Factory pattern](https://en.wikipedia.org/wiki/Factory_method_pattern) plays a vital role in enabling runtime behavior modification of programs. It controls object creation, allowing for dynamic adjustments through external configurations. In MLOps, this translates to the ability to alter AI/ML pipeline settings without code modifications. Python's dynamic features, combined with utilities like [Pydantic](https://docs.pydantic.dev/latest/), facilitate the implementation of the Factory pattern by simplifying user input validation and object instantiation.
31
+
The Factory pattern provides a way to create objects without exposing the creation logic to the client. In MLOps, this is incredibly useful for building pipelines that can be configured externally. For example, a factory can read a configuration file (e.g., a YAML file) to determine which type of model, data preprocessor, or evaluation component to instantiate at runtime. This makes your pipelines dynamic and configurable without requiring code changes.
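
One possible way to sketch a configuration-driven factory is shown below. It assumes the configuration has already been loaded into a plain dictionary (for example from a YAML file), and the names `LinearModel`, `TreeModel`, `MODEL_REGISTRY`, `create_model`, and the `kind` key are all hypothetical choices made for this illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class LinearModel:
    learning_rate: float = 0.01


@dataclass
class TreeModel:
    max_depth: int = 3


# Registry mapping a configuration value to the class to instantiate.
MODEL_REGISTRY: dict[str, type] = {"linear": LinearModel, "tree": TreeModel}


def create_model(config: dict[str, Any]) -> Any:
    """Build the model named by config["kind"], passing the other keys as arguments."""
    params = dict(config)  # copy so the caller's config is not mutated
    kind = params.pop("kind")
    try:
        model_class = MODEL_REGISTRY[kind]
    except KeyError:
        raise ValueError(f"Unknown model kind: {kind!r}") from None
    return model_class(**params)


# The caller only supplies data; it never imports or names a concrete class.
model = create_model({"kind": "tree", "max_depth": 5})
print(model)
```

In this sketch, supporting a new model type only requires a new registry entry, so the pipeline code that calls `create_model` does not change.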
32
32
33
33

The [Adapter pattern](https://en.wikipedia.org/wiki/Adapter_pattern) is indispensable in MLOps due to the diversity of standards and interfaces in the field. It provides a means to integrate various external components, such as training and inference systems across different platforms (e.g., Databricks and Kubernetes), by bridging incompatible interfaces. This ensures seamless integration and the generalization of external components, allowing for smooth communication and operation between disparate systems.
37
+
The Adapter pattern acts as a bridge between two incompatible interfaces. The MLOps ecosystem is filled with diverse tools and platforms, each with its own API. An adapter can wrap an existing class with a new interface, allowing it to work with other components seamlessly. For instance, you could use an adapter to make a model trained in Databricks compatible with a serving system running on Kubernetes, ensuring smooth integration between different parts of your stack.
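
The following is a minimal sketch of an adapter, not tied to any real platform API. `LegacyScorer` stands in for an external component with its own method name, and `ScorerAdapter` and `serve` are hypothetical names for the wrapper and for the code that expects a `predict` method.

```python
class LegacyScorer:
    """Stand-in for an external component with its own API (hypothetical)."""

    def score_batch(self, rows: list[list[float]]) -> list[float]:
        # Placeholder scoring logic: sum each row.
        return [sum(row) for row in rows]


class ScorerAdapter:
    """Wraps LegacyScorer and exposes the `predict` method the pipeline expects."""

    def __init__(self, scorer: LegacyScorer) -> None:
        self._scorer = scorer

    def predict(self, inputs: list[list[float]]) -> list[float]:
        # Translate the expected call into the wrapped component's API.
        return self._scorer.score_batch(inputs)


def serve(model, inputs: list[list[float]]) -> list[float]:
    """Serving code that only knows about `predict`."""
    return model.predict(inputs)


predictions = serve(ScorerAdapter(LegacyScorer()), [[1.0, 2.0], [3.0]])
print(predictions)
```

The wrapped class is left untouched; only the thin adapter knows about both interfaces.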
38
38
39
39

40
40
41
-
## How can you define software interfaces with Python?
41
+
## How can you define software interfaces in Python?
42
42
43
-
Python supports two primary methods for defining interfaces: [Abstract Base Classes (ABC)](https://docs.python.org/3/library/abc.html) and [Protocols](https://peps.python.org/pep-0544/).
43
+
In Python, an interface defines a contract for what methods a class should implement. This is key to patterns like Strategy, where different implementations must conform to a single API. Python offers two main ways to define interfaces: [Abstract Base Classes (ABC)](https://docs.python.org/3/library/abc.html) and [Protocols](https://peps.python.org/pep-0544/).
44
44
45
-
[ABCs](https://docs.python.org/3/library/abc.html) utilize [Nominal Typing](https://en.wikipedia.org/wiki/Nominal_type_system) to establish clear class hierarchies and relationships, such as a RandomForestModel being a subtype of a Model. This approach makes the connection between classes explicit:
45
+
**Abstract Base Classes (ABCs)** use [Nominal Typing](https://en.wikipedia.org/wiki/Nominal_type_system), where a class must explicitly inherit from the ABC to be considered a subtype. This creates a clear, formal relationship.
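
To make the nominal relationship concrete, here is a small standard-library sketch. The `Model` name comes from the text above, but its `fit`/`predict` signatures, along with `MeanModel` and `IncompleteModel`, are assumptions made for this illustration.

```python
from abc import ABC, abstractmethod


class Model(ABC):
    """Illustrative interface: subclasses must implement fit and predict."""

    @abstractmethod
    def fit(self, inputs: list[float], targets: list[float]) -> None: ...

    @abstractmethod
    def predict(self, inputs: list[float]) -> list[float]: ...


class MeanModel(Model):
    """Explicitly inherits from Model, so it is a Model by declaration."""

    def fit(self, inputs: list[float], targets: list[float]) -> None:
        self.mean = sum(targets) / len(targets)

    def predict(self, inputs: list[float]) -> list[float]:
        return [self.mean] * len(inputs)


class IncompleteModel(Model):
    """Forgets to implement predict, so it cannot be instantiated."""

    def fit(self, inputs: list[float], targets: list[float]) -> None:
        pass


model = MeanModel()
model.fit([1.0, 2.0], [2.0, 4.0])
print(isinstance(model, Model))

try:
    IncompleteModel()
except TypeError as error:
    # The ABC machinery rejects classes with unimplemented abstract methods.
    print(f"Rejected: {error}")
```

The check happens when the class is instantiated, so a missing method surfaces early instead of at the first call to `predict`.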
Conversely, [Protocols](https://peps.python.org/pep-0544/) adhere to the [Structural Typing](https://en.wikipedia.org/wiki/Structural_type_system) principle, embodying [Python's duck typing philosophy](https://en.wikipedia.org/wiki/Duck_typing) where a class is considered compatible if it implements certain methods, regardless of its place in the class hierarchy. This means a RandomForestModel is recognized as a Model by merely implementing the expected behaviors.
77
+
**Protocols** use [Structural Typing](https://en.wikipedia.org/wiki/Structural_type_system), which aligns with Python's "duck typing" philosophy. A class conforms to a protocol if it has the right methods and signatures, regardless of its inheritance.
79
78
80
79
```python
81
80
from typing import Protocol, runtime_checkable
@@ -97,6 +96,7 @@ class RandomForestModel:
97
96
print("Predicting with RandomForestModel...")
98
97
return pd.DataFrame()
99
98
99
+
# This works because SVMModel has the required 'fit' and 'predict' methods.
Choosing between ABCs and Protocols depends on your project's needs. ABCs offer a more explicit, structured approach suitable for applications, while Protocols offer flexibility and are more aligned with library development.
109
+
| Feature | Abstract Base Classes (ABCs) | Protocols |
| **Relationship** | Explicit and clear hierarchy | Implicit, based on structure |
114
+
| **Best For** | Applications where you control the class hierarchy. | Libraries where you want to support classes you don't own. |
110
115
111
-
## How can you better validate and instantiate your objects?
116
+
## How can you simplify object validation and creation?
112
117
113
-
[Pydantic](https://docs.pydantic.dev/latest/) is a valuable tool for defining, validating, and instantiating objects according to specified requirements. It utilizes type annotations to ensure inputs meet predefined criteria, significantly reducing the risk of errors in data-driven operations, such as in MLOps processes.
118
+
[Pydantic](https://docs.pydantic.dev/latest/) is an essential library for modern Python development that uses type annotations for data validation and settings management. It is especially powerful in MLOps for ensuring data integrity and simplifying the implementation of creational patterns.
114
119
115
120
### Validating Objects with Pydantic
116
121
117
-
[Pydantic](https://docs.pydantic.dev/latest/) utilizes Python's type hints to validate data, ensuring that the objects you create adhere to your specifications from the get-go. [This feature is particularly valuable in MLOps](https://fmind.medium.com/make-your-mlops-code-base-solid-with-pydantic-and-pythons-abc-aeedfe9c3e65), where data integrity is crucial for the success of machine learning models. Here's how you can leverage Pydantic for object validation:
122
+
Pydantic models validate data on initialization. This ensures that any configuration or data object meets your requirements before it's used in a pipeline, preventing runtime errors.
In this example, Pydantic ensures that `n_estimators` is greater than 0, `max_depth` is either greater than 0 or `None`, and similarly for `random_state`. This kind of validation is essential for maintaining the integrity of your model training processes.
133
-
134
-
### Streamlining Object Instantiation with Discriminated Union
139
+
### Streamlining Object Creation with Discriminated Unions
135
140
136
-
[Pydantic's Discriminated Union](https://docs.pydantic.dev/latest/concepts/unions/#discriminated-unions) feature further simplifies object instantiation, allowing you to dynamically select a class based on a specific attribute (e.g., `KIND`). This approach can serve as an efficient alternative to the traditional Factory pattern, reducing the need for boilerplate code:
141
+
[Pydantic's Discriminated Unions](https://docs.pydantic.dev/latest/concepts/unions/#discriminated-unions) provide a powerful and concise way to implement a factory-like behavior. You can define a union of different Pydantic models and select the correct one at runtime based on a "discriminator" field (like `KIND`). This is often cleaner than a traditional Factory pattern.
137
142
138
143
```python
139
144
from typing import Literal, Union
@@ -171,12 +176,9 @@ config = {
171
176
}
172
177
job = Job.model_validate(config)
173
178
```
179
+
This approach makes your code both robust and highly flexible, allowing you to build data-driven systems that are easy to configure and extend.
174
180
175
-
This pattern not only makes the instantiation of objects based on dynamic input straightforward but also ensures that each instantiated object is immediately validated against its respective schema, further enhancing the robustness of your application.
176
-
177
-
Incorporating these practices into your MLOps projects can significantly improve the reliability and maintainability of your code, ensuring that your machine learning pipelines are both efficient and error-resistant.
178
-
179
-
## Design pattern additional resources
181
+
## Additional Resources
180
182
181
183
- **[Design pattern examples from the MLOps Python Package](https://github.com/fmind/mlops-python-package/tree/main/src/bikes)**
182
184
- **[Stop Building Rigid AI/ML Pipelines: Embrace Reusable Components for Flexible MLOps](https://fmind.medium.com/stop-building-rigid-ai-ml-pipelines-embrace-reusable-components-for-flexible-mlops-6e165d837110)**