Commit d6b1a47

Update Chapter 5
1 parent 4476abc commit d6b1a47

8 files changed

+568
-437
lines changed

docs/5. Refining/5.0. Design Patterns.md

Lines changed: 39 additions & 37 deletions
@@ -6,55 +6,54 @@ description: Explore common software design patterns for MLOps, including Strate
66

77
## What is a software design pattern?
88

9-
[Software design patterns](https://en.wikipedia.org/wiki/Software_design_pattern) are proven solutions to common problems encountered during software development. These patterns provide a template for how to solve a problem in a way that has been validated by other developers over time. The concept originated from architecture and was adapted to computer science to help developers design more efficient, maintainable, and reliable code. In essence, design patterns serve as blueprints for solving specific software design issues.
9+
A [software design pattern](https://en.wikipedia.org/wiki/Software_design_pattern) is a reusable, well-tested solution to a common problem within a given context in software design. Think of it as a blueprint or a template for solving a specific kind of problem, not a finished piece of code. By using established patterns, you can build on the collective experience of other developers to create more efficient, maintainable, and robust applications.
1010

11-
## Why do you need software design patterns?
11+
## Why are design patterns essential for MLOps?
1212

13-
- **Freedom of choice**: In the realm of AI/ML, flexibility and adaptability are paramount. Design patterns enable solutions to remain versatile, allowing for the integration of various options and methodologies without locking into a single approach.
14-
- **Code Robustness**: Python's dynamic nature demands discipline from developers to ensure code robustness. Design patterns provide a structured approach to coding that enhances stability and reliability.
15-
- **Developer productivity**: Employing the right design patterns can significantly boost developer productivity. These patterns facilitate the exploration of diverse solutions, enabling developers to achieve more in less time and enhance the overall value of their projects.
13+
In AI/ML engineering, you constantly face challenges related to complexity, change, and scale. Design patterns provide a structured way to manage these challenges.
1614

17-
While Python's flexibility is one of its strengths, it can also lead to challenges in maintaining code robustness and reliability. Design patterns help to mitigate these challenges by improving code quality and leveraging proven strategies to refine and enhance the codebase.
15+
- **Enhance Flexibility**: The AI/ML landscape is always evolving. A new model, data source, or framework might become available tomorrow. Patterns like the Strategy pattern allow you to design systems where components can be swapped out easily without rewriting the entire application.
16+
- **Improve Code Quality**: Python's dynamic nature offers great flexibility, but it requires discipline to write stable and reliable code. Design patterns enforce structure and best practices, leading to a higher-quality codebase that is easier to debug and maintain.
17+
- **Boost Productivity**: Instead of reinventing the wheel for common problems like object creation or component integration, you can use a proven pattern. This accelerates development, allowing you to focus on the unique, value-driving aspects of your project.
1818

19-
## What are the top design patterns to know?
19+
## What are the most important design patterns in MLOps?
2020

21-
[Design patterns are typically categorized into three types](https://en.wikipedia.org/wiki/Software_design_pattern#Examples):
21+
[Design patterns are typically categorized into three types](https://en.wikipedia.org/wiki/Software_design_pattern#Examples). For MLOps, a few patterns from each category are particularly vital.
2222

2323
### [Strategy Pattern](https://en.wikipedia.org/wiki/Strategy_pattern) ([Behavioral](https://en.wikipedia.org/wiki/Software_design_pattern#Behavioral_patterns))
2424

25-
The [Strategy pattern](https://en.wikipedia.org/wiki/Strategy_pattern) is crucial in MLOps for decoupling the objectives (what to do) from the methodologies (how to do it). For example, it allows for the interchange of different algorithms or frameworks (such as TensorFlow, XGBoost, or PyTorch) for model training without altering the underlying code structure. This pattern upholds the Open/Closed Principle, providing the flexibility needed to adapt to changing requirements, such as switching models or data sources based on runtime conditions.
25+
The Strategy pattern is fundamental for MLOps. It enables you to define a family of algorithms, encapsulate each one, and make them interchangeable. This decouples *what* you want to do (e.g., train a model) from *how* you do it (e.g., using TensorFlow, PyTorch, or XGBoost). This pattern adheres to the Open/Closed Principle, allowing you to add new strategies (like a new modeling framework) without modifying the client code that uses them.
2626

2727
![Strategy Pattern for MLOps](https://miro.medium.com/v2/resize:fit:828/format:webp/1*m9gMQClDj_FPmufgLeBz1Q.png)
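
As a rough illustration of the Strategy pattern described above, the sketch below swaps two interchangeable "training" strategies behind one interface. All names here (`TrainingStrategy`, `MeanBaseline`, `MaxBaseline`, `TrainingJob`) are hypothetical and exist only for this example; real strategies would wrap a framework such as TensorFlow, PyTorch, or XGBoost.

```python
from typing import Protocol


class TrainingStrategy(Protocol):
    """Hypothetical interface: any object with a matching train() qualifies."""

    def train(self, data: list[float]) -> str: ...


class MeanBaseline:
    """Toy strategy: 'trains' by computing the mean of the data."""

    def train(self, data: list[float]) -> str:
        return f"mean-baseline(value={sum(data) / len(data):.2f})"


class MaxBaseline:
    """Toy strategy: 'trains' by taking the maximum of the data."""

    def train(self, data: list[float]) -> str:
        return f"max-baseline(value={max(data):.2f})"


class TrainingJob:
    """Client code: knows *what* to do (train), not *how* it is done."""

    def __init__(self, strategy: TrainingStrategy) -> None:
        self.strategy = strategy

    def run(self, data: list[float]) -> str:
        return self.strategy.train(data)


data = [1.0, 2.0, 6.0]
print(TrainingJob(MeanBaseline()).run(data))
print(TrainingJob(MaxBaseline()).run(data))
```

Adding a third strategy means writing one new class; `TrainingJob` stays untouched, which is the Open/Closed Principle in practice.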
2828

2929
### [Factory Pattern](https://en.wikipedia.org/wiki/Factory_method_pattern) ([Creational](https://en.wikipedia.org/wiki/Software_design_pattern#Creational_patterns))
3030

31-
After establishing common interfaces, the [Factory pattern](https://en.wikipedia.org/wiki/Factory_method_pattern) plays a vital role in enabling runtime behavior modification of programs. It controls object creation, allowing for dynamic adjustments through external configurations. In MLOps, this translates to the ability to alter AI/ML pipeline settings without code modifications. Python's dynamic features, combined with utilities like [Pydantic](https://docs.pydantic.dev/latest/), facilitate the implementation of the Factory pattern by simplifying user input validation and object instantiation.
31+
The Factory pattern provides a way to create objects without exposing the creation logic to the client. In MLOps, this is incredibly useful for building pipelines that can be configured externally. For example, a factory can read a configuration file (e.g., a YAML file) to determine which type of model, data preprocessor, or evaluation component to instantiate at runtime. This makes your pipelines dynamic and configurable without requiring code changes.
3232

3333
![Factory Pattern for MLOps](https://miro.medium.com/v2/resize:fit:1100/format:webp/1*_6PLXwRlhe-cj2x2adfiIw.png)
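
One simple way to sketch the configuration-driven factory described above is a registry that maps a key to a class. This is a minimal sketch, not the project's implementation: `LinearModel`, `TreeModel`, `MODEL_REGISTRY`, `create_model`, and the `kind` key are all assumptions made for illustration, and the config is a plain dictionary standing in for a parsed YAML file.

```python
from dataclasses import dataclass


@dataclass
class LinearModel:
    """Hypothetical component with one hyperparameter."""

    learning_rate: float = 0.01


@dataclass
class TreeModel:
    """Hypothetical component with one hyperparameter."""

    max_depth: int = 3


# Registry mapping a configuration key to the class to build (illustrative names).
MODEL_REGISTRY: dict[str, type] = {"linear": LinearModel, "tree": TreeModel}


def create_model(config: dict) -> object:
    """Build a model from a config mapping, e.g. one parsed from a YAML file."""
    params = dict(config)  # copy so the caller's config is not mutated
    kind = params.pop("kind")
    try:
        model_cls = MODEL_REGISTRY[kind]
    except KeyError:
        raise ValueError(f"Unknown model kind: {kind!r}") from None
    return model_cls(**params)


model = create_model({"kind": "tree", "max_depth": 5})
print(model)
```

The client only passes configuration; which class gets instantiated is decided inside the factory, so switching models becomes a config change rather than a code change.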
3434

3535
### [Adapter Pattern](https://en.wikipedia.org/wiki/Adapter_pattern) ([Structural](https://en.wikipedia.org/wiki/Software_design_pattern#Structural_patterns))
3636

37-
The [Adapter pattern](https://en.wikipedia.org/wiki/Adapter_pattern) is indispensable in MLOps due to the diversity of standards and interfaces in the field. It provides a means to integrate various external components, such as training and inference systems across different platforms (e.g., Databricks and Kubernetes), by bridging incompatible interfaces. This ensures seamless integration and the generalization of external components, allowing for smooth communication and operation between disparate systems.
37+
The Adapter pattern acts as a bridge between two incompatible interfaces. The MLOps ecosystem is filled with diverse tools and platforms, each with its own API. An adapter can wrap an existing class with a new interface, allowing it to work with other components seamlessly. For instance, you could use an adapter to make a model trained in Databricks compatible with a serving system running on Kubernetes, ensuring smooth integration between different parts of your stack.
3838

3939
![Adapter pattern for MLOps](https://miro.medium.com/v2/resize:fit:1100/format:webp/1*BkKAZsojOIrg8gF8kDQtTA.png)
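
The adapter idea can be sketched in a few lines. In the example below, `LegacyScorer` stands in for an external component with its own API, and `ScorerAdapter` wraps it so it fits the `predict()` interface the rest of the pipeline expects. Every name is hypothetical; no real Databricks or Kubernetes API is involved.

```python
from typing import Protocol


class Predictor(Protocol):
    """Interface the serving code expects (hypothetical)."""

    def predict(self, inputs: list[dict]) -> list[float]: ...


class LegacyScorer:
    """Stand-in for a third-party component with an incompatible API."""

    def score_rows(self, rows: list[dict]) -> list[float]:
        # Toy scoring rule: the score is the number of fields in each row.
        return [float(len(row)) for row in rows]


class ScorerAdapter:
    """Wraps LegacyScorer and exposes the predict() method clients expect."""

    def __init__(self, scorer: LegacyScorer) -> None:
        self._scorer = scorer

    def predict(self, inputs: list[dict]) -> list[float]:
        return self._scorer.score_rows(inputs)


def serve(predictor: Predictor, inputs: list[dict]) -> list[float]:
    """Serving code that only knows about the Predictor interface."""
    return predictor.predict(inputs)


print(serve(ScorerAdapter(LegacyScorer()), [{"a": 1}, {"a": 1, "b": 2}]))
```

Neither `LegacyScorer` nor `serve` had to change; the translation between the two interfaces lives entirely in the adapter.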
4040

41-
## How can you define software interfaces with Python?
41+
## How can you define software interfaces in Python?
4242

43-
Python supports two primary methods for defining interfaces: [Abstract Base Classes (ABC)](https://docs.python.org/3/library/abc.html) and [Protocols](https://peps.python.org/pep-0544/).
43+
In Python, an interface defines a contract for what methods a class should implement. This is key to patterns like Strategy, where different implementations must conform to a single API. Python offers two main ways to define interfaces: [Abstract Base Classes (ABC)](https://docs.python.org/3/library/abc.html) and [Protocols](https://peps.python.org/pep-0544/).
4444

45-
[ABCs](https://docs.python.org/3/library/abc.html) utilize [Nominal Typing](https://en.wikipedia.org/wiki/Nominal_type_system) to establish clear class hierarchies and relationships, such as a RandomForestModel being a subtype of a Model. This approach makes the connection between classes explicit:
45+
**Abstract Base Classes (ABCs)** use [Nominal Typing](https://en.wikipedia.org/wiki/Nominal_type_system), where a class must explicitly inherit from the ABC to be considered a subtype. This creates a clear, formal relationship.
4646

4747
```python
4848
from abc import ABC, abstractmethod
49-
5049
import pandas as pd
5150

5251
class Model(ABC):
53-
    @abstractmethod
52+
    @abstractmethod
5453
    def fit(self, X: pd.DataFrame, y: pd.DataFrame) -> None:
5554
        pass
5655

57-
    @abstractmethod
56+
    @abstractmethod
5857
    def predict(self, X: pd.DataFrame) -> pd.DataFrame:
5958
        pass
6059

@@ -75,7 +74,7 @@ class SVMModel(Model):
7574
        return pd.DataFrame()
7675
```
7776

78-
Conversely, [Protocols](https://peps.python.org/pep-0544/) adhere to the [Structural Typing](https://en.wikipedia.org/wiki/Structural_type_system) principle, embodying [Python's duck typing philosophy](https://en.wikipedia.org/wiki/Duck_typing) where a class is considered compatible if it implements certain methods, regardless of its place in the class hierarchy. This means a RandomForestModel is recognized as a Model by merely implementing the expected behaviors.
77+
**Protocols** use [Structural Typing](https://en.wikipedia.org/wiki/Structural_type_system), which aligns with Python's "duck typing" philosophy. A class conforms to a protocol if it has the right methods and signatures, regardless of its inheritance.
7978

8079
```python
8180
from typing import Protocol, runtime_checkable
@@ -97,6 +96,7 @@ class RandomForestModel:
9796
        print("Predicting with RandomForestModel...")
9897
        return pd.DataFrame()
9998

99+
# This works because SVMModel has the required 'fit' and 'predict' methods.
100100
class SVMModel:
101101
    def fit(self, X: pd.DataFrame, y: pd.DataFrame) -> None:
102102
        print("Fitting SVMModel...")
@@ -106,34 +106,39 @@ class SVMModel:
106106
        return pd.DataFrame()
107107
```
108108

109-
Choosing between ABCs and Protocols depends on your project's needs. ABCs offer a more explicit, structured approach suitable for applications, while Protocols offer flexibility and are more aligned with library development.
109+
| Feature | Abstract Base Classes (ABCs) | Protocols |
110+
| :--- | :--- | :--- |
111+
| **Typing** | Nominal ("is-a" relationship) | Structural ("behaves-like" relationship) |
112+
| **Inheritance** | Required | Not required |
113+
| **Relationship** | Explicit and clear hierarchy | Implicit, based on structure |
114+
| **Best For** | Applications where you control the class hierarchy. | Libraries where you want to support classes you don't own. |
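
The difference summarized in the table can be observed at runtime. The following sketch uses hypothetical names (`ModelABC`, `ModelProtocol`, `Doubler`, `Incomplete`) and a simplified `predict()` signature without pandas so it runs with the standard library alone.

```python
from abc import ABC, abstractmethod
from typing import Protocol, runtime_checkable


class ModelABC(ABC):
    """Nominal interface: subclasses must inherit and implement predict()."""

    @abstractmethod
    def predict(self, x: float) -> float: ...


@runtime_checkable
class ModelProtocol(Protocol):
    """Structural interface: any class with a predict() method conforms."""

    def predict(self, x: float) -> float: ...


class Doubler:  # no inheritance at all
    def predict(self, x: float) -> float:
        return 2 * x


class Incomplete(ModelABC):  # inherits but does not implement predict()
    pass


print(isinstance(Doubler(), ModelProtocol))  # structural check
print(isinstance(Doubler(), ModelABC))  # nominal check

try:
    Incomplete()
except TypeError as error:
    print(f"Cannot instantiate: {error}")
```

Note that an `isinstance` check against a `runtime_checkable` protocol only verifies that the required method names exist; signatures and return types are left to a static type checker.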
110115

111-
## How can you better validate and instantiate your objects?
116+
## How can you simplify object validation and creation?
112117

113-
[Pydantic](https://docs.pydantic.dev/latest/) is a valuable tool for defining, validating, and instantiating objects according to specified requirements. It utilizes type annotations to ensure inputs meet predefined criteria, significantly reducing the risk of errors in data-driven operations, such as in MLOps processes.
118+
[Pydantic](https://docs.pydantic.dev/latest/) is an essential library for modern Python development that uses type annotations for data validation and settings management. It is especially powerful in MLOps for ensuring data integrity and simplifying the implementation of creational patterns.
114119

115120
### Validating Objects with Pydantic
116121

117-
[Pydantic](https://docs.pydantic.dev/latest/) utilizes Python's type hints to validate data, ensuring that the objects you create adhere to your specifications from the get-go. [This feature is particularly valuable in MLOps](https://fmind.medium.com/make-your-mlops-code-base-solid-with-pydantic-and-pythons-abc-aeedfe9c3e65), where data integrity is crucial for the success of machine learning models. Here's how you can leverage Pydantic for object validation:
122+
Pydantic models validate data on initialization. This ensures that any configuration or data object meets your requirements before it's used in a pipeline, so invalid values fail fast at construction time instead of surfacing as errors deep inside a run.
118123

119124
```python
120125
from typing import Optional
121126
from pydantic import BaseModel, Field
122127

123-
class RandomForestClassifierModel(BaseModel):
124-
    n_estimators: int = Field(default=100, gt=0)
125-
    max_depth: Optional[int] = Field(default=None, gt=0, allow_none=True)
126-
    random_state: Optional[int] = Field(default=None, gt=0, allow_none=True)
128+
class RandomForestConfig(BaseModel):
129+
    n_estimators: int = Field(default=100, description="Number of trees in the forest.", gt=0)
130+
    max_depth: Optional[int] = Field(default=None, description="Maximum depth of the tree.", gt=0)
131+
    random_state: Optional[int] = Field(default=42, description="Controls randomness.")
127132

128-
# Instantiate the model with validated parameters
129-
model = RandomForestClassifierModel(n_estimators=120, max_depth=5, random_state=42)
133+
# Pydantic automatically validates the data upon instantiation.
134+
# This would raise a validation error: RandomForestConfig(n_estimators=-10)
135+
config = RandomForestConfig(n_estimators=150, max_depth=10)
136+
print(config.model_dump_json(indent=2))
130137
```
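
To see what a failed validation looks like in practice, the sketch below catches the error and inspects it. It assumes Pydantic v2 and uses a trimmed-down, hypothetical `ForestConfig` model rather than the class from the example above.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class ForestConfig(BaseModel):
    """Hypothetical, trimmed-down config used only for this illustration."""

    n_estimators: int = Field(default=100, gt=0)
    max_depth: Optional[int] = Field(default=None, gt=0)


try:
    ForestConfig(n_estimators=-10)  # violates the gt=0 constraint
except ValidationError as error:
    # Each issue reports where the problem is ("loc") and why ("msg").
    for issue in error.errors():
        print(issue["loc"], issue["msg"])

# In the default (lax) mode, compatible inputs are coerced to the declared type.
config = ForestConfig(n_estimators="150")
print(config.n_estimators, type(config.n_estimators).__name__)
```

Catching `ValidationError` at the boundary where configuration enters the system gives users a precise message about the offending field instead of a failure later in the pipeline.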
131138

132-
In this example, Pydantic ensures that `n_estimators` is greater than 0, `max_depth` is either greater than 0 or `None`, and similarly for `random_state`. This kind of validation is essential for maintaining the integrity of your model training processes.
133-
134-
### Streamlining Object Instantiation with Discriminated Union
139+
### Streamlining Object Creation with Discriminated Unions
135140

136-
[Pydantic's Discriminated Union](https://docs.pydantic.dev/latest/concepts/unions/#discriminated-unions) feature further simplifies object instantiation, allowing you to dynamically select a class based on a specific attribute (e.g., `KIND`). This approach can serve as an efficient alternative to the traditional Factory pattern, reducing the need for boilerplate code:
141+
[Pydantic's Discriminated Unions](https://docs.pydantic.dev/latest/concepts/unions/#discriminated-unions) provide a powerful and concise way to implement a factory-like behavior. You can define a union of different Pydantic models and select the correct one at runtime based on a "discriminator" field (like `KIND`). This is often cleaner than a traditional Factory pattern.
137142

138143
```python
139144
from typing import Literal, Union
@@ -171,12 +176,9 @@ config = {
171176
}
172177
job = Job.model_validate(config)
173178
```
179+
Because each object is validated against its own schema as it is created, this approach keeps configuration-driven systems both robust and easy to extend.
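
Since the diff above only shows the edges of the example, here is a separate, self-contained sketch of the same idea. It assumes Pydantic v2; the job classes, their fields, and the `Settings` wrapper are hypothetical and are not taken from the file being changed. Only the `KIND` discriminator and the `model_validate` call mirror the original.

```python
from typing import Literal, Union

from pydantic import BaseModel, Field, ValidationError


class TrainingJob(BaseModel):
    """Hypothetical job type selected when KIND == "TrainingJob"."""

    KIND: Literal["TrainingJob"] = "TrainingJob"
    epochs: int = Field(default=1, gt=0)


class InferenceJob(BaseModel):
    """Hypothetical job type selected when KIND == "InferenceJob"."""

    KIND: Literal["InferenceJob"] = "InferenceJob"
    batch_size: int = Field(default=32, gt=0)


class Settings(BaseModel):
    # Pydantic reads KIND from the input and validates against the matching model.
    job: Union[TrainingJob, InferenceJob] = Field(discriminator="KIND")


settings = Settings.model_validate({"job": {"KIND": "InferenceJob", "batch_size": 64}})
print(type(settings.job).__name__, settings.job.batch_size)

try:
    Settings.model_validate({"job": {"KIND": "UnknownJob"}})
except ValidationError:
    print("Unknown KIND values are rejected during validation.")
```

Supporting a new job type then amounts to defining one more model and adding it to the union; no separate factory function needs to be maintained.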
174180

175-
This pattern not only makes the instantiation of objects based on dynamic input straightforward but also ensures that each instantiated object is immediately validated against its respective schema, further enhancing the robustness of your application.
176-
177-
Incorporating these practices into your MLOps projects can significantly improve the reliability and maintainability of your code, ensuring that your machine learning pipelines are both efficient and error-resistant.
178-
179-
## Design pattern additional resources
181+
## Additional Resources
180182

181183
- **[Design pattern examples from the MLOps Python Package](https://github.com/fmind/mlops-python-package/tree/main/src/bikes)**
182184
- **[Stop Building Rigid AI/ML Pipelines: Embrace Reusable Components for Flexible MLOps](https://fmind.medium.com/stop-building-rigid-ai-ml-pipelines-embrace-reusable-components-for-flexible-mlops-6e165d837110)**
