This demo shows how to deploy image generation models (Stable Diffusion / Stable Diffusion 3 / Stable Diffusion XL / FLUX) in the OpenVINO Model Server.
The image generation pipeline is exposed via the [OpenAI API](https://platform.openai.com/docs/api-reference/images/create) `images/generations` endpoint.

> **Note:** This demo was tested on Intel® Xeon®, Intel® Core®, Intel® Arc™ A770 and Intel® Arc™ B580 on Ubuntu 22/24, RedHat 9 and Windows 11.
## Prerequisites

**RAM/vRAM** Select the model size and precision according to your hardware capabilities (RAM/vRAM). The requested resolution also plays a significant role in memory consumption: the higher the resolution, the more RAM/vRAM is required.

**Model preparation** (one of the below):

- preconfigured models from Hugging Face directly in OpenVINO IR format; the list of models uploaded by Intel is available [here](https://huggingface.co/collections/OpenVINO/image-generation-67697d9952fb1eee4a252aa8)
Assuming you have unpacked the model server package, make sure to follow the steps mentioned in the [deployment guide](../../docs/deploying_server_baremetal.md) in every new shell that will start OpenVINO Model Server.

Depending on how you prepared the models in the first step of this demo, they are deployed to either CPU or GPU (as defined in `config.json`). If you run on GPU, make sure the appropriate drivers are installed so the device is accessible for the model server.
```bat
mkdir models

ovms --rest_port 8000 ^
  --model_repository_path ./models/ ^
  --task image_generation ^
  --source_model OpenVINO/FLUX.1-schnell-int4-ov
```
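Once the server is running, you can verify the deployment by sending a generation request. Below is a minimal sketch using `curl`; the `/v3/images/generations` path and the served model name are assumptions based on the command above, and the response carries base64-encoded image data:

```console
curl http://localhost:8000/v3/images/generations -H "Content-Type: application/json" -d "{\"model\": \"OpenVINO/FLUX.1-schnell-int4-ov\", \"prompt\": \"three cute cats sitting on a bench\", \"size\": \"512x512\"}"
```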
Run the `export_model.py` script to download and quantize the model:
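A sketch of such an invocation, assuming the script's `image_generation` task and an illustrative model and precision (see the [export script README](../../demos/common/export_models/README.md) for the exact flags):

```console
python export_model.py image_generation --source_model dreamlike-art/dreamlike-anime-1.0 --weight-format int8 --config_file_path models/config.json --model_repository_path models
```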
## Image Generation Calculator

The image generation pipeline consists of one MediaPipe node - the Image Generation Calculator. To serve an image generation model, it is required to create a MediaPipe graph configuration file that defines the node and its parameters. The graph configuration file is typically named `graph.pbtxt` and is placed in the model directory.

The `graph.pbtxt` file may be created automatically by the Model Server when [using Hugging Face pulling](../pull_hf_models.md) on start-up, automatically via the [export models script](../../demos/common/export_models/), or manually by an administrator.
The calculator has access to the HTTP request and parses it to extract the generation parameters.
The input JSON content should be compatible with the [Image Generation API](../model_server_rest_api_image_generation.md).

The input also includes a side packet with a reference to `IMAGE_GEN_NODE_RESOURCES`, which is a shared object representing multiple OpenVINO GenAI pipelines built from OpenVINO models loaded into memory just once.
**Every node based on the Image Generation Calculator MUST use exactly this specification of the side packet; if it is missing or modified, the model server will fail to serve the graph with the model:**
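A sketch of that declaration as it appears inside the node definition (the `IMAGE_GEN_NODE_RESOURCES` tag comes from the description above; the packet name `pipes` is illustrative):

```
input_side_packet: "IMAGE_GEN_NODE_RESOURCES:pipes"
```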
The calculator produces a `std::string` MediaPipe packet with JSON content representing the OpenAI response format, [described in a separate document](../model_server_rest_api_image_generation.md). The Image Generation Calculator has no support for streaming or partial responses.
Let's have a look at the example graph definition:
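The snippet below is a sketch of what such a graph may look like, assembled from the elements described above; the node name, calculator name, and options message type are assumptions, so the authoritative reference is a `graph.pbtxt` generated by the export script:

```
input_stream: "HTTP_REQUEST_PAYLOAD:input"
output_stream: "HTTP_RESPONSE_PAYLOAD:output"

node: {
  name: "ImageGenExecutor"
  calculator: "ImageGenCalculator"
  input_stream: "HTTP_REQUEST_PAYLOAD:input"
  input_side_packet: "IMAGE_GEN_NODE_RESOURCES:pipes"
  output_stream: "HTTP_RESPONSE_PAYLOAD:output"
  node_options: {
    [type.googleapis.com/mediapipe.ImageGenCalculatorOptions]: {
      models_path: "./"
      device: "CPU"
    }
  }
}
```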
The above node configuration should be used as a template, since the user is not expected to change most of its content. Only `node_options` requires user attention, as it specifies the OpenVINO GenAI pipeline parameters; the rest of the configuration can remain unchanged.
The calculator supports the following `node_options` for tuning the pipeline configuration (a configuration sketch follows the list):

- `required string models_path` - location of the models and scheduler directory (can be relative);
- `optional string device` - device to load the models to. Supported values: "CPU", "GPU", "NPU" [default = "CPU"];
- `optional string plugin_config` - [OpenVINO device plugin configuration](https://docs.openvino.ai/2025/openvino-workflow/running-inference/inference-devices-and-modes.html) and additional pipeline options. Should be provided in the same format as for regular [models configuration](../parameters.md#model-configuration-options). The config is used for all models in the pipeline (text encoders/decoders, unet, vae) except for tokenizers [default = "{}"];
- `optional string max_resolution` - maximum resolution allowed for generation. Requests exceeding this value will be rejected [default = "4096x4096"];
- `optional string default_resolution` - default resolution used for generation. If not specified, the underlying model shape will determine the final resolution;
- `optional uint64 max_num_images_per_prompt` - maximum number of images generated per prompt. Requests exceeding this value will be rejected [default = 10];
- `optional uint64 default_num_inference_steps` - default number of inference steps used for generation, if not specified in the request [default = 50];
- `optional uint64 max_num_inference_steps` - maximum number of inference steps allowed for generation. Requests exceeding this value will be rejected [default = 100].
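For example, a `node_options` block tuned for a GPU deployment might look like the sketch below (the options message name follows the graph example above; the values are illustrative):

```
node_options: {
  [type.googleapis.com/mediapipe.ImageGenCalculatorOptions]: {
    models_path: "./"
    device: "GPU"
    plugin_config: "{\"CACHE_DIR\": \"cache\"}"
    max_resolution: "2048x2048"
    default_num_inference_steps: 30
  }
}
```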
## Models Directory

In the node configuration we set `models_path`, indicating the location of the directory with files loaded by the GenAI engine. It loads the following files:
```
├── model_index.json  <-- GenAI configuration file including pipeline type SD/SDXL/SD3/FLUX
├── README.md
├── safety_checker
│   ├── config.json
│   └── model.safetensors
├── scheduler
│   └── scheduler_config.json
├── text_encoder
│   ├── config.json
│   ├── openvino_model.bin
│   └── openvino_model.xml
├── tokenizer
│   ├── merges.txt
│   ├── openvino_detokenizer.bin
│   ├── openvino_detokenizer.xml
│   ├── openvino_tokenizer.bin
│   ├── openvino_tokenizer.xml
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   └── vocab.json
├── unet
│   ├── config.json
│   ├── openvino_model.bin
│   └── openvino_model.xml
├── vae_decoder
│   ├── config.json
│   ├── openvino_model.bin
│   └── openvino_model.xml
└── vae_encoder
    ├── config.json
    ├── openvino_model.bin
    └── openvino_model.xml
```
- `graph.pbtxt` - MediaPipe graph configuration file defining the Image Generation Calculator node and its parameters.
- `model_index.json` - GenAI configuration file that describes the pipeline type (SD/SDXL/SD3/FLUX) and the models used in the pipeline.
- `scheduler/scheduler_config.json` - configuration file for the scheduler that manages the execution of the models in the pipeline.
- `text_encoder`, `tokenizer`, `unet`, `vae_encoder`, `vae_decoder` - directories containing the OpenVINO models and their configurations for the respective components of the image generation pipeline.
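For illustration, a `model_index.json` for a classic Stable Diffusion pipeline follows the Hugging Face diffusers convention sketched below; the exact entries and class names vary per model and per export, so treat this as an assumption rather than the served file:

```
{
  "_class_name": "StableDiffusionPipeline",
  "_diffusers_version": "0.23.0",
  "scheduler": ["diffusers", "PNDMScheduler"],
  "text_encoder": ["transformers", "CLIPTextModel"],
  "tokenizer": ["transformers", "CLIPTokenizer"],
  "unet": ["diffusers", "UNet2DConditionModel"],
  "vae_decoder": ["diffusers", "AutoencoderKL"],
  "vae_encoder": ["diffusers", "AutoencoderKL"]
}
```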
We recommend using the [export script](../../demos/common/export_models/README.md) to prepare the models directory structure for serving, or simply using [Hugging Face pulling](../pull_hf_models.md) to automatically download and convert models from the Hugging Face Hub.
The `images/generations` endpoint accepts the following request parameters:

| Param | OpenVINO Model Server | OpenAI /images/generations API | Type | Description |
|-------|-----------------------|--------------------------------|------|-------------|
| model | ✅ | ✅ | string (required) | Name of the model to use. Name assigned to a MediaPipe graph configured to schedule generation using the desired image generation model. **Note**: This can also be omitted to fall back to URI based routing. Read more on routing topic **TODO** |
| prompt | ✅ | ✅ | string (required) | A text description of the desired image(s). **TODO**: Length restrictions? Too short/too large? |
| size | ✅ | ✅ | string or null (default: auto) | The size of the generated images. Must be in WxH format, example: `1024x768`. Default model W/H will be used when using `auto`. |
| n | ❌ | ✅ | integer or null (default: `1`) | A number of images to generate. If you want to generate multiple images for the same combination of generation parameters and text prompts, you can use this parameter for better performance, as internally computations will be performed in a batch for the Unet / Transformer models and the text embeddings tensors will also be computed only once. **Not supported for now.** |
| background | ❌ | ✅ | string or null (default: auto) | Allows setting transparency for the background of the generated image(s). Not supported for now. |
| style | ❌ | ✅ | string or null (default: vivid) | The style of the generated images. Recognized OpenAI settings, but not supported: vivid, natural. |
| moderation | ❌ | ✅ | string (default: auto) | Controls the content-moderation level for images generated by the endpoint. Either `low` or `auto`. Not supported for now. |
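For reference, a successful response from the endpoint is a JSON document carrying the generated image(s) as base64 strings; below is a minimal sketch of its shape, assuming the OpenAI-style `data`/`b64_json` fields (see the API document linked above for the authoritative schema):

```
{
  "data": [
    {
      "b64_json": "<base64-encoded image>"
    }
  ]
}
```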