
Commit ec26537
Author: Jan Iwaszkiewicz
[PyOV][DOCS] Update inference documentation with shared memory flags (#18561)
1 parent: d21296b

2 files changed: +17 −11 lines

docs/OV_Runtime_UG/Python_API_inference.md

Lines changed: 10 additions & 8 deletions
@@ -26,16 +26,17 @@ The ``CompiledModel`` class provides the ``__call__`` method that runs a single
    :fragment: [direct_inference]
 
 
-Shared Memory on Inputs
-#######################
+Shared Memory on Inputs and Outputs
+###################################
 
 While using ``CompiledModel``, ``InferRequest`` and ``AsyncInferQueue``,
 OpenVINO™ Runtime Python API provides an additional mode - "Shared Memory".
-Specify the ``shared_memory`` flag to enable or disable this feature.
-The "Shared Memory" mode may be beneficial when inputs are large and copying
-data is considered an expensive operation. This feature creates shared ``Tensor``
+Specify the ``share_inputs`` and ``share_outputs`` flags to enable or disable this feature.
+The "Shared Memory" mode may be beneficial when inputs or outputs are large and copying data is considered an expensive operation.
+
+This feature creates shared ``Tensor``
 instances with the "zero-copy" approach, reducing overhead of setting inputs
-to minimum. Example usage:
+to minimum. For outputs, this feature creates numpy views on the data. Example usage:
 
 
 .. doxygensnippet:: docs/snippets/ov_python_inference.py
@@ -45,13 +46,14 @@ to minimum. Example usage:
 
 .. note::
 
-   "Shared Memory" is enabled by default in ``CompiledModel.__call__``.
+   "Shared Memory" on inputs is enabled by default in ``CompiledModel.__call__``.
    For other methods, like ``InferRequest.infer`` or ``InferRequest.start_async``,
    it is required to set the flag to ``True`` manually.
+   "Shared Memory" on outputs is disabled by default in all sequential inference methods (``CompiledModel.__call__`` and ``InferRequest.infer``). It is required to set the flag to ``True`` manually.
 
 .. warning::
 
-   When data is being shared, all modifications may affect inputs of the inference!
+   When data is being shared, all modifications (including subsequent inference calls) may affect inputs and outputs of the inference!
    Use this feature with caution, especially in multi-threaded/parallel code,
    where data can be modified outside of the function's control flow.

docs/snippets/ov_python_inference.py

Lines changed: 7 additions & 3 deletions
@@ -32,9 +32,13 @@
 request = compiled_model.create_infer_request()
 
 #! [shared_memory_inference]
-# Data can be shared
-_ = compiled_model({"input_0": data_0, "input_1": data_1}, shared_memory=True)
-_ = request.infer({"input_0": data_0, "input_1": data_1}, shared_memory=True)
+# Data can be shared only on inputs
+_ = compiled_model({"input_0": data_0, "input_1": data_1}, share_inputs=True)
+_ = request.infer({"input_0": data_0, "input_1": data_1}, share_inputs=True)
+# Data can be shared only on outputs
+_ = request.infer({"input_0": data_0, "input_1": data_1}, share_outputs=True)
+# Or both flags can be combined to achieve the desired behavior
+_ = compiled_model({"input_0": data_0, "input_1": data_1}, share_inputs=False, share_outputs=True)
 #! [shared_memory_inference]
 
 time_in_sec = 2.0
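The updated docs say that with ``share_outputs`` enabled, outputs are returned as numpy views on the runtime's data rather than copies, and the warning notes that subsequent modifications can affect what the caller already holds. A minimal plain-numpy sketch of that trade-off (it does not use OpenVINO itself; ``internal_buffer`` is a hypothetical stand-in for a runtime-owned output buffer):

```python
import numpy as np

# Stand-in for a buffer owned by the runtime (hypothetical, for illustration).
internal_buffer = np.arange(4, dtype=np.float32)

copied_output = internal_buffer.copy()  # analogous to share_outputs=False: independent copy
shared_output = internal_buffer.view()  # analogous to share_outputs=True: zero-copy view

# Simulate the buffer being overwritten, e.g. by a subsequent inference call.
internal_buffer[0] = 42.0

print(copied_output[0])  # 0.0  -- the copy is unaffected
print(shared_output[0])  # 42.0 -- the view reflects the overwrite
```

This is why the warning advises caution in multi-threaded or parallel code: a shared output stays coupled to the buffer it was taken from.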
