Commit b860988

Save significant memory at model loading time by converting weights to OrtValues early (#26345)
### Description

Converts weights to OrtValues early and reverts "Properly remove in-memory references (#25652)". This reverts commit 3ca49d8 and makes the appropriate adjustments for the current state of the code.

This PR is made possible by, and follows on the heels of, #26263 and #25833. Previous history: #23979, #25320, #25626, #25652.

The first change (#26263) allows us to convert initializers to OrtValues early and save a large amount of memory at model loading time. For the Phi-4-mini-instruct-INT4 model, the before/after memory profiles look like this:

**Before**

![Before change DEBUG 2025-10-16 144819](https://github.com/user-attachments/assets/674ff75b-057f-498a-a906-0140d59d46e6)

**After**

![After change DEBUG 2025-10-16 144819](https://github.com/user-attachments/assets/df1783af-7f50-4cd2-b3ad-6868f23be53f)

The two peaks represent memory usage at optimization time (8.1 GB before) and after weights memory mapping (6.5 GB before). After this change the corresponding numbers are 3.5 GB and 4.7 GB respectively. Most of the savings during the optimization phase come from `ConstantFolding`, where we are able to reuse the resulting OrtValues directly for the new initializers.

This PR concludes a series of PRs converting initializers to OrtValues. Memory consumption before the conversion effort began was 9.3 GB at optimization time and 6.7 GB at steady state, so we are saving almost 6 GB during optimization and 2 GB at steady state.

![image](https://github.com/user-attachments/assets/80e7d228-8a8e-4316-8e04-b02c2be30f04)

The model also loads about 12 seconds faster.

Example of `ConstantFolding` being one of the top contributors: we duplicate memory for a higher peak until `Resolve` takes care of initializers that are no longer used.

![Snapshot 3 Peak on ConstantFolding Transpose Optimizer](https://github.com/user-attachments/assets/95545abd-3f99-46d9-862e-bbf27cbb5b40)

![Snapshot 4 Peak AddInitializer from ConstantFolding](https://github.com/user-attachments/assets/dd457ec6-23ee-4efd-8c60-625d5faad61e)

![image](https://github.com/user-attachments/assets/37c1194d-f683-49a7-afb1-073dfbb9bbfc)

### Motivation and Context

Reduce memory usage.
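No public API changes are involved; the savings apply transparently when a session is created. As a rough illustration, a minimal C++ load of the kind measured above (the model filename is hypothetical):

```cpp
#include <onnxruntime_cxx_api.h>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "phi4-load");
  Ort::SessionOptions opts;  // defaults: graph optimizations enabled

  // Initializer-to-OrtValue conversion now happens early during session
  // creation, which is where the optimization-time peak drops for this model.
  Ort::Session session(env, ORT_TSTR("phi-4-mini-instruct-int4.onnx"), opts);
  return 0;
}
```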
1 parent 860d085 commit b860988

Only a subset of the 45 changed files is shown below.

45 files changed (+257 −291 lines)

include/onnxruntime/core/graph/graph.h

Lines changed: 0 additions & 7 deletions
```diff
@@ -1454,13 +1454,6 @@ class Graph {  // NOLINT(clang-analyzer-optin.performance.Padding): preserve exi
     return Resolve(default_options);
   }
 
-  /// <summary>
-  /// This function converts all the graph TensorProto initializers into OrtValues
-  /// and creates a in-memory external data reference for each OrtValue.
-  /// </summary>
-  /// <returns></returns>
-  Status ConvertInitializersIntoOrtValues();
-
   /**
    * @brief Converts a subset of graph TensorProto initializers into OrtValues and updates the graph proto.
    *
```

onnxruntime/core/framework/session_state_utils.cc

Lines changed: 20 additions & 13 deletions
```diff
@@ -140,12 +140,12 @@ static common::Status DeserializeTensorProto(const Env& env, const std::basic_st
                                 std::move(tensor), ort_value);
     }
   } else {
-    // for internal initializer, always allocate memory on device - tensor
-    ORT_RETURN_IF_ERROR(AllocateTensor(memory_buffer, tensor, type, tensor_shape,
-                                       use_device_allocator_for_initializers, alloc));
-
     if (device == default_cpu_device) {
       // deserialize directly to CPU tensor
+      // Do not use arena for internal initializer, just like we do for OrtValue initializers
+      ORT_RETURN_IF_ERROR(AllocateTensorOnDeviceOrMemory(/* use_device_allocator_for_initializers =*/true,
+                                                         tensor_shape, type,
+                                                         default_cpu_alloc, tensor));
       ORT_RETURN_IF_ERROR(utils::TensorProtoToTensor(env, proto_path.c_str(), tensor_proto, tensor));
       Tensor::InitOrtValue(std::move(tensor), ort_value);
       return common::Status::OK();
@@ -154,13 +154,19 @@ static common::Status DeserializeTensorProto(const Env& env, const std::basic_st
       return ORT_MAKE_STATUS(ONNXRUNTIME, FAIL, "string tensor is not supported for copying between allocators");
     }
 
+    // Allocate according to the plan on the device or directly on the device according to
+    // use_device_allocator_for_initializers
+    ORT_RETURN_IF_ERROR(AllocateTensor(memory_buffer, tensor, type, tensor_shape,
+                                       use_device_allocator_for_initializers, alloc));
+
     // deserialize to CPU first for non-CPU allocator, then copy
     // for internal initializer
-    // 1. allocate memory on CPU - deserialized_tensor
-    // 2. deserialize tensor_proto into a preallocated tensor (deserialized_tensor)
+    // 1. allocate memory on CPU - deserialized_tensor. Do not use arena not to waste space for temporary buffers.
+    // 2. deserialize tensor_proto into a pre-allocated tensor (deserialized_tensor)
     // 3. copy tensor from CPU to device - deserialized_tensor -> tensor (allocated above) -> ort_value
     Tensor deserialized_tensor;
-    ORT_RETURN_IF_ERROR(AllocateTensorOnDeviceOrMemory(use_device_allocator_for_initializers, tensor_shape, type,
+    ORT_RETURN_IF_ERROR(AllocateTensorOnDeviceOrMemory(/* use_device_allocator_for_initializers =*/true,
+                                                       tensor_shape, type,
                                                        default_cpu_alloc, deserialized_tensor));
 
     ORT_RETURN_IF_ERROR(utils::TensorProtoToTensor(env, proto_path.c_str(), tensor_proto, deserialized_tensor));
@@ -346,6 +352,13 @@ common::Status SaveInitializedTensors(
                 << i.second << " bytes for " << i.first.ToString() << std::endl;
   }
 
+  // ??? Should we ignore this session option if the EP is explicitly providing the read only allocator?
+  // bool have_readonly_initializer_allocator = alloc->Info().alloc_type == OrtReadOnlyAllocator;
+  // This option also means to ignore arena if present and use Reserve().
+  const bool use_device_allocator_for_initializers =
+      session_options.config_options.GetConfigOrDefault(
+          kOrtSessionOptionsUseDeviceAllocatorForInitializers, "0") == "1";
+
   // 3. create weight tensors based on weights buffer
   for (const auto& entry : id_to_initialized_tensor) {
     // We check for cancellation for every initializer since mapping from disk can be costly
@@ -375,12 +388,6 @@ common::Status SaveInitializedTensors(
     // TODO: if the tensor need be copied, does it have enough room?
     ORT_RETURN_IF_ERROR(planner.GetPreallocatedBuffer(ort_value_index, name, memory_buffer, alloc));
 
-    // ??? Should we ignore this session option if the EP is explicitly providing the read only allocator?
-    // bool have_readonly_initializer_allocator = alloc->Info().alloc_type == OrtReadOnlyAllocator;
-    const bool use_device_allocator_for_initializers =
-        session_options.config_options.GetConfigOrDefault(
-            kOrtSessionOptionsUseDeviceAllocatorForInitializers, "0") == "1";
-
     // Check if we already have an OrtValue for this initializer on CPU
     if (OrtValue ort_value_from_graph;
         graph.GetOrtValueInitializer(name, ort_value_from_graph)) {
```
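For reference, the `kOrtSessionOptionsUseDeviceAllocatorForInitializers` key read above (now resolved once before the loop instead of per initializer) is the switch users flip through session options. A minimal sketch, assuming the standard config-keys header:

```cpp
#include <onnxruntime_cxx_api.h>
#include <onnxruntime_session_options_config_keys.h>

// Allocate initializers directly on the device allocator (bypassing the
// arena/plan), matching the option the loop above consults.
Ort::SessionOptions opts;
opts.AddConfigEntry(kOrtSessionOptionsUseDeviceAllocatorForInitializers, "1");
```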

onnxruntime/core/framework/tensor.cc

Lines changed: 2 additions & 2 deletions
```diff
@@ -93,14 +93,14 @@ Tensor::Tensor(MLDataType elt_type, const TensorShape& shape, std::shared_ptr<IA
   if (len > 0) {
     p_data = allocator->Alloc(len);
   }
-  Init(elt_type, shape, p_data, allocator, 0L);
+  Init(elt_type, shape, p_data, std::move(allocator), 0L);
 }
 
 Tensor::Tensor(MLDataType elt_type, const TensorShape& shape, void* p_data, std::shared_ptr<IAllocator> deleter,
                ptrdiff_t offset, gsl::span<const int64_t> strides)
     : alloc_info_(deleter->Info()) {
   ORT_ENFORCE(elt_type != nullptr);
-  Init(elt_type, shape, p_data, deleter, offset, strides);
+  Init(elt_type, shape, p_data, std::move(deleter), offset, strides);
 }
 
 void Tensor::InitOrtValue(MLDataType elt_type, const TensorShape& shape, std::shared_ptr<IAllocator> allocator,
```

onnxruntime/core/graph/graph.cc

Lines changed: 43 additions & 33 deletions
```diff
@@ -1231,6 +1231,28 @@ Graph::Graph(const Model& owning_model,
   ArgNameToTypeMap name_to_type_map;
   const auto& model_path = ModelPath();
 
+  // If the tensor proto data is large enough, move data from TensorProto to an OrtValue
+  // - Add external data reference to TensorProto that points to an OrtValue.
+  // This lambda should not be used on initializers that already have external data reference.
+  // Otherwise, this function does nothing.
+  auto put_large_tensor_in_ort_value = [this, &model_path](ONNX_NAMESPACE::TensorProto& tensor_proto) {
+    size_t size_in_bytes = 0;
+    ORT_THROW_IF_ERROR(utils::GetSizeInBytesFromTensorProto<0>(tensor_proto, &size_in_bytes));
+    if (size_in_bytes > utils::kSmallTensorExternalDataThreshold) {
+      OrtValue ort_value;
+      ORT_THROW_IF_ERROR(utils::TensorProtoToOrtValue(Env::Default(), model_path, tensor_proto,
+                                                      CPUAllocator::DefaultInstance(), ort_value));
+      constexpr const bool use_tensor_buffer_true = true;
+      auto tensor_proto_to_add = utils::TensorToTensorProto(ort_value.Get<Tensor>(), tensor_proto.name(),
+                                                            use_tensor_buffer_true);
+      assert(ort_value.IsAllocated());
+      auto ins_result = ortvalue_initializers_.insert_or_assign(tensor_proto_to_add.name(), std::move(ort_value));
+      ORT_ENFORCE(ins_result.second, "Unexpected duplicate insert or assign OrtValue for tensor: ", tensor_proto_to_add.name(),
+                  " in the initializer list.");
+      tensor_proto = std::move(tensor_proto_to_add);
+    }
+  };
+
   // Process 'Constant' nodes
   // Put the 'TensorProto' stored in the 'Constant' nodes attribute into the graphs initializer list
   for (auto& node : graph_proto_->node()) {
@@ -1250,6 +1272,8 @@
       }
     }
 
+    put_large_tensor_in_ort_value(*tensor);
+
     // Ensure initializers are also graph inputs.
     if (ir_version_ < 4) {
       TypeProto t{utils::TypeProtoFromTensorProto(*tensor)};
@@ -1326,7 +1350,25 @@ Graph::Graph(const Model& owning_model,
   }
 
   // Copy initial tensors to a map.
-  for (auto& tensor : graph_proto_->initializer()) {
+  for (int i = 0, lim = graph_proto_->initializer_size(); i < lim; ++i) {
+    auto& tensor = *graph_proto_->mutable_initializer(i);
+    // If data is on disk, it will be loaded either by optimizers
+    // or during session state finalization.
+    // If data is already in memory, do nothing.
+    if (!utils::HasExternalData(tensor)) {
+      // sparse_tensor_names_ contain references to strings to save memory
+      // in case we replace the tensor_proto, we want to make sure we remove
+      // the old reference first, and then add a new one.
+      const bool is_sparse = sparse_tensor_names_.count(tensor.name());
+      if (is_sparse) {
+        sparse_tensor_names_.erase(tensor.name());
+      }
+      put_large_tensor_in_ort_value(tensor);
+      if (is_sparse) {
+        sparse_tensor_names_.emplace(tensor.name());
+      }
+    }
+
     auto p = name_to_initial_tensor_.emplace(tensor.name(), &tensor);
     if (!p.second) {
       LOGS(logger_, WARNING) << "Duplicate initializer (dense, sparse or ConstantNode): '" << tensor.name()
@@ -3415,38 +3457,6 @@ Status Graph::Resolve(const ResolveOptions& options) {
   return ForThisAndAllSubgraphs(all_subgraphs, finalize_func);
 }
 
-Status Graph::ConvertInitializersIntoOrtValues() {
-  std::vector<Graph*> all_subgraphs;
-  FindAllSubgraphs(all_subgraphs);
-
-  auto put_weights_maybe_in_memory_func = [&](Graph& graph) -> Status {
-    // if we have any initializers that are not in memory, put them there.
-    const auto& model_path = graph.ModelPath();
-    auto& graph_proto = *graph.graph_proto_;
-    for (int i = 0, lim = graph_proto.initializer_size(); i < lim; ++i) {
-      auto& tensor_proto = *graph_proto.mutable_initializer(i);
-      if (utils::HasExternalData(tensor_proto)) {
-        continue;  // ignore data on disk, that will be loaded either by EP or at session_state finalize
-      }
-
-      size_t size_in_bytes = 0;
-      ORT_RETURN_IF_ERROR(utils::GetSizeInBytesFromTensorProto<0>(tensor_proto, &size_in_bytes));
-      if (size_in_bytes > utils::kSmallTensorExternalDataThreshold) {
-        OrtValue ort_value;
-        ORT_RETURN_IF_ERROR(utils::TensorProtoToOrtValue(Env::Default(), model_path, tensor_proto,
-                                                         CPUAllocator::DefaultInstance(), ort_value));
-        constexpr const bool use_tensor_buffer_true = true;
-        auto tensor_proto_to_add = utils::TensorToTensorProto(ort_value.Get<Tensor>(), tensor_proto.name(),
-                                                              use_tensor_buffer_true);
-        ORT_RETURN_IF_ERROR(graph.ReplaceInitializedTensor(tensor_proto_to_add, ort_value));
-      }
-    }
-    return Status::OK();
-  };
-
-  return ForThisAndAllSubgraphs(all_subgraphs, put_weights_maybe_in_memory_func);
-}
-
 void Graph::SetName(const std::string& name) {
   graph_proto_->set_name(name);
 }
```
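The erase/re-insert dance around `put_large_tensor_in_ort_value` exists because `sparse_tensor_names_` holds references into the name strings owned by the TensorProtos, and replacing a proto destroys the string an entry points into. A self-contained analogy of that hazard, using `std::string_view` as a stand-in for the reference type:

```cpp
#include <string>
#include <string_view>
#include <unordered_set>

int main() {
  std::unordered_set<std::string_view> sparse_names;  // stand-in for sparse_tensor_names_

  std::string proto = "large_sparse_weight";  // owned by the original TensorProto
  sparse_names.insert(proto);

  // Replacing the TensorProto frees the string the set's entry points into,
  // so the stale reference must be erased before the swap...
  const bool is_sparse = sparse_names.count(proto) > 0;
  if (is_sparse) sparse_names.erase(proto);

  std::string replacement = "large_sparse_weight";  // equal name, new storage
  proto = std::move(replacement);                   // the proto replacement

  // ...and a reference to the new storage re-inserted afterwards.
  if (is_sparse) sparse_names.insert(proto);
  return 0;
}
```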

onnxruntime/core/graph/graph_utils.cc

Lines changed: 6 additions & 6 deletions
```diff
@@ -285,19 +285,19 @@ NodeArg& AddInitializer(Graph& graph, const ONNX_NAMESPACE::TensorProto& new_ini
   return GetOrCreateNodeArg(graph, new_initializer);
 }
 
-NodeArg& AddInitializerWithExternalData(Graph& graph, const ONNX_NAMESPACE::TensorProto& new_initializer) {
+NodeArg& AddInitializerWithOrtValue(Graph& graph, const ONNX_NAMESPACE::TensorProto& new_initializer) {
   const bool has_external_data = utils::HasExternalData(new_initializer);
   ORT_ENFORCE(!has_external_data, "Expecting an initializer that contains data inline");
 
   Tensor tensor;
   ORT_THROW_IF_ERROR(utils::CreateTensorFromTensorProto(Env::Default(), graph.ModelPath(),
                                                         new_initializer, tensor));
   auto tensor_proto_with_ptr = utils::TensorToTensorProto(tensor, new_initializer.name(), true);
-  return AddInitializerWithExternalData(graph, tensor_proto_with_ptr, std::move(tensor));
+  return AddInitializerWithOrtValue(graph, tensor_proto_with_ptr, std::move(tensor));
 }
 
-NodeArg& AddInitializerWithExternalData(Graph& graph, const ONNX_NAMESPACE::TensorProto& new_initializer,
-                                        Tensor&& tensor) {
+NodeArg& AddInitializerWithOrtValue(Graph& graph, const ONNX_NAMESPACE::TensorProto& new_initializer,
+                                    Tensor&& tensor) {
   OrtValue ort_value;
   if (utils::HasExternalDataInMemory(new_initializer)) {
     Tensor::InitOrtValue(std::move(tensor), ort_value);
@@ -307,8 +307,8 @@ NodeArg& AddInitializerWithExternalData(Graph& graph, const ONNX_NAMESPACE::Tens
   return GetOrCreateNodeArg(graph, new_initializer);
 }
 
-NodeArg& AddInitializerWithExternalData(Graph& graph, const ONNX_NAMESPACE::TensorProto& new_initializer,
-                                        OrtValue ort_value) {
+NodeArg& AddInitializerWithOrtValue(Graph& graph, const ONNX_NAMESPACE::TensorProto& new_initializer,
+                                    OrtValue ort_value) {
   ORT_THROW_IF_ERROR(graph.AddInitializedOrtValue(new_initializer, ort_value));
   return GetOrCreateNodeArg(graph, new_initializer);
 }
```

onnxruntime/core/graph/graph_utils.h

Lines changed: 4 additions & 4 deletions
```diff
@@ -45,8 +45,8 @@ NodeArg& AddInitializer(Graph& graph, const ONNX_NAMESPACE::TensorProto& new_ini
 /// <param name="new_initializer">TensorProto with external data contained in ort_value</param>
 /// <param name="ort_value">ort_value with data</param>
 /// <returns></returns>
-NodeArg& AddInitializerWithExternalData(Graph& graph, const ONNX_NAMESPACE::TensorProto& new_initializer,
-                                        OrtValue ort_value);
+NodeArg& AddInitializerWithOrtValue(Graph& graph, const ONNX_NAMESPACE::TensorProto& new_initializer,
+                                    OrtValue ort_value);
 
 /** Add a new initializer to 'graph'.
  * Checks that new_initializer does not already exist in 'graph' before adding it.
@@ -55,7 +55,7 @@ NodeArg& AddInitializerWithExternalData(Graph& graph, const ONNX_NAMESPACE::Tens
  * @returns The NodeArg for the new initializer.
  * @remarks No matching graph input is created, so the initializer will be constant.
  */
-NodeArg& AddInitializerWithExternalData(Graph& graph, const ONNX_NAMESPACE::TensorProto& new_initializer, Tensor&& tensor);
+NodeArg& AddInitializerWithOrtValue(Graph& graph, const ONNX_NAMESPACE::TensorProto& new_initializer, Tensor&& tensor);
 
 /** Add a new initializer to 'graph'.
  * The function unpacks data into a tensor and converts new_initializer to a TensorProto with external data in memory.
@@ -67,7 +67,7 @@ NodeArg& AddInitializerWithExternalData(Graph& graph, const ONNX_NAMESPACE::Tens
  * @returns The NodeArg for the new initializer.
  * @remarks No matching graph input is created, so the initializer will be constant.
  */
-NodeArg& AddInitializerWithExternalData(Graph& graph, const ONNX_NAMESPACE::TensorProto& new_initializer);
+NodeArg& AddInitializerWithOrtValue(Graph& graph, const ONNX_NAMESPACE::TensorProto& new_initializer);
 
 /// <summary>
 /// If the initializer with the given name does not exist in the destination graph, but exists in the
```
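A hypothetical call-site sketch for the renamed overloads, following the pattern the optimizers below switch to (not compilable standalone since it uses ORT-internal headers; `tensor` holds the computed data, and `use_tensor_buffer = true` makes the proto reference the tensor's buffer instead of copying it):

```cpp
// Hypothetical optimizer snippet: publish a computed Tensor as an
// OrtValue-backed initializer rather than serializing its bytes into the proto.
ONNX_NAMESPACE::TensorProto proto =
    onnxruntime::utils::TensorToTensorProto(tensor, "folded_output",
                                            /*use_tensor_buffer=*/true);
onnxruntime::NodeArg& arg =
    onnxruntime::graph_utils::AddInitializerWithOrtValue(graph, proto, std::move(tensor));
```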

onnxruntime/core/optimizer/attention_fusion.cc

Lines changed: 1 addition & 1 deletion
```diff
@@ -111,7 +111,7 @@ static NodeArg& MergeQkvWeights(Graph& graph, int64_t hidden_size,
     utils::SetRawDataInTensorProto(initializer, result.data(), gsl::narrow<size_t>(element_count) * sizeof(MLFloat16));
   }
 
-  return graph_utils::AddInitializer(graph, initializer);
+  return graph_utils::AddInitializerWithOrtValue(graph, initializer);
 }
 
 static NodeArg* ConvertMaskToInt32(Graph& graph, NodeArg* mask_input, ProviderType provider_type,
```

onnxruntime/core/optimizer/compute_optimizer/shared_utils.cc

Lines changed: 1 addition & 1 deletion
```diff
@@ -189,7 +189,7 @@ NodeArg* CreateInitializerFromVector(Graph& graph,
                 "total_count: ", total_count, " values.size(): ", values.size());
 
   utils::SetRawDataInTensorProto(const_tensor, values.data(), values.size() * sizeof(int64_t));
-  return &graph_utils::AddInitializer(graph, const_tensor);
+  return &graph_utils::AddInitializerWithOrtValue(graph, const_tensor);
 }
 
 NodeArg* InsertNodesForValidIndices(Graph& graph,
```

onnxruntime/core/optimizer/constant_folding.cc

Lines changed: 9 additions & 4 deletions
```diff
@@ -95,7 +95,7 @@ static bool ConstantFoldShapeNode(Graph& graph, Node& node) {
     ONNX_NAMESPACE::TensorShapeProto result_shape;
     result_shape.add_dim()->set_dim_value(clamped_slice_length);
     constant_arg_out->SetShape(result_shape);
-    graph_utils::AddInitializer(graph, shape_constant);
+    graph_utils::AddInitializerWithOrtValue(graph, shape_constant);
   }
 
   return is_concrete_shape;  // convert to constant if this is true
@@ -317,19 +317,24 @@ Status ConstantFolding::ApplyImpl(Graph& graph, bool& modified, int graph_level,
         // Build the TensorProto that corresponds to the computed OrtValue and add it as initializer to the graph.
         auto* constant_arg_out = node->MutableOutputDefs()[fetch_idx];
         const Tensor& out_tensor = ort_value.Get<Tensor>();
-        constexpr const bool use_tensor_buffer_false = false;
+        constexpr const bool use_tensor_buffer_true = true;
         ONNX_NAMESPACE::TensorProto out_tensorproto = utils::TensorToTensorProto(
             out_tensor,
             constant_arg_out->Name(),
-            use_tensor_buffer_false);
+            use_tensor_buffer_true);
 
         ONNX_NAMESPACE::TensorShapeProto result_shape;
         for (auto& dim : out_tensor.Shape().GetDims()) {
           result_shape.add_dim()->set_dim_value(dim);
         }
 
         constant_arg_out->SetShape(result_shape);
-        graph.AddInitializedTensor(out_tensorproto);
+        // The data is too small and has been inlined.
+        if (!utils::HasExternalData(out_tensorproto)) {
+          ORT_THROW_IF_ERROR(graph.AddInitializedOrtValue(out_tensorproto, OrtValue()));
+        } else {
+          ORT_THROW_IF_ERROR(graph.AddInitializedOrtValue(out_tensorproto, ort_value));
+        }
       }
     }
   }
```
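One subtlety in the new branch, restated with fuller comments (same ORT-internal names as the diff above): even with `use_tensor_buffer_true`, small fold outputs are still copied inline into the proto, so only protos that came back carrying an in-memory external-data reference need to stay paired with the fold result's `OrtValue`.

```cpp
// Sketch: decide whether the fold result's buffer is shared or was inlined.
// With use_tensor_buffer == true, TensorToTensorProto only emits an in-memory
// external-data reference for tensors above the small-tensor threshold.
if (utils::HasExternalData(out_tensorproto)) {
  // Large output: the proto points at ort_value's buffer - reuse it directly,
  // which is where ConstantFolding's peak-memory savings come from.
  ORT_THROW_IF_ERROR(graph.AddInitializedOrtValue(out_tensorproto, ort_value));
} else {
  // Small output: the data lives inline in the proto; pass an empty OrtValue.
  ORT_THROW_IF_ERROR(graph.AddInitializedOrtValue(out_tensorproto, OrtValue()));
}
```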

onnxruntime/core/optimizer/conv_add_fusion.cc

Lines changed: 2 additions & 2 deletions
```diff
@@ -79,7 +79,7 @@ Status ConvAddFusion::Apply(Graph& graph, Node& node, RewriteRuleEffect& modifie
     auto new_name = graph.GenerateNodeArgName("ConvAddFusion_B_" + B_input_name);
     new_conv_B_tensor_proto.set_name(new_name);
 
-    NodeArg& new_conv_B_node_arg = graph_utils::AddInitializer(graph, new_conv_B_tensor_proto);
+    NodeArg& new_conv_B_node_arg = graph_utils::AddInitializerWithOrtValue(graph, new_conv_B_tensor_proto);
     graph_utils::ReplaceNodeInput(node, 2, new_conv_B_node_arg);
 
   } else {
@@ -94,7 +94,7 @@ Status ConvAddFusion::Apply(Graph& graph, Node& node, RewriteRuleEffect& modifie
     auto new_name = graph.GenerateNodeArgName("ConvAddFusion_Add_B_" + add_B_tensor_proto->name());
     new_conv_B_tensor_proto.set_name(new_name);
 
-    NodeArg& new_add_B_node_arg = graph_utils::AddInitializer(graph, new_conv_B_tensor_proto);
+    NodeArg& new_add_B_node_arg = graph_utils::AddInitializerWithOrtValue(graph, new_conv_B_tensor_proto);
     graph_utils::AddNodeInput(node, 2, new_add_B_node_arg);
   }
 
```
