ADD NEW MASK not working properly in the Segment Anything 2 video mode

Mask prompts are not working in the ONNX pipeline, while click and box prompts work fine.

In the PyTorch model, sam_prompt_encoder.mask_input_size is defined as (256, 256). In the exported ONNX model, however, the prompt encoder expects a 3D mask input ([B, H, W]), while the existing code checks for 4 dimensions. This mismatch breaks execution when the prompt encoder runs. As a result, the dense mask branch does not produce valid outputs (all values collapse to -1024).
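A possible workaround on the calling side is to reshape the mask to whatever rank the exported prompt encoder declares before running it. The sketch below is only an illustration under assumptions: `fit_mask_rank` is a hypothetical helper, not part of the pipeline, and `expected_rank` is assumed to be read from the exported model (for example from the length of the input shape reported by onnxruntime's `InferenceSession.get_inputs()`). The (256, 256) size is the `mask_input_size` quoted above.

```python
import numpy as np

# Spatial size taken from sam_prompt_encoder.mask_input_size as quoted in this issue.
MASK_INPUT_SIZE = (256, 256)


def fit_mask_rank(mask, expected_rank):
    """Hypothetical helper: reshape a mask prompt to the rank the exported model expects.

    Supports [H, W], [B, H, W] and [B, 1, H, W] inputs and a target rank of 3 or 4.
    """
    mask = np.asarray(mask, dtype=np.float32)
    if expected_rank not in (3, 4):
        raise ValueError(f"unsupported expected_rank: {expected_rank}")
    if tuple(mask.shape[-2:]) != MASK_INPUT_SIZE:
        raise ValueError(f"mask must be {MASK_INPUT_SIZE}, got {mask.shape[-2:]}")

    if mask.ndim == expected_rank:
        return mask
    if mask.ndim == 2:
        # [H, W] -> [1, H, W] or [1, 1, H, W]
        return mask.reshape((1,) * (expected_rank - 2) + mask.shape)
    if mask.ndim == 4 and expected_rank == 3 and mask.shape[1] == 1:
        # [B, 1, H, W] -> [B, H, W]
        return mask[:, 0]
    if mask.ndim == 3 and expected_rank == 4:
        # [B, H, W] -> [B, 1, H, W]
        return mask[:, None]
    raise ValueError(f"cannot convert shape {mask.shape} to rank {expected_rank}")


# Example: the PyTorch-style 4D mask converted for a 3D ONNX input.
mask_4d = np.zeros((1, 1, 256, 256), dtype=np.float32)
mask_3d = fit_mask_rank(mask_4d, expected_rank=3)
print(mask_3d.shape)  # (1, 256, 256)
```

This only addresses the shape check. If the dense mask branch itself was dropped during export, the model would need to be re-exported with that branch included.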

It seems the mask path in the prompt encoder was not fully preserved during ONNX export, so only sparse prompts function correctly.

This is the debug output:
begin image encoder onnx
0
(1, 3, 1024, 1024)
begin prompt encoder onnx
begin mask decoder onnx
backbone_features 11108.974
image_pe 34842.402
sparse_embeddings -1.4240006
dense_embeddings -18153.611
high_res_features 28492.957
high_res_features 25796.06
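The debug output above appears to print one scalar per tensor, which cannot show whether a tensor has collapsed to a single constant such as -1024. A small diagnostic like the one below could make that visible. It is a sketch only: `summarize` is a hypothetical helper, and the array used here is synthetic, not an output of the actual model.

```python
import numpy as np


def summarize(name, arr):
    """Hypothetical helper: basic statistics that reveal a constant (collapsed) tensor."""
    arr = np.asarray(arr)
    lo, hi = float(arr.min()), float(arr.max())
    return {
        "name": name,
        "shape": arr.shape,
        "min": lo,
        "max": hi,
        "sum": float(arr.sum()),
        "constant": lo == hi,  # True when every element has the same value
    }


# Synthetic stand-in for a collapsed output; real values would come from the ONNX session.
collapsed = np.full((1, 1, 256, 256), -1024.0, dtype=np.float32)
stats = summarize("masks", collapsed)
print(stats["name"], stats["shape"], stats["min"], stats["max"], stats["constant"])
```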



ADD NEW MASK not working properly in the Segment Anything 2 video mode #1715
