BUG: Missing sigmoid in GroundingSAM2 postprocess & proposed solution #1179

@tiucamaria

Description

Search before asking

  • I have searched the X-AnyLabeling issues and found no similar bug report.

X-AnyLabeling Component

No response

Bug

Hello!
I was using your project with GroundingSAM2 to detect and segment cars in the UAVID dataset, but I wasn't satisfied with the results, so I decided to investigate further. I tried one of the old examples from your README file so I could compare the outputs, but they weren't the same.
Right now, if you run the GroundingSAM2 model on this image, the output will be:

Image Image

Next, I checked which objects were being detected, and this is where I found the problem: the DINO model used in GroundingSAM2 is exactly the same one that produces these results:

Image Image

Then I tested it again on one of my UAVID images, where I knew some objects should be detected. I printed the scores and noticed that the predictions had values of 1.x instead of falling in the range (0, 1). The GroundingSAM2 file never applies the sigmoid (the sig function) to the logits. Here is how I modified the code:

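def postprocess(
    self, outputs, caption, with_logits=True, token_spans=None
):
    logits, boxes = outputs

    # Existing line: removes the batch dimension but leaves raw, unbounded logits.
    logits_filt = np.squeeze(logits, 0)  # [0]  # prediction_logits.shape = (nq, 256)

    # Added line: apply the sigmoid so the scores fall in (0, 1).
    logits_filt = self.sig(np.squeeze(logits, 0))  # BUG SOLVED
    ......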

Here are the results after adding the call to the sig function:

Image Image
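To show the effect of the missing sigmoid in isolation, here is a minimal NumPy sketch. It is not the project's code: the `sigmoid` helper is a stand-in for `self.sig`, the logits are made-up values, and `box_threshold = 0.35` is a hypothetical threshold chosen only for illustration. It shows one plausible way raw logits could change which queries survive filtering.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function (stand-in for the model's self.sig)."""
    return 1.0 / (1.0 + np.exp(-x))


# Made-up logits with shape (1, nq, n_tokens); the real tensor is (1, nq, 256)
# according to the comment in postprocess.
logits = np.array([[[1.7, -0.3, -4.0],
                    [-0.3, -3.0, -6.0],
                    [-2.5, -3.0, -6.0]]])

raw = np.squeeze(logits, 0)   # raw logits: unbounded, can exceed 1
scores = sigmoid(raw)         # probabilities: strictly inside (0, 1)

box_threshold = 0.35          # hypothetical value, for illustration only

# Keep a query if its best token score exceeds the threshold.
keep_raw = raw.max(axis=1) > box_threshold        # thresholding raw logits
keep_scores = scores.max(axis=1) > box_threshold  # thresholding after sigmoid

print("max raw logit:", raw.max())
print("kept without sigmoid:", keep_raw.tolist())
print("kept with sigmoid:   ", keep_scores.tolist())
```

In this toy input, the second query has a best logit of -0.3, which is below the threshold as a raw value but maps to roughly 0.43 after the sigmoid, so it is only kept once the sigmoid is applied.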

Information

Application Information:
{'App name': 'X-AnyLabeling', 'App version': '3.0.3', 'Device': 'GPU'}

System Information:
{'CPU': 'Intel64 Family 6 Model 186 Stepping 2, GenuineIntel',
'CUDA': None,
'GPU': '0, NVIDIA GeForce RTX 4060 Laptop GPU, 8188',
'Operating System': 'Windows-10-10.0.19045-SP0',
'Python Version': '3.10.18'}

Package Information:
{'ONNX Runtime GPU Version': '1.22.0',
'ONNX Runtime Version': None,
'ONNX Version': '1.16.1',
'OpenCV Contrib Python Headless Version': '4.11.0.86',
'PyQt5 Version': '5.15.7'}

Link to a Reproducible Demonstration Video

https://drive.google.com/file/d/1ibhkNuyp0-mLEEtvInq59cDcQl4HJtP-/view?usp=sharing

Execution Mode

Source Code

Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!


Labels

bug (Something isn't working), fixed
