
[BUG] Unmatched values in comparing q78 if problematic_val is 0.0 #209

@yinqingh


Describe the bug
When comparing the q78 problematic values on the SF30K dataset, I observed an unmatched-values error in the results even though the two rows are identical:

[2025-04-01T10:01:51.743Z] Collected 100 rows in 2.24267315864563 seconds
[2025-04-01T10:01:51.743Z] Row 96: 
[2025-04-01T10:01:51.743Z] [7900, 0.0, 44, Decimal('99.00'), Decimal('15.92'), 19788, Decimal('16291.44'), Decimal('19716.60')]
[2025-04-01T10:01:51.743Z] [7900, 0.0, 44, Decimal('99.00'), Decimal('15.92'), 19788, Decimal('16291.44'), Decimal('19716.60')]
[2025-04-01T10:01:51.743Z] 
[2025-04-01T10:01:51.743Z] Processed 100 rows
[2025-04-01T10:01:51.743Z] There were 1 errors
[2025-04-01T10:01:51.743Z] === Unmatch Queries: ['query78'] ===

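Based on the q78 compare function, problematic_val_eq ends up False when problematic_val_row1 and problematic_val_row2 are both 0.0, which is unexpected. Because 0.0 is falsy in Python, the all([...]) check fails, and since neither value is None the code falls through to the else branch. This should be the root cause of this issue.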

# this value could be none in some rows
if all([problematic_val_row1, problematic_val_row2]):
    # this value is rounded to its pencentile: round(ss_qty/(coalesce(ws_qty,0)+coalesce(cs_qty,0)),2)
    # so we allow the diff <= 0.01 + default epsilon 0.00001
    problematic_val_eq = abs(problematic_val_row1 - problematic_val_row2) <= 0.01001
elif problematic_val_row1 == None and problematic_val_row2 == None:
    problematic_val_eq = True
else:
    problematic_val_eq = False
return problematic_val_eq and all([compare(lhs, rhs, epsilon) for lhs, rhs in zip(row1, row2)])
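The behavior can be reproduced in isolation. The sketch below copies only the branching from the snippet above into a standalone function; the name `problematic_val_eq_current` is made up for illustration and does not exist in the repo.

```python
def problematic_val_eq_current(v1, v2):
    """Illustrative copy of the branching quoted above (hypothetical name)."""
    if all([v1, v2]):
        # Only reached when both values are truthy, so 0.0 never gets here.
        return abs(v1 - v2) <= 0.01001
    elif v1 == None and v2 == None:
        return True
    else:
        return False


print(all([0.0, 0.0]))                         # False, because 0.0 is falsy
print(problematic_val_eq_current(0.0, 0.0))    # False, the reported mismatch
print(problematic_val_eq_current(0.5, 0.5))    # True
print(problematic_val_eq_current(None, None))  # True
```

Any row pair where the rounded ratio is exactly 0.0 on either side would take the else branch in the same way.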

Debug log:

[2025-04-02T02:35:20.693Z] problematic_val_row1: 0.0
[2025-04-02T02:35:20.693Z] problematic_val_row2: 0.0
[2025-04-02T02:35:20.693Z] problematic_val_eq: False
[2025-04-02T02:35:20.693Z] 7900 7900 True
[2025-04-02T02:35:20.693Z] 44 44 True
[2025-04-02T02:35:20.693Z] 99.00 99.00 True
[2025-04-02T02:35:20.693Z] 15.92 15.92 True
[2025-04-02T02:35:20.693Z] 19788 19788 True
[2025-04-02T02:35:20.693Z] 16291.44 16291.44 True
[2025-04-02T02:35:20.693Z] 19716.60 19716.60 True
[2025-04-02T02:35:20.693Z] True
[2025-04-02T02:35:20.693Z] epsilon: 1e-05
[2025-04-02T02:35:20.693Z] is_q78: True
[2025-04-02T02:35:20.693Z] q78_problematic_col: 2
[2025-04-02T02:35:20.693Z] Row 96: 
[2025-04-02T02:35:20.693Z] [7900, 0.0, 44, Decimal('99.00'), Decimal('15.92'), 19788, Decimal('16291.44'), Decimal('19716.60')]
[2025-04-02T02:35:20.693Z] [7900, 0.0, 44, Decimal('99.00'), Decimal('15.92'), 19788, Decimal('16291.44'), Decimal('19716.60')]
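One possible fix, as a sketch only, is to test for None explicitly instead of relying on truthiness. The function name `problematic_val_eq_fixed` and the `tolerance` parameter are assumptions introduced here for illustration; the tolerance value 0.01001 is taken from the existing code.

```python
def problematic_val_eq_fixed(v1, v2, tolerance=0.01001):
    """Hypothetical replacement for the branching in the q78 compare function."""
    # Both values missing: treat as equal, same as the existing code.
    if v1 is None and v2 is None:
        return True
    # Exactly one value missing: not equal.
    if v1 is None or v2 is None:
        return False
    # Both values present (including 0.0): compare within the rounding tolerance.
    return abs(v1 - v2) <= tolerance


print(problematic_val_eq_fixed(0.0, 0.0))    # True
print(problematic_val_eq_fixed(0.0, 0.02))   # False
print(problematic_val_eq_fixed(0.0, None))   # False
print(problematic_val_eq_fixed(None, None))  # True
```

With this check, the row 96 pair from the log would compare as equal, while a None on only one side would still be reported as a mismatch.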

