Skip to content

Combine multiple approx distinct on the same type #26694

@kaikalur

Description

@kaikalur

Approx distinct uses hll and in some tool generated codes with 100s of approx_distinct, it requires a lot of memory. So optionally when the agg is on a low-cardinality field - like dates or other strings - we combine them into a single set_agg.

So:

select approx_distinct(if(y > 10, x)) a1, approx_distinct(if(y > 20, z)) a2 from (values (1,20,4), (2,30,4)) as t(x,y,z);

Can be combined into a single expression:

select cardinality(array_distinct(remove_nulls(a[1]))) a1, cardinality(array_distinct(remove_nulls(a[2]))) a2 from (select array_transpose(a) a from (select set_agg(array[if(y > 10, x), if(y > 20, z)]) a from (values (1,20,4), (2,30,4)) as t(x,y,z)));

Metadata

Metadata

Assignees

No one assigned

    Projects

    Status

    🆕 Unprioritized

    Status

    🆕 Unprioritized

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions