[Bug] truncate() partition transformation does not work when it includes more than 100 partitions #429

@alex-antonison

Description

Is this a new bug in dbt-athena?

  • I believe this is a new bug in dbt-athena
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When you use a truncate() partition transformation on a column that results in more than 100 partitions, the batch partitioning functionality kicks in, which is meant to let you exceed 100 partitions.

{{
    config(
        materialized = 'table',
        table_type = 'iceberg',
        force_batch = true,
        partitioned_by = ['truncate(string_partition,2)', 'month(date_partition)']
    )
}}

However, when dbt-athena queries Athena for the distinct partition values, it uses truncate() in the query, which is not a supported way of extracting part of a string in Athena.

select distinct truncate(string_partition,2), date_trunc('month', date_partition)
from "awsdatacatalog"."data_lake"."table__ha__tmp_not_partitioned"
order by truncate(string_partition,2), date_trunc('month', date_partition)

Instead, it could use something like substring() to pull back the unique partial values:

select distinct substring(string_partition,1,2), date_trunc('month', date_partition)
from "awsdatacatalog"."data_lake"."table__ha__tmp_not_partitioned"
order by substring(string_partition,1,2), date_trunc('month', date_partition)
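
As a rough illustration of the rewrite suggested above, here is a minimal Python sketch that maps a `truncate(col, width)` partition spec to a `substring(col, 1, width)` expression. The helper name `to_athena_select_expr` is hypothetical and is not part of dbt-athena. The sketch assumes the partitioned column is a string; per the Iceberg spec, truncating an integer or decimal column behaves differently and would need its own expression.

```python
import re

# Hypothetical helper, not the dbt-athena API: rewrite an Iceberg
# truncate(col, width) partition transform into an expression that
# Athena can evaluate on a string column.
_TRUNCATE_RE = re.compile(
    r"^\s*truncate\(\s*([^,()\s]+)\s*,\s*(\d+)\s*\)\s*$", re.IGNORECASE
)


def to_athena_select_expr(partition_spec: str) -> str:
    """Return a SQL expression for one entry of partitioned_by."""
    match = _TRUNCATE_RE.match(partition_spec)
    if match:
        column, width = match.group(1), int(match.group(2))
        # Assumption: string column, so truncate keeps the first `width` characters.
        return f"substring({column}, 1, {width})"
    # Plain columns and other transforms are passed through unchanged here.
    return partition_spec


print(to_athena_select_expr("truncate(string_partition,2)"))
```

Other transforms such as `month(date_partition)` are deliberately left untouched in this sketch, since the issue shows they are already translated elsewhere (to `date_trunc('month', ...)`).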

Expected Behavior

When I apply a truncate() Iceberg partition transformation to a column, the model should build successfully even when the transformation produces more than 100 partitions.

Steps To Reproduce

Create a model with a column for which a truncate() partition transformation results in more than 100 partitions.
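
To check whether a dataset would cross the 100-partition threshold before running the model, the partition count can be estimated locally. The following is a minimal sketch with synthetic data; the function name `count_partitions` and the sample rows are assumptions made for illustration, and the sketch mirrors `truncate(string_partition, 2)` combined with `month(date_partition)` for string values only.

```python
from datetime import date
from itertools import product


def count_partitions(rows, width=2):
    """Count distinct (string prefix, year-month) pairs in the given rows."""
    return len({(s[:width], (d.year, d.month)) for s, d in rows})


# Synthetic data: 12 x 12 = 144 distinct two-character prefixes in one month.
letters = "abcdefghijkl"
rows = [(a + b + "-value", date(2024, 1, 1)) for a, b in product(letters, repeat=2)]

print(count_partitions(rows))  # 144, which is above the 100-partition threshold
```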

Environment

- OS: MacOS
- Python: 3.11
- dbt: 1.7.7
- dbt-athena-community: 1.7.1

Additional Context

This came out of a Slack conversation: https://getdbt.slack.com/archives/C013MLFR7BQ/p1709755667814619

This method was referenced as the place where the change would need to be made: https://github.com/dbt-athena/dbt-athena/blob/289be4f4f44f3d5a6cf575d8fe218209c4a41171/dbt/adapters/athena/impl.py#L1279

Apache Iceberg Truncate Partition documentation: https://iceberg.apache.org/spec/#truncate-transform-details

Labels: pkg:dbt-athena (Issue affects dbt-athena), type:bug (Something isn't working as documented)
