QuickSight New Data Preparation Experience: Import Fails for Iceberg Tables with Timestamp Columns (UNLOAD Query Error)

Hello all,

we are encountering an issue with the new datasets.

Description:

We have encountered a critical issue with the new Data Preparation Experience in Amazon QuickSight when working with Iceberg tables that contain timestamp columns.

Problem Overview:
When refreshing QuickSight datasets based on Iceberg tables with timestamp columns, the refresh fails. This happens with datasets created using the new Data Preparation Experience (Data Preparation Experience (New) - Amazon Quick Suite). Legacy datasets (created via API or Terraform) do not have this problem. The problem is the underlying Athena query that is different from the one that a legacy datasets perform during a refresh.

Example of Failing Query:

UNLOAD (/* QuickSight ... */
SELECT"tenant_id", "yard_document_entity_id", "yard_document_source_id", "status",
"date",
"changed_by",
"calculated_at"
FROM"AwsDataCatalog"."testing_data_lake_api"."yard_document_monitoring_process_history")
TO 's3://aws-athena-query-results-.../'
WITH (format = 'PARQUET', compression = 'ZSTD')

The error can be worked around by explicitly casting the timestamp columns:

UNLOAD (/* QuickSight ... */
SELECT"tenant_id", "yard_document_entity_id", "yard_document_source_id", "status",
cast("date"as timestamp) AS"date",
"changed_by",
cast("calculated_at"as timestamp) AS "calculated_at"
FROM"AwsDataCatalog"."testing_data_lake_api"."yard_document_monitoring_process_history")
TO 's3://aws-athena-query-results-.../'
WITH (format = 'PARQUET', compression = 'ZSTD')

Background:

Our Situation:

  • At the moment it is completely unclear to us how and when the migration to the new datasets will happen and if we will be forced to migrate.

  • We are concerned that, once the migration occurs, our system will break due to this bug with timestamp columns in Iceberg tables. In the end every dataset of ours contains timestamp columns.

Questions:

  1. Is this bug with timestamp columns in Iceberg tables on the QuickSight team’s radar?

  2. Will this issue be fixed before the migration to the new dataset format is enforced?

  3. Is there an official workaround or best practice to avoid this problem, aside from manually casting all timestamp columns? I guess we would have to create views for every table casting every timestamp to a timestamp which would be wild.

Any information or guidance would be greatly appreciated, as this issue could potentially cause system outages once a migration is triggered without this issue being resolved.

Thank you!

@myleodsc.hilleckes Thank you for flagging this issue. We will look into the timestamp error and provide a response soon.

Meanwhile, can you try creating the same dataset in the legacy experience and seeing if it works there?

Regarding your question about migration, we would like to clarify that both the experiences will exist side-by-side as we continue to add unsupported features in the new experience. Full migration to the new experience is not planned in the near future. This will only be initiated after all the legacy use cases have been addressed.

Our objective is to provide the flexibility by having both the experiences. This enables users to explore the benefits from the new experience while simultaneously switching back to legacy if there is a use case involving an unsupported feature.

Yes. I already mentioned that legacy datasets do not have this issue.

@myleodsc.hilleckes Thanks for clarifying regarding legacy datasets. We require additional details (like dataset id, request id) to troubleshoot further. Could you create an AWS support ticket and include the details?