Hello all,
we are encountering an issue with the new datasets.
Description:
We have encountered a critical issue with the new Data Preparation Experience in Amazon QuickSight when working with Iceberg tables that contain timestamp columns.
Problem Overview:
When refreshing QuickSight datasets based on Iceberg tables with timestamp columns, the refresh fails. This affects datasets created with the new Data Preparation Experience (Data Preparation Experience (New) - Amazon Quick Suite); legacy datasets (created via the API or Terraform) are not affected. The cause is that during a refresh, the new experience issues a different underlying Athena query than legacy datasets do.
Example of Failing Query:
UNLOAD (/* QuickSight ... */
SELECT "tenant_id", "yard_document_entity_id", "yard_document_source_id", "status",
"date",
"changed_by",
"calculated_at"
FROM "AwsDataCatalog"."testing_data_lake_api"."yard_document_monitoring_process_history")
TO 's3://aws-athena-query-results-.../'
WITH (format = 'PARQUET', compression = 'ZSTD')
The error can be worked around by explicitly casting the timestamp columns:
UNLOAD (/* QuickSight ... */
SELECT "tenant_id", "yard_document_entity_id", "yard_document_source_id", "status",
CAST("date" AS timestamp) AS "date",
"changed_by",
CAST("calculated_at" AS timestamp) AS "calculated_at"
FROM "AwsDataCatalog"."testing_data_lake_api"."yard_document_monitoring_process_history")
TO 's3://aws-athena-query-results-.../'
WITH (format = 'PARQUET', compression = 'ZSTD')
Background:
- The new Data Preparation Experience creates a new type of dataset that is not directly comparable to legacy datasets (see Data Preparation Experience (New) - Amazon Quick Suite).
- The root cause appears to be related to Athena query engine v3, which enforces timestamp precision checks. If the data has a higher precision (e.g., microseconds) than the table definition (milliseconds), the query fails with:

Incorrect timestamp precision for timestamp(6); the configured precision is MILLISECONDS

Reference: Athena Engine v3 Timestamp Changes
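The precision mismatch can be checked per column directly in Athena. The following is only a sketch using our example table; `typeof` is a function of the Trino-based Athena engine v3:

```sql
-- Inspect the engine-reported type of a timestamp column.
-- typeof() is available in Athena engine v3 (Trino); the table name is illustrative.
SELECT typeof("calculated_at") AS reported_type
FROM "AwsDataCatalog"."testing_data_lake_api"."yard_document_monitoring_process_history"
LIMIT 1;
-- A result of timestamp(6) would indicate microsecond-precision data,
-- while the error message says the configured precision is milliseconds.
```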
Our Situation:
- At the moment it is completely unclear to us how and when the migration to the new dataset type will happen, or whether we will be forced to migrate.
- We are concerned that, once the migration occurs, our system will break because of this bug with timestamp columns in Iceberg tables; virtually every one of our datasets contains timestamp columns.
Questions:
- Is this bug with timestamp columns in Iceberg tables on the QuickSight team's radar?
- Will this issue be fixed before the migration to the new dataset format is enforced?
- Is there an official workaround or best practice to avoid this problem, aside from manually casting all timestamp columns? As far as we can tell, we would otherwise have to create a view for every table that casts every timestamp column, which would be unmanageable.
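For completeness, the view-based workaround we are contemplating would look roughly like this (a sketch only; the view name is hypothetical, and the casts mirror the manual workaround shown earlier):

```sql
-- Hypothetical per-table view that normalizes timestamp columns,
-- so QuickSight datasets could point at the view instead of the table.
CREATE OR REPLACE VIEW "testing_data_lake_api"."yard_document_monitoring_process_history_v" AS
SELECT
  "tenant_id",
  "yard_document_entity_id",
  "yard_document_source_id",
  "status",
  CAST("date" AS timestamp) AS "date",
  "changed_by",
  CAST("calculated_at" AS timestamp) AS "calculated_at"
FROM "AwsDataCatalog"."testing_data_lake_api"."yard_document_monitoring_process_history";
```

Maintaining one such view per table is exactly the overhead we would like to avoid.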
Any information or guidance would be greatly appreciated, as this issue could cause system outages if the migration is triggered before it is resolved.
Thank you!