Quick Sight incremental refresh missing data from Athena view with partition projection + Lambda ingestion

Hi,
I’m seeing an issue where Amazon Quick Sight incremental refresh is missing data that exists in S3.

My setup

  • Logs are written to CloudWatch.

  • A Firehose stream delivers these logs to a raw S3 bucket in 5-minute intervals.

  • A Lambda function is triggered by new objects in the raw S3 bucket. This Lambda processes the logs and writes the transformed files into a processed S3 bucket.

  • The Athena table on top of the processed bucket uses partition projection (no Glue crawler).

  • I created a complex Athena view with multiple CTEs that filters data to a bounded time window:

    start_ts = date_trunc('hour', current_timestamp - interval '6' hour)
    stop_ts  = date_trunc('hour', current_timestamp)
    
    

    So the view only scans the last 6 hours of partitions.

  • In Quick Sight, I use this view as a dataset and set up hourly incremental refresh.

    • Incremental column: activity_ts

    • Window size: 3 hours
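
To make the two windows in this setup concrete, here is a minimal Python sketch of the arithmetic. It only mirrors the start_ts / stop_ts expressions quoted above; the refresh time is a made-up example, and the idea that the 3-hour window is counted back from the moment the refresh runs is an assumption, not something confirmed in this thread.

```python
from datetime import datetime, timedelta, timezone


def truncate_to_hour(ts: datetime) -> datetime:
    """Python stand-in for date_trunc('hour', ts)."""
    return ts.replace(minute=0, second=0, microsecond=0)


def view_bounds(now: datetime, hours: int = 6) -> tuple[datetime, datetime]:
    """Mirror the view's start_ts / stop_ts expressions."""
    start_ts = truncate_to_hour(now - timedelta(hours=hours))
    stop_ts = truncate_to_hour(now)
    return start_ts, stop_ts


def refresh_window_start(now: datetime, window_hours: int = 3) -> datetime:
    """ASSUMPTION: the incremental window is counted back from the refresh time."""
    return now - timedelta(hours=window_hours)


# Hypothetical refresh time, chosen only for illustration.
now = datetime(2024, 1, 1, 12, 20, tzinfo=timezone.utc)

start_ts, stop_ts = view_bounds(now)
window_start = refresh_window_start(now)

print("view start_ts:      ", start_ts)
print("view stop_ts:       ", stop_ts)
print("refresh window from:", window_start)
```

With this example time, the view covers 06:00 to 12:00, so rows stamped after 12:00 are not visible to the view during this refresh, while the assumed refresh window starts at 09:20. Any row whose activity_ts falls outside both ranges at refresh time would not be re-read.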

The problem

  • On the Quick Sight dashboard, some rows are missing.

  • If I change the view’s bounds from 6 hours to something very large (e.g. 9000 hours) and do a manual full refresh, the missing rows appear correctly.

  • That tells me the data really is in S3 and queryable by Athena — but Quick Sight doesn’t pick it up during normal incremental refresh.

Hi @jeet1o1 and welcome to the Quick Sight Community!
Is there a common pattern with the rows that don’t seem to be populating in Quick Sight? Maybe there’s another factor that’s limiting those rows from being incorporated in the refresh.

Hi @Brett,
The issue is resolved. The root cause was in my Athena aggregation views: they were not aligned with how the look-up column works in Quick Sight. I resolved this by grouping the aggregation by hour.
Thank you.
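
The fixed view was not posted, so the following is only a rough Python sketch of what an hourly group-by does to the data, not the actual view. The event list and the function names are hypothetical. In Athena, the equivalent idea would be to group on date_trunc('hour', activity_ts), so that each aggregated row carries an hour-aligned timestamp.

```python
from collections import Counter
from datetime import datetime


def hour_bucket(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def aggregate_by_hour(events):
    """Count events per hour; the bucket start acts as the row's activity_ts."""
    counts = Counter(hour_bucket(ts) for ts in events)
    return sorted(counts.items())


# Hypothetical event timestamps, for illustration only.
events = [
    datetime(2024, 1, 1, 9, 5),
    datetime(2024, 1, 1, 9, 40),
    datetime(2024, 1, 1, 10, 15),
]

rows = aggregate_by_hour(events)
for bucket, count in rows:
    print(bucket, count)
```

Each output row has one timestamp per hour, which gives the incremental column a value that lines up with hourly refreshes.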
