Redshift (dims) + S3 (facts) → Quick Sight SPICE (~500 GB) takes ~8 hours daily; incremental not available due to linked datasets. How to cut refresh time?

Hi all,
Our model has fact tables in S3 and dimension tables in Amazon Redshift. In Quick Sight, we link multiple datasets from these sources to build one dataset, which lands at ~500 GB in SPICE and currently takes ~8 hours to refresh daily.

Constraint

  • Because we’re linking datasets from different sources inside QS, the Incremental refresh option doesn’t appear on the dataset

For teams running large daily refreshes, what got your SPICE refresh down from many hours to something manageable?

Thank you

Aiman

Hi @aiman and welcome to the Quick Sight community!
As you’ve seen in your scenario, Quick Sight may not be the most efficient way to handle these joins. Have you considered handling this prior to pulling everything in to Quick Sight? Since you’re already using Redshift, I would suggest handling the joins there so that you can pass along the full dataset as one into Quick Sight, therefore allowing you to utilize an incremental refresh.

Let us know if you have any additional questions

Thank you Brett for your reply, is there an alternative way to process the data, ex. Airflow?

Hi @aiman,

I do not have much experience personally working with airflow but if you’re able to handle the table joins within airflow and then bring the joined table in to Quick Sight, you should still be able to apply incremental refreshes to the dataset.

Hi @aiman

It’s been a while since we last heard from you. If you have any further questions, please let us know how we can assist you.

If we don’t hear back within the next 3 business days, we’ll proceed with close/archive this topic.

Thank you!

Hi @aiman

Since we have not heard back from you, I’ll go ahead and close/archive this topic. However, if you have any additional questions, feel free to create a new topic in the community and link this discussion for relevant information.

Thank you!