Refreshing Thousands of SPICE Datasets via API

Similar question to

Our implementation has thousands of datasets stored in SPICE, and we currently refresh them directly via QuickSight’s API. We thought that, instead of triggering each refresh ourselves, it might be better to schedule all the datasets to refresh at, say, 1am. The questions would then be:

  • Is there a built-in concurrency limit on scheduled dataset refreshes, and how does it work? (For example, if I have 50k datasets and set them all to refresh at 1am, would they be batched 50 at a time, or would all 50k be triggered at once?)
  • If there is a concurrency limit or something similar, can it be configured via QuickSight’s API? (A link to any documentation would be really helpful.)

I’m open to feedback or other ideas on refreshing thousands of datasets (particularly scheduling or refresh strategies using the QuickSight API). If I have 50k+ datasets and want to refresh them daily, what would be the recommended approach? (We’re already taking incremental refresh into account.)

  • Currently, when we trigger multiple SPICE dataset refreshes at once, we hit HTTP 429 Too Many Requests, which is why we had to limit ourselves to 50 refreshes at a time.
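For illustration, the batching workaround described above might look roughly like the sketch below. It is not our actual code. It assumes a boto3 QuickSight client is passed in (e.g. `boto3.client("quicksight")`) and that throttling surfaces as a `ThrottlingException` error code or an HTTP 429 status on the raised error. The helper names `is_throttled` and `refresh_in_batches`, the batch size, and the delay values are all hypothetical. Note that it only paces the CreateIngestion calls; it does not wait for the ingestions themselves to finish.

```python
import time
import uuid


def is_throttled(exc):
    """Return True if exc looks like a throttling error (assumed shape:
    botocore ClientError, which carries a `response` dict)."""
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return False
    code = response.get("Error", {}).get("Code", "")
    status = response.get("ResponseMetadata", {}).get("HTTPStatusCode")
    return code == "ThrottlingException" or status == 429


def refresh_in_batches(client, account_id, dataset_ids, batch_size=50,
                       max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Start one SPICE ingestion per dataset, batch_size datasets at a time.

    Throttled calls are retried with exponential backoff. Returns the lists
    of dataset IDs whose ingestion was started and those that were given up on.
    """
    started, failed = [], []
    for i in range(0, len(dataset_ids), batch_size):
        for dataset_id in dataset_ids[i:i + batch_size]:
            for attempt in range(max_retries):
                try:
                    client.create_ingestion(
                        AwsAccountId=account_id,
                        DataSetId=dataset_id,
                        IngestionId=str(uuid.uuid4()),  # must be unique per ingestion
                    )
                    started.append(dataset_id)
                    break
                except Exception as exc:
                    if not is_throttled(exc):
                        raise  # anything other than throttling is a real error
                    sleep(base_delay * 2 ** attempt)  # back off: 1s, 2s, 4s, ...
            else:
                failed.append(dataset_id)  # still throttled after max_retries
        sleep(base_delay)  # short pause before the next batch
    return started, failed
```

Passing `sleep` as a parameter is only there so the pacing can be tested without real waiting.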

Thanks to everyone in advance


Hi @ges214 - This is an excellent question; we really need to understand SPICE's limits on the number of parallel sessions. Could you submit a ticket to AWS Customer Support with the error details so that they can validate and provide a recommendation? To raise a ticket, please follow the link -

Regards - Sanjeeb


@ges214 - It would be really good to know whether there is any such hard limit on the QuickSight side. Please keep the community posted if you hear back from the AWS Support engineer. Just wanted to share how we manage this on our end: we do not rely on scheduled refresh. Instead, we use an event-driven SPICE dataset refresh trigger, where we tie the CreateIngestion API call to the dependent ETL flow that populates the data source. Sharing the blog post, which outlines a similar concept, for your reference. Hope this helps!
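As a rough sketch of the event-driven idea (not our exact setup): the last step of an ETL job calls a small hook that starts an ingestion for each dataset fed by that job. The event shape (`{"job_name": ...}`), the `DATASETS_BY_JOB` mapping, the dataset IDs, and the function name `on_etl_success` are all made up for illustration. The only real API assumed is `create_ingestion` on a boto3 QuickSight client.

```python
import uuid

# Hypothetical mapping: ETL job name -> SPICE dataset IDs that depend on it.
DATASETS_BY_JOB = {
    "nightly_sales_etl": ["sales-summary-dataset", "sales-detail-dataset"],
}


def on_etl_success(event, client, account_id, datasets_by_job=DATASETS_BY_JOB):
    """Start a SPICE refresh for every dataset that depends on the finished job.

    `event` is an assumed payload of the form {"job_name": "..."} emitted when
    the ETL job completes; `client` is a boto3 QuickSight client, for example
    boto3.client("quicksight"). Returns {dataset_id: ingestion_id}.
    """
    job_name = event["job_name"]
    ingestion_ids = {}
    for dataset_id in datasets_by_job.get(job_name, []):
        ingestion_id = str(uuid.uuid4())  # unique ID for this ingestion
        client.create_ingestion(
            AwsAccountId=account_id,
            DataSetId=dataset_id,
            IngestionId=ingestion_id,
        )
        ingestion_ids[dataset_id] = ingestion_id
    return ingestion_ids
```

Because each dataset is refreshed only when its source data actually changes, the refreshes are naturally spread out over the ETL window instead of all firing at a single scheduled time.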
