Hi,
I have multiple datasets that refresh at 5-minute intervals. Because each dataset holds a large amount of data, a refresh takes considerable time, so the next refresh starts before the previous one completes. The refreshes pile up and the overall refresh time keeps growing. For instance, some datasets take more than 1 hour to refresh. How can we address this issue?
I can think of two approaches:
- Schedule the incremental SPICE refresh based on the average time a refresh takes to complete. For example, if an incremental refresh takes 45 minutes on average, schedule it to run every hour.
- If multiple datasets with dependencies between them need to be refreshed at a regular interval, I would recommend orchestrating the SPICE ingestion with AWS Step Functions or Amazon Managed Workflows for Apache Airflow.
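To make the orchestration idea a bit more concrete, here is a rough Python sketch of the logic such a workflow would carry out: start one SPICE ingestion, wait until it reaches a final status, and only then start the next dataset. It assumes a boto3 QuickSight client and uses its create_ingestion and describe_ingestion calls. The function name refresh_in_order, the polling interval, and the account and dataset IDs are placeholders of mine, and INCREMENTAL_REFRESH assumes the datasets are already configured for incremental refresh. Treat it as a starting point, not a tested production setup.

```python
import time
import uuid

# Statuses after which an ingestion will not change any more.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELLED"}


def refresh_in_order(client, account_id, dataset_ids, poll_seconds=30,
                     ingestion_type="INCREMENTAL_REFRESH"):
    """Refresh datasets one at a time, in the order given.

    `client` is expected to behave like boto3.client("quicksight").
    Returns a dict mapping dataset ID to its final ingestion status.
    Stops at the first dataset that does not complete, so that dependent
    datasets are not refreshed on top of a failed upstream refresh.
    """
    results = {}
    for dataset_id in dataset_ids:
        # Each ingestion needs its own ID; a random UUID is one option.
        ingestion_id = str(uuid.uuid4())
        client.create_ingestion(
            AwsAccountId=account_id,
            DataSetId=dataset_id,
            IngestionId=ingestion_id,
            IngestionType=ingestion_type,
        )
        # Poll until this ingestion finishes before starting the next one.
        while True:
            response = client.describe_ingestion(
                AwsAccountId=account_id,
                DataSetId=dataset_id,
                IngestionId=ingestion_id,
            )
            status = response["Ingestion"]["IngestionStatus"]
            if status in TERMINAL_STATUSES:
                break
            time.sleep(poll_seconds)
        results[dataset_id] = status
        if status != "COMPLETED":
            break
    return results


# Example call (placeholder account ID and dataset IDs, needs AWS credentials):
#   import boto3
#   quicksight = boto3.client("quicksight")
#   refresh_in_order(quicksight, "111122223333",
#                    ["upstream-dataset-id", "downstream-dataset-id"])
```

Because each refresh starts only after the previous one has finished, refreshes can no longer overlap. In Step Functions or Airflow the polling loop would normally become a wait-and-check step instead of a blocking sleep.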
Thanks
Vinod
Hi Vinod,
Thanks for your help. I think the second solution would work. Let me try that once.
Thanks a lot
Regards
Praveen Kumar