We have been creating datasets and dashboards in QuickSight for analyzing historical data. But now we have are thinking of creating an Operational Dashboard to monitor passenger data in airport with a 24-48 hrs horizon. The data is to be picked from the Transactional database which is changing every minute.
We have used SPICE datasets and I am aware that the refresh can be setup upto 15 minutes, but not earlier than than.
Direct Query datasets on transactional data could take time to produce results and in a multi-user usage case may create too many queries on the transactional database
Additionally, dashboard may end up showing different numbers to different people based on when they refreshed data
I want to have a consistent data, that is refreshed say every 5 minutes, data is in QS dataset is consistent for all users
I know this is a general description of a use case. But just wanted to know if anyone has done something of this kind and what approach did you take?
The consistency of the dashboard will be based on how often the data is refreshed, even if you have the data in SPICE, different users accessing the same dashboard might see different data depending on when the oppened/refreshed the dashboard.
To have a static dataset that is refreshed every 5 minutes, my suggestion is to create an ETL job that refreshes a table with the relevant data every 5 minutes and that QucikSight access with direct query. Another possibility will be to create a materialized view that is refreshed every 5 minutes. Both these solutions have the disadvantages that direct query has regarding sessions and database access, but at least you can have a very specific dataset which can be tailored specific for this use case, as well as the refresh of the table/view needs to run in less than 5 minutes.
Another architecture that might help with this is to create an analytical database only for QuickSight access to segregate transactional work from analytics workloads. Depending on your underlying database you could use Zero-ETL to transfer data, or if using Redshift you could use data shares.