I would like to join SPICE and query mode datasets because I want to display a couple of columns from the query dataset in real-time without any delay.
The majority of my data is in the SPICE dataset(45GB), and only two columns from the query dataset need to be joined(500MB). My question is: Will this join impact the performance of the SPICE dataset, or will it run as usual while only taking the additional time required to fetch data from Athena via query mode?
Am I missing any best practices to join these datasets effectively? appreciate your time and response
what you are looking for is hybrid mode which is not supported . Visuals can either access a dataset for which the data is stored in SPICE or dataset in direct query mode.
1/ Have 1st dataset with data stored in SPICE for visuals which do not require latest data.
2/ Have 2nd dataset in direct query mode for visuals which require latest data.
Note : A visual can access data only from 1 dataset .
“1/ Have 1st dataset with data stored in SPICE for visuals which do not require latest data. 2/ Have 2nd dataset in direct query mode for visuals which require latest data. Note : A visual can access data only from 1 dataset .” What you’re saying is that if one dataset is in SPICE and the other is in direct query mode, the direct query dataset doesn’t fetch data from Athena to update the visuals every time the dashboard is opened. Instead, it waits for the SPICE dataset to refresh, and only during that time does the query dataset run and update the visual. Is my understanding correct?
I can add to the fourth scenario - If you observe the left most box, that says different data sources, when it is different data sources and direct join, by default Quicksight uses SPICE - (under this section - “Can a joined dataset use direct query” -Joining data - Amazon QuickSight)
Also, please correct me @Koushik_Muthanna if I am wrong, please correct me—in the above 4th scenario, when QuickSight considers it as SPICE, we can’t perform an incremental refresh, correct? I see that the only available option is a full refresh, with a minimum frequency of hourly.
@shravya@Koushik_Muthanna I have noticed that the Direct Query dataset I am joining with the SPICE dataset isn’t refreshing or pulling updated data, even after refreshing the spice dataset.
It only updates when I remove the Direct Query dataset from the SPICE dataset and rejoin it. Once a dataset is joined to another dataset using “Add Data,” it doesn’t fetch updated records until it is removed and rejoined. Am I correct?
Hi @lalprasanth - If you are joining both data sets from same data source, better to have either direct query ( in case of real time or near real time reporting). If you are doing SPICE mode, you can set as incremental one. However if your data sets are from different sources ( like one from redshift, another from aurora). and when you do join by default the mode will be SPICE as internally QS will bring the data to SPICE and then do the joining. The details provided by @Koushik_Muthanna is really good and at present QS support the above joining approaches.
Hi @lalprasanth,
Since we have not heard back, I’ll go ahead and close out this topic. However, if you have any additional questions, feel free to create a new post in the community and link this discussion for relevant information if needed.