Union Datasets within dashboard

Chirag_Karkera · June 7, 2023, 11:08pm

I have a dataset in Quicksight which has grown very large and is exceeding the Enterprise size limits. This dataset is already being used in a dashboard. In order to resolve this, I am thinking of splitting the dataset into smaller parts. Most of the views in the dashboard need the complete data so I would need to combine all the multiple parts of this dataset for using them in the dashboard. Is there a way to union the data of these multiple datasets within the dashboard? Or is there an alternative/workaround to resolve the dataset size limits issue.

eperts · June 7, 2023, 11:42pm

Hi @Chirag_Karkera - I’m guessing you’re hitting the SPICE limits of 1B rows or 1TB of data.

If you were to split (or partition) the data then I wouldn’t join it back together in QuickSight. I would probably look into storing it on S3 and then query it via Athena or RedShift Spectrum. If the partitioning method is efficient, combined with compression can achieve good query results. For partitioning the data using Glue is an option.

That being said, the QuickSight Team is working on increasing the SPICE limits so depending on your timeline you may have another option coming up in the near future.

Chirag_Karkera · June 8, 2023, 3:08pm

Thanks @eperts for your inputs. I am extracting the data which is stored in a Redshift cluster into SPICE. I don’t want to have a live connection to the Redshift table since that would put a lot of load on the cluster. Is there any other method of extracting data into SPICE? Also, if that’s not possible, what’s the timeline to increase the SPICE limits? This will help me to set expectations for my clients.

Karthik_Tharmarajan · June 8, 2023, 3:28pm

Hi @Chirag_Karkera, We recently launched the ability to pass values during runtime that will help greatly reduce the load on direct query datasets. For your case you can use direct/live query mode however reduce the volume of data being queried in RedShift using a parameter in the SQL used for the dataset. Please refer to below link and see if adding dataset parameters helps in your use case.

Regards,
Karthik