Star Schema vs. Flat Wide Table

I’m new to QuickSight. Our team uses S3 and Athena as our data sources. We plan on pulling our data into SPICE.

Is it better to do all my joins upstream (i.e. in Glue/Athena) and connect to one flat table where I could have 30,40,50 columns if I include all of my dimensions, or should I bring in my facts and dims to the dataset and do the joins in QuickSight, thus creating a STAR schema?

What are the advantages to each approach?

@Joseph_C_Seroski : This is a very interesting question. If you are planning to join fact and dimension tables and create one data set and the data set is used for many users, possibly SPICE is a good solution. All you need to refresh data set once after ETL to populate SPICE and users can take the data from memory instead of hitting Athena ( each api is a cost :slight_smile: )

QuickSight has a flexible to join data sets but it is not meant for building data marts, if you have complex joins and summary requirements, better to handle out of QuickSight. Please do some POC and see the performance indicators… if you do not use SPICE, QuickSight will do a direct query which will have 2 mins timeout.

Regards - San

1 Like

Hi @Joseph_C_Seroski
The AWS recommended approach for this scenario is to create the required views in Athena, apply all the business rules and clean up of the columns at the view level so that the heavy lifting is done at the database level.
In addition to the above, there are certain limitations when it comes to the data usage and capacity.

Please refer the below links for additional details:

Thanks,
Naveen

2 Likes

@namysore OK, so you recommend a wide, flat table, then?