I am excited about the possibilities of how a very well optimised Iceberg dataset can enable faster read speeds for my large Google Analytics dataset (about 30+ columns with 300+GB for 3 months worth of data, I need to setup for at least 2 years worth of data), in particular that last graph in that guide showing the potential read time savings.
I am however concerned as I realised our GA dataset in S3 / Athena (imported from Bigquery) have already been written in Iceberg format.
Setting up a test Quicksight dashboard connecting via Athena, my simple line / bar combo chart failed to load for 3 months worth of data.
I have tried connecting QS direct with all columns in that Athena dataset, and another QS dataset direct connecting the same Athena dataset, but only pulling 3 columns via custom SQL.
Both results didnt yield much difference.
I know there’s SPICE, but for my use case, I am doubtful SPICE can have enough space before we get charged.
Is there some kind of best practices in play, where after creating a highly read-optimised ICEBERG dataset, I can have a great QS experience for my users by having QS configured correctly to make full use of what ICEBERG is offering, as prescribed by that AWS documentation?
QuickSight when used with direct query is sending SQL queries back to your query engine. The query engine is then responsible in generating the results and the speed at which those queries complete also depend on the query engine. BI queries tend to be ad-hoc, so is the ICEBERG optimization done at your end able to solve most of the common query patterns for your use-case ? .
Setting up a test Quicksight dashboard connecting via Athena, my simple line / bar combo chart failed to load for 3 months worth of data.
Is there some kind of best practices in play, where after creating a highly read-optimised ICEBERG dataset, I can have a great QS experience for my users by having QS configured correctly to make full use of what ICEBERG is offering, as prescribed by that AWS documentation?
Which part in the documentation are you referring when asking about configuration. Note there is no configuration from QuickSight apart from creating the data source connecting to Athena.