Slow loading times

We have a dataset that is ~ 4GB in size, with 2,356,786 rows, all in SPICE.

It takes a very long time to load many of the analysis parts, sometimes even longer than 60s and then times out.

What is the underlying technology behind QuickSight’s SPICE database? Is it using Athena with files in S3?

I am trying to think of ways this could be optimized.

Any tips for resolving and debugging this?

Seems like a 4GB dataset should not be this slow.

Hi @m0ltar - Can you please share couple of details. The performance analysis is always tricky :slight_smile:

  1. What is the data source for your data set?
  2. Do we have any calculated field, if yes, need to understand and do analysis on the logic part.
  3. are you using any custom sql. If yes, can we test the performance of the sql in database directly to see whether the issue is at query side or data base side.
  4. are you using any join? if yes can you please check there is NO M:M relationship maintained or not.

By saying the above, yes I do agree we should understand the internal architecture of SPICE and how the data has been stored in SPICE and process as well.

Regards - Sanjeeb

1 Like

It does not matter what the DS is.

SPICE is a cache.

When SPICE is enable, no requests are made to the underlying data source.

Yes, we have. Too many to list.

You cannot use custom SQL with SPICE.

Cannot use joins with SPICE.

We do not have performance issues when loading data from the source.

The performance issues are when analyses (and dashboards) load the data from SPICE.

1 Like

Hi @m0ltar

you use use Custom SQL with Spice. Custom SQL depends solely on whether the data source allows SQL queries while Spice is about how the results of the Custom SQL are cached/stored. You cannot use Custom Query because you are NOT connecting to a sql-based data source (for example flat files stored on your machine and uploaded in Quicksight), and NOT because you are using Spice. Example below.

And you can do any JOINs in Spice with data coming from any data source. Example below. Spice actually is the only way you can do joins when data comes from different data sources, say flat files and Redshift or Teradata.

Regardless, once in Spice the loading should be fast. One common issue that can affect performance is how you design the visuals: if you have tables, for example, and you are loading the entire dataset, say 2MIL+, and then have many added calculations then that would take a few seconds. But it takes time not necessarily because of Spice but also because of browser limitations. Imagine that in one table, in any browser and in any BI tool, I try to load - right at ingestion time - billions of rows; even if the user can only see 10 or 50 or 100 at one time, the entire data has been loaded into that visual and stored in the browser’s caching memory even if only a fraction is actually visible at one time). The browser (not Spice) will either crush or will take a very long time.


Check please your dashboard design. Make sure you load only pre-filtered data into a visual and then allow the user to get any slice they want via filters or parameters. If everything checks out, then please raise a ticket with the Support team because this is definitively not the expected behavior.

Hope it helps,
GL

1 Like

You can see if any of the tips in this article helps improve the performance for you:

2 Likes

@gillepa:

you use use Custom SQL with Spice. Custom SQL depends solely on whether the data source allows SQL queries while Spice is about how the results of the Custom SQL are cached/stored. You cannot use Custom Query because you are NOT connecting to a sql-based data source (for example flat files stored on your machine and uploaded in Quicksight), and NOT because you are using Spice. Example below.

I am not saying SQL cannot be used to load data into SPICE.

I meant to say that I cannot use SQL to make queries to SPICE’ed datasets. I.e. I cannot use SQL against SPICE engine. That part is abstracted away from us.

Likewise here. I meant I cannot do JOIN SQL in SPICE. Yes, I know I can use a UI to make the join conditions. But then it’s up to QuickSight engine to optimize those joins. I have no way to affect the JOIN query myself.

I don’t think QuickSight loads billions of rows into the browser. And if it does, then it is entirely a problem from the QS side, not from the author or user side. It should not be loading the entire dataset, and instead it should be loading it page by page.

The error we are seeing is this:

So the data isn’t even loaded into QS.

It gets aborted before then.

So it’s not a browser issue. It is entirely a QuickSight backend issue.

I’ve seen this error before when the visual contains too many complex calculated fields. If that’s the case, try removing all calculated fields from the visual, add them back one by one and observe at what point it breaks. Also, see if there’s anyway you can simplify your calculated fields or maybe even consider moving them to the dataset.

2 Likes