Q topic indexing taking too long

Q topic indexing has been running for more than 8 hours for one of my datasets. The dataset is huge (589 GB), but it is in SPICE.

Is this data size not recommended for a Q topic? I have tried optimizing my dataset by deselecting some columns and, for most fields, the "show value fields in suggestions" option.

Let me know if any specific details are required.

Hi @Sanjay1 and welcome to the QuickSight community!
While a Q topic may be able to accept a dataset of that size, it may not be the most practical approach. Q builds a topic index to generate answers to your questions, so a dataset of that size will take much longer to index.
There’s no expected indexing timetable based on dataset size, so it’s hard to say how long it should take. But if indexing is taking that long, later updates to the topic may lead to long loading times as well.

The best practice in this case would most likely be to remove the fields you won’t be using in the Q topic from the dataset before adding it to a topic.
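If you manage the dataset through the API rather than the console, the column trimming can be sketched roughly as below. This is only an illustration: it builds a `ProjectOperation` entry of the kind that goes in a logical table's `DataTransforms` list when calling `create_data_set` or `update_data_set`. The helper name `build_projection_transform` and the column names are hypothetical, not from your dataset.

```python
def build_projection_transform(all_columns, unused_columns):
    """Build a ProjectOperation that keeps only the columns the topic needs.

    Hypothetical helper: it only assembles the dictionary and makes no AWS calls.
    """
    unused = set(unused_columns)
    unknown = unused - set(all_columns)
    if unknown:
        raise ValueError(f"Unknown columns: {sorted(unknown)}")

    # Preserve the original column order for the columns that remain.
    kept = [name for name in all_columns if name not in unused]
    if not kept:
        raise ValueError("At least one column must remain in the dataset")

    return {"ProjectOperation": {"ProjectedColumns": kept}}


# Hypothetical column names, for illustration only.
transform = build_projection_transform(
    ["order_id", "region", "sales", "internal_notes", "raw_payload"],
    ["internal_notes", "raw_payload"],
)
print(transform)
```

The resulting dictionary would be appended to the `DataTransforms` of the relevant logical table, so the dropped columns never reach SPICE or the topic index.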

Let me know if you have any additional questions!

So when creating the dataset for the topic, I should exclude the columns that are not required, then build the dataset and create the topic from it. Is my understanding correct?

Also, is the "define unrecognized terms" feature still available? I am not able to select and define terms as shown in the example (Correcting wrong answers in Amazon QuickSight - Amazon QuickSight).

Hi @Sanjay1,
Yes, that’s correct. Depending on the number of fields being removed, this should reduce your loading time, although it’s hard to say by how much without testing.

Regarding that feature, I believe it’s been moved to a different spot, and the functionality may be a bit different as well. Once you ask a question, a bar appears underneath the answer where you can edit the fields.

Hi @Sanjay1,
It’s been a while since we last heard from you, so I’m following up to see if you have any additional questions.

If we do not hear back within the next 3 business days, I’ll go ahead and close out this topic.

Thank you!

Hello Brett, thank you for following up.

Reducing the number of columns prior to dataset creation did improve the performance.

Additionally, I have a question regarding the incremental loading of our dataset, which occurs either monthly or quarterly. I’m wondering how this affects Q topic indexing. Specifically, will the Q topic index refresh after each incremental load?

Hi @Sanjay1,
I believe you may need to refresh the dataset within the Q topic; you can also trigger the refresh manually from there. Once it has updated, no additional steps should be needed to index the data.
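If refreshing by hand after every load becomes tedious, the QuickSight API also has a topic refresh schedule that, as far as I understand it, can be tied to the dataset's SPICE schedule. Here is a minimal sketch of the request, assuming the boto3 `create_topic_refresh_schedule` call and its `BasedOnSpiceSchedule` flag behave as documented. The helper name, account ID, topic ID, and dataset ARN are placeholders, so please check the current API reference before relying on it.

```python
def topic_refresh_request(account_id, topic_id, dataset_arn):
    """Assemble keyword arguments for a topic refresh schedule.

    Hypothetical helper: it only builds the request and does not call AWS.
    The schedule is set to follow the dataset's SPICE refresh schedule.
    """
    return {
        "AwsAccountId": account_id,
        "TopicId": topic_id,
        "DatasetArn": dataset_arn,
        "RefreshSchedule": {
            "IsEnabled": True,
            # Assumption: this asks Q to refresh the topic in step with SPICE.
            "BasedOnSpiceSchedule": True,
        },
    }


# Placeholder identifiers, for illustration only.
request = topic_refresh_request(
    "111122223333",
    "my-topic-id",
    "arn:aws:quicksight:us-east-1:111122223333:dataset/my-dataset-id",
)
print(request)

# With credentials configured, the call itself would look like:
# import boto3
# boto3.client("quicksight").create_topic_refresh_schedule(**request)
```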

Thanks, Brett, for the response!
