Q topic not properly working with multiple datasets

When multiple datasets are added to a Q topic, there seems to be a challenge with answer generation. The system tends to prioritize the first dataset, even when the relevant information exists in other datasets. This occurs despite adding multiple synonyms to the fields.

Even when the query contains fields that are not present in the first dataset, the system still attempts to retrieve information from it rather than searching other datasets that contain the relevant fields. This results in incorrect or incomplete answers.

Is there a specific question format or pattern that should be followed to ensure the system retrieves information from the appropriate dataset?

Hi @cgairik and welcome to the QuickSight community!
I set up a small test example, and based on my wording and the available fields it seemed to work fine for me, although I’ve had issues with field usage at times even when only 1 dataset is present.
I’d be curious to hear a bit more about your scenario. Could you share an example where you saw this behavior? What type of question did you ask, which fields were used in the answer, and which fields did you expect to be used? A screenshot of the field setup for the fields that are used, or should be used, would also help clarify things.

As far as I was able to test, I don’t think the UI is defaulting to the first dataset, so hopefully with some additional information we can try to assist with troubleshooting options!

I have 2 datasets. The 1st dataset contains users’ demographic details as well as their enrollment timestamp. The 2nd dataset contains only enrollment-related information. Whenever I ask a question about a user’s enrollment details, it always tries to fetch the answer from the 1st dataset rather than the 2nd, even though the 1st dataset has only a single enrollment-related field while the 2nd has many. I have also added enrollment-related synonyms in the second dataset to prioritize it over the other one, but that did not work.
I also created named entities from both datasets and gave them descriptive names to get a proper answer to the question.

Hi @cgairik,
Topics take a decent amount of time to refine, as it involves working through how the system picks a field based on the plain text that gets used. It’s difficult to assist without more detailed examples, so when you ask a question about enrollment, how detailed is your question? Once you’ve asked a question, you can see how the system interpreted it.

One thing you could explore: if the system is using dataset 1 because of your one field, ‘enrollment timestamp’, you can change the friendly name of that field to something that does not include the word enrollment. That way, when you ask a question about enrollment, it should default to dataset 2, which holds the enrollment information, instead of the first dataset.

One limitation that I can’t find in any documentation, but have found to be true through testing topics: while you are able to add multiple datasets to a topic, a question will only be answered from one dataset (whichever one the system feels is the best fit). I do not believe it can build visuals from fields/columns of 2 different datasets, much like a visual in an analysis.

So my thoughts for your situation are that however the question is being worded, it’s linking to the enrollment timestamp field and then trying to work within the boundaries of that first dataset (as it can’t pull fields from both dataset 1 and 2 for the answer).
Does the enrollment timestamp field contain data that is necessary for the answer, or would the data from dataset 2 be sufficient to answer on its own?
If your answer needs the timestamp AND information from fields in dataset 2, I believe you’ll need to join the two tables into a single dataset and use that joined dataset in your topic.
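
To illustrate what a joined dataset would look like, here is a rough sketch in plain Python. It is not QuickSight code, and every column name and value in it (`user_id`, `age_group`, `plan`, `status`) is made up for the example, since I don’t know your actual schema. It just shows the idea: each enrollment row picks up the timestamp from the user dataset, so one dataset holds everything a question might need.

```python
# Hypothetical sample rows; all column names and values are assumptions.
users = [
    {"user_id": 1, "age_group": "25-34", "enrollment_timestamp": "2024-01-05T10:00:00"},
    {"user_id": 2, "age_group": "35-44", "enrollment_timestamp": "2024-02-11T09:30:00"},
]
enrollments = [
    {"user_id": 1, "plan": "basic", "status": "active"},
    {"user_id": 2, "plan": "premium", "status": "cancelled"},
    {"user_id": 3, "plan": "basic", "status": "active"},  # no matching user row
]


def left_join(left, right, key):
    """Return every row of `left`, extended with columns from matching `right` rows."""
    # Index the right-hand rows by the join key for quick lookup.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for left_row in left:
        matches = index.get(left_row[key])
        if not matches:
            # Keep unmatched left rows; the right-hand columns are simply absent.
            joined.append(dict(left_row))
            continue
        for right_row in matches:
            joined.append({**left_row, **right_row})
    return joined


# Enrollment rows drive the result; user columns are attached where available.
joined = left_join(enrollments, users, "user_id")
for row in joined:
    print(row)
```

In QuickSight itself you would build the equivalent join while preparing the dataset, then add only that joined dataset to the topic so there is no second dataset for the question to be matched against.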

Hi @cgairik,
Following up here as it’s been a while since we last heard from you. Were you able to find a workaround that gives you a proper answer to your question, or are you still working on this?

If you have any additional questions, feel free to let us know. If we do not hear back within the next 3 business days, I’ll close out this topic.

Thank you

Hi @cgairik,
Since we have not heard back, I’ll go ahead and close out this topic. However, if you have any additional questions, feel free to create a new post in the community and link this discussion for relevant information if needed.

Thank you