Feature Request - Specifying Field Names Better

Sean_Middlehurst · July 5, 2024, 8:33am

Say you have two datasets that you wish to join, “Dataset1” and “Dataset2”, and they share a field name (not necessarily the field used for the join, and possibly containing wildly different data). Say this field is called “SimilarField”. The new dataset created from the two originals will have a field called “SimilarField” and another called “SimilarField[Dataset2]” to distinguish the second field with the same name. Sometimes it will instead have “SimilarField” and “SimilarField[Dataset1]”; I’ve not worked out which field QuickSight gives priority to.

The issue this creates is that it’s often difficult to tell which parent dataset a field comes from. Naturally you’d assume that if you’ve got SimilarField and SimilarField[Dataset2], then the first field is from Dataset1. But this gets harder to determine when you’ve got 4-5 or even more datasets joined together and only some of them have a SimilarField. This is especially the case for Author users who only have User access to the child dataset, so can’t verify which fields each parent dataset has.

A bigger but more niche issue is this: say you have two datasets, ChildDataset1 and ChildDataset2, where ChildDataset1 is made by joining Dataset1 and Dataset2, and ChildDataset2 is made by joining Dataset3 and Dataset4. Dataset1 and Dataset3 have identical fields, and so do Dataset2 and Dataset4, but the queries are filtered differently. However, ChildDataset1 has fields called SimilarField and SimilarField[Dataset2], and ChildDataset2 has fields called SimilarField and SimilarField[Dataset3]. To clarify:

SimilarField on ChildDataset1 is from Dataset1.
SimilarField[Dataset2] on ChildDataset1 is from Dataset2.
SimilarField[Dataset3] on ChildDataset2 is from Dataset3.
SimilarField on ChildDataset2 is from Dataset4.

SimilarField (from Dataset1) and SimilarField[Dataset3] are pulling the same kind of data, as are SimilarField (from Dataset4) and SimilarField[Dataset2], but the two pairs of fields are pulling very different data (for the sake of argument, it’s the same datatype, but with different values than you’d expect).

If I had an analysis built from ChildDataset1 using SimilarField, saved it as a new analysis, and replaced ChildDataset1 with ChildDataset2, it would map without error, but SimilarField on ChildDataset1 would map to SimilarField on ChildDataset2, where I would prefer it to map to SimilarField[Dataset3]. Since there is no verification step when the fields map without error, a user could easily overlook this incorrect mapping and clone the report erroneously.
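
To make the failure mode concrete, here is a minimal sketch of the example above. It assumes that replacing a dataset matches fields purely by name; that is a guess at the behaviour for illustration, not confirmed QuickSight logic, and the helper names are hypothetical.

```python
# Toy model only: fields are matched purely by name, which is an assumption
# about how "replace dataset" behaves, not confirmed QuickSight logic.

# Field name -> parent dataset the field actually comes from (per the example).
child_dataset_1 = {
    "SimilarField": "Dataset1",
    "SimilarField[Dataset2]": "Dataset2",
}
child_dataset_2 = {
    "SimilarField[Dataset3]": "Dataset3",
    "SimilarField": "Dataset4",
}

# Which parents hold the same kind of data (Dataset1 ~ Dataset3, Dataset2 ~ Dataset4).
equivalent_parent = {"Dataset1": "Dataset3", "Dataset2": "Dataset4"}


def map_by_name(old_fields, new_fields):
    """Map each old field to the new field with the identical name, if any."""
    return {name: (name if name in new_fields else None) for name in old_fields}


def find_silent_mismatches(old_fields, new_fields, equivalents):
    """Return fields that map without error but land on the wrong parent dataset."""
    mismatches = []
    for old_name, new_name in map_by_name(old_fields, new_fields).items():
        if new_name is None:
            continue  # an unmapped field would already be flagged to the user
        expected_parent = equivalents[old_fields[old_name]]
        actual_parent = new_fields[new_name]
        if actual_parent != expected_parent:
            mismatches.append((old_name, new_name, actual_parent, expected_parent))
    return mismatches


mismatches = find_silent_mismatches(child_dataset_1, child_dataset_2, equivalent_parent)
for old_name, new_name, actual, expected in mismatches:
    print(f"{old_name} -> {new_name}: comes from {actual}, expected {expected}")
```

In this toy model, SimilarField maps cleanly by name but ends up pointing at Dataset4 rather than Dataset3, which is exactly the kind of swap that would go unnoticed without a verification step.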

I believe this issue would be fixed if every SimilarField specified its parent dataset: for instance, if in the example ChildDataset1 had SimilarField[Dataset1] and SimilarField[Dataset2], rather than one of them having an unqualified name. The only problem I can see with this is field name display (e.g., if a table shows SimilarField with no renaming, and one day it just changed to SimilarField[Dataset1], it may look unappealing). This could be solved by renaming each changed field in the visual to what it originally was or, what may be a lot easier, by only applying this change to new datasets.
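
As a rough sketch of the naming rule I’m proposing (not how QuickSight currently names fields), every field name that appears in more than one parent gets qualified with its parent dataset, and unique names are left alone. The function name and the OrderId/CustomerId fields are made up for illustration, and it assumes no parent has duplicate field names within itself.

```python
from collections import Counter


def qualify_colliding_fields(parents):
    """Name joined fields so that every colliding name carries its parent dataset.

    `parents` maps a parent dataset name to its list of field names.
    Returns a list of (output_name, parent_dataset) pairs in join order.
    """
    # Count how many parents contribute each field name.
    counts = Counter(field for fields in parents.values() for field in fields)
    result = []
    for dataset, fields in parents.items():
        for field in fields:
            # Qualify only names that collide; unique names stay unchanged.
            name = f"{field}[{dataset}]" if counts[field] > 1 else field
            result.append((name, dataset))
    return result


joined = qualify_colliding_fields({
    "Dataset1": ["OrderId", "SimilarField"],      # hypothetical fields
    "Dataset2": ["CustomerId", "SimilarField"],
})
for name, dataset in joined:
    print(f"{name} (from {dataset})")
```

With this rule, neither SimilarField is left unqualified, so the result no longer depends on which dataset was added first.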

Xclipse · July 5, 2024, 9:15pm

Hi @Sean_Middlehurst, besides using SQL, I’m out of ideas. Sorry, this is not currently possible, but I’m marking it as a feature request. AWS’s roadmap is primarily driven by customers, and your feedback helps them build a better service. More features are being added on a regular basis, so please keep an eye on the What’s New / Blog. You can set up a Watching Alert by clicking on the bell icon.

You could also look into opening a case with AWS to see if they can provide a solution.

Here are the steps to open a support case. If your company has someone who manages your AWS account, you might not have direct access to AWS Support and will need to raise an internal ticket with your IT team or whoever manages your AWS account. They should be able to open an AWS Support case on your behalf. Hope this helps!

David_Wong · July 5, 2024, 9:50pm

Hi @Sean_Middlehurst,

I’ve run into this issue too. I think the priority is based on the order in which the tables were added to the dataset. If you hover over the field name, you’ll see what table it comes from.

Xclipse · July 5, 2024, 10:26pm

Thanks @David_Wong, that’s good to know.

Sean_Middlehurst · July 8, 2024, 7:25am

Hi @David_Wong,

Thanks for verifying my issue. I’m not 100% sure that the order theory is true. Basically, if we consider my previous example, I would have created ChildDataset1 and ChildDataset2 in the same order, so ChildDataset1 would have started with Dataset1 and joined Dataset2, and ChildDataset2 would have started with Dataset3 and joined Dataset4, but the SimilarFields would be swapped as described.

Come to think of it, the above example may not have been exactly as I’d described. For the sake of argument, let’s say ChildDataset2 was created first. It could have been that when ChildDataset2 was created, Dataset3 didn’t have SimilarField and it was added later, which would explain why the SimilarField from Dataset3 has the brackets and the one from Dataset4 doesn’t. That’s the only way I can imagine this example occurring if the order logic is correct. Ideally, a QuickSight dev could confirm this.

Regardless of the above, this is still an issue. Since the vast majority of my datasets are based on SQL, I could rename the fields in the query for each dataset. However, because I create many datasets by copy-pasting queries and amending the filters, it would be a pain to rename the fields in the query each time, and it would remove the consistency between these similar datasets. On top of that, a lot of my analyses are also copy-pasted, and the replace dataset feature is very helpful when no manual field substitutions are needed.
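
For reference, the SQL renaming workaround would look something like the sketch below: a small helper that builds the aliased SELECT list so it doesn’t have to be typed by hand for each copy. The helper, the table name my_table and the Amount column are all hypothetical, and it assumes column names are plain identifiers that need no quoting. It doesn’t solve the consistency problem, since the field names still differ between otherwise identical datasets.

```python
def aliased_select_list(table_alias, columns, label):
    """Build a SELECT list that prefixes every output column with a dataset label."""
    return ",\n".join(
        f"    {table_alias}.{col} AS {label}_{col}" for col in columns
    )


# Hypothetical table and columns, for illustration only.
query = (
    "SELECT\n"
    + aliased_select_list("t", ["SimilarField", "Amount"], "Dataset1")
    + "\nFROM my_table AS t"
)
print(query)
```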

Thanks @Xclipse for reporting the issue. I can imagine it’s more of a bug fix than a feature, or at least something that wouldn’t be heavily advertised on release notes, but I’ll look out when creating datasets if anything changes.

Xclipse · July 9, 2024, 3:16am

Agreed. I’m also tagging this as both “Feature Request” and “Error”.