I have a dataset that contains duplicate rows for the same “Event ID”, which is meant to be the primary key. I would like to keep only one row for each unique Event ID and discard any additional rows that share that Event ID.
The dataset has several other columns besides Event ID, but the deduplication should be based on the Event ID column alone.
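To make the desired outcome concrete, here is a minimal Python sketch of the result I am after. It is only an illustration and not something I can run against the dataset: the "Status" column and the sample values are made up, and keeping the first row seen for each Event ID is an assumption, since any single row per Event ID would do.

```python
def keep_first_per_event_id(rows, key="Event ID"):
    """Return rows with only the first occurrence of each key value kept."""
    seen = set()
    unique_rows = []
    for row in rows:
        event_id = row[key]
        # Skip any row whose Event ID has already been kept.
        if event_id not in seen:
            seen.add(event_id)
            unique_rows.append(row)
    return unique_rows


# Hypothetical sample data: "Status" stands in for the other columns.
sample_rows = [
    {"Event ID": 101, "Status": "open"},
    {"Event ID": 101, "Status": "closed"},
    {"Event ID": 102, "Status": "open"},
]

deduplicated = keep_first_per_event_id(sample_rows)
```

With this sample, the second row for Event ID 101 is dropped and the other columns of the surviving rows are left untouched.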
The challenge is that I am unable to perform any ETL work on the dataset. I need to find a way to remove the duplicate rows directly within the QuickSight platform, preferably at the dataset level, to make it easier for the business users to create their analyses.
What is the best approach to identify and remove these duplicate rows in QuickSight, keeping only a single row for each unique Event ID?
Thank you in advance for your help!