How to avoid duplicates when joining Fact and Mapping tables in QS

Prav · September 18, 2025, 10:26pm

Hi All,

I’m facing an issue with duplicate records when joining a Fact table to a Channel mapping table in Quick Sight.

Sample Data:

Fact Table (Fact_Sales):z

	B	C	D	E	F	G	H	I
Business	Version	Year	Company	Channel	Brand	Group	Month	Value
Biz1	V1	2025	C1	ChX	Br1	G1	Jan	100
Biz1	V1	2025	C1	ChX	Br1	G1	Feb	200
Biz1	V1	2025	C1	ChY	Br2	G2	Jan	50

Channel Mapping (Channel_Map):

	B	C
Channel Plan	Channel Group	Channel
PlanA	Retail	ChX
PlanB	Corporate	ChY
PlanB	Employee	ChY

Brand Mapping (Brand_Map):

	B	C	D
Brand	Series	Code	Description
Br1	S1	G1	Product A
Br2	S3	G2	Product B

Problem:

Common Column Channel

When I join the Fact table with the Channel mapping on Channel, rows with ChY get duplicated because in the mapping table ChY belongs to both Corporate and Employee groups.

So instead of Value = 50, I end up with Value = 100.

Constraints:

I cannot make changes to the source files.
I can only work with Quick Sight dataset joins and calculated fields inside SPICE or Analysis.

What I’ve tried so far:

Created calculated fields to normalize keys (e.g. trim/upper) → joins work but still duplicate values.
Tried dividing values by a count of duplicates → works sometimes, but breaks in some groups.
Looked into filters at dataset level → but that removes some valid mappings.

Question:

What’s the best practice in Quick Sight to handle this kind of one-to-many mapping without changing the source files?

Should I solve this in the dataset join itself (if so, how)?
Or is there a recommended way with calculated fields to avoid double-counting?

Thanks

andrepgn · September 19, 2025, 8:09am

Hi @Prav ,

you can have different approaches here I think.

First one can be change the way you retrieve the data. Are those data in a DB that you can query?
If so, you can create a custom query, removing the duplicated lines BEFORE getting the data in Quick Sight.

Approach 1: eliminate the field Channel_Group (this assumes that this field is the only one that can have different values in the rows you want to merge)

SELECT DISTINCT f.*, cm.Channel_Plan
FROM Fact_Sales f 
LEFT JOIN Channel_Map cm ON f.Channel = cm.Channel

This will result in something like this.

Business	Version	Year	Company	Channel	Brand	Group	Month	Value	Channel_Plan
Biz1	V1	2025	C1	ChX	Br1	G1	Jan	100	PlanA
Biz1	V1	2025	C1	ChX	Br1	G1	Feb	200	PlanA
Biz1	V1	2025	C1	ChY	Br2	G2	Jan	50	PlanB

Approach 2: aggregate the Channel_Group field in a single field (same assumption as before regarding the changing values of the fields)

SELECT 
    f.*,
    cm.Channel_Plan,
    ARRAY_JOIN(ARRAY_AGG(DISTINCT cm.Channel_Group), '/') as Channel_Groups
FROM 
    Fact_Sales f
    LEFT JOIN Channel_Map cm ON f.Channel = cm.Channel
GROUP BY 
    f.Business, 
    f.Version, 
    f.Year, 
    f.Company, 
    f.Channel, 
    f.Brand, 
    f.Groups, 
    f.Month, 
    f.Value,
    cm.Channel_Plan

This will result in something like this:

Business	Version	Year	Company	Channel	Brand	Group	Month	Value	Channel_Plan	Channel_Groups
Biz1	V1	2025	C1	ChX	Br1	G1	Jan	100	PlanA	Retail
Biz1	V1	2025	C1	ChX	Br1	G1	Feb	200	PlanA	Retail
Biz1	V1	2025	C1	ChY	Br2	G2	Jan	50	PlanB	Employee/Corporate

Approach 3: eliminating the doubles in Quick Sight.
You can do this by, e.g., selecting all the other fields and not the Channel_Group one, and then using an Average aggregation on the Value field.

Let me know if some of these might help!
Andrea

Prav · September 19, 2025, 10:51am

Hi Andrea,

Thanks a lot for the detailed explanation and examples — very helpful!

I do have a follow-up: in my case, I must keep Channel Group in the output since the business requires reporting broken down by Retail, Employee, Corporate, etc. Because of that, Approach 3 (dropping Channel Group and averaging the values) doesn’t work for me.

Unfortunately, I can’t modify the source files (the data comes in as CSVs), so I don’t have the option of running SQL to apply DISTINCT or ARRAY_JOIN as in Approaches 1 and 2.

Is there any way in Quick Sight itself (either in the dataset prep layer or with calculated fields) to achieve something similar to Approaches 1 or 2 — basically deduplicating the join so each Fact row only maps once, while still keeping Channel Group available for reporting?

Thanks again for your help!

Prav · September 22, 2025, 9:21am

The issue is resolved. Made a work around in the Spice layer on one of the Datasets..

Thanks

Praveen

andrepgn · September 22, 2025, 9:49am

Hi @Prav,
can you sorry for being late, but happy to see you solved the issue!
Can you explain what you’ve done for future reference to the Community?

Andrea

Topic		Replies	Views
2 Fact tables in one dataset Q&A error , data-preparation	15	1947	September 11, 2023
Values being duplicated when joining two tables Q&A data-source , data-preparation	2	1486	August 31, 2022
Multiple Duplicate Rows getting created on join function Q&A analysis , filters , dataset , calculations , Business-Intelligence-Engineer	6	240	October 24, 2024
Many to many without duplicating data? Q&A	1	532	August 27, 2023
How can I remove duplicate from a table after Left join in Quick Sight? Q&A data-source , calculations	3	2726	October 20, 2022

How to avoid duplicates when joining Fact and Mapping tables in QS

Sample Data:

Problem:

Constraints:

Related topics