How to avoid duplicates when joining Fact and Mapping tables in QS

Hi All,

I’m facing an issue with duplicate records when joining a Fact table to a Channel mapping table in Quick Sight.

Sample Data:

Fact Table (Fact_Sales):z

B C D E F G H I
Business Version Year Company Channel Brand Group Month Value
Biz1 V1 2025 C1 ChX Br1 G1 Jan 100
Biz1 V1 2025 C1 ChX Br1 G1 Feb 200
Biz1 V1 2025 C1 ChY Br2 G2 Jan 50

Channel Mapping (Channel_Map):

B C
Channel Plan Channel Group Channel
PlanA Retail ChX
PlanB Corporate ChY
PlanB Employee ChY

Brand Mapping (Brand_Map):

B C D
Brand Series Code Description
Br1 S1 G1 Product A
Br2 S3 G2 Product B

Problem:

Common Column Channel

When I join the Fact table with the Channel mapping on Channel, rows with ChY get duplicated because in the mapping table ChY belongs to both Corporate and Employee groups.

So instead of Value = 50, I end up with Value = 100.

Constraints:

  • I cannot make changes to the source files.

  • I can only work with Quick Sight dataset joins and calculated fields inside SPICE or Analysis.

What I’ve tried so far:

  • Created calculated fields to normalize keys (e.g. trim/upper) → joins work but still duplicate values.

  • Tried dividing values by a count of duplicates → works sometimes, but breaks in some groups.

  • Looked into filters at dataset level → but that removes some valid mappings.

Question:

What’s the best practice in Quick Sight to handle this kind of one-to-many mapping without changing the source files?

  • Should I solve this in the dataset join itself (if so, how)?

  • Or is there a recommended way with calculated fields to avoid double-counting?

    Thanks

Hi @Prav ,

you can have different approaches here I think.

First one can be change the way you retrieve the data. Are those data in a DB that you can query?
If so, you can create a custom query, removing the duplicated lines BEFORE getting the data in Quick Sight.

Approach 1: eliminate the field Channel_Group (this assumes that this field is the only one that can have different values in the rows you want to merge)

SELECT DISTINCT f.*, cm.Channel_Plan
FROM Fact_Sales f 
LEFT JOIN Channel_Map cm ON f.Channel = cm.Channel

This will result in something like this.

Business Version Year Company Channel Brand Group Month Value Channel_Plan
Biz1 V1 2025 C1 ChX Br1 G1 Jan 100 PlanA
Biz1 V1 2025 C1 ChX Br1 G1 Feb 200 PlanA
Biz1 V1 2025 C1 ChY Br2 G2 Jan 50 PlanB

Approach 2: aggregate the Channel_Group field in a single field (same assumption as before regarding the changing values of the fields)

SELECT 
    f.*,
    cm.Channel_Plan,
    ARRAY_JOIN(ARRAY_AGG(DISTINCT cm.Channel_Group), '/') as Channel_Groups
FROM 
    Fact_Sales f
    LEFT JOIN Channel_Map cm ON f.Channel = cm.Channel
GROUP BY 
    f.Business, 
    f.Version, 
    f.Year, 
    f.Company, 
    f.Channel, 
    f.Brand, 
    f.Groups, 
    f.Month, 
    f.Value,
    cm.Channel_Plan

This will result in something like this:

Business Version Year Company Channel Brand Group Month Value Channel_Plan Channel_Groups
Biz1 V1 2025 C1 ChX Br1 G1 Jan 100 PlanA Retail
Biz1 V1 2025 C1 ChX Br1 G1 Feb 200 PlanA Retail
Biz1 V1 2025 C1 ChY Br2 G2 Jan 50 PlanB Employee/Corporate

Approach 3: eliminating the doubles in Quick Sight.
You can do this by, e.g., selecting all the other fields and not the Channel_Group one, and then using an Average aggregation on the Value field.

Let me know if some of these might help!
Andrea

1 Like

Hi Andrea,

Thanks a lot for the detailed explanation and examples — very helpful!

I do have a follow-up: in my case, I must keep Channel Group in the output since the business requires reporting broken down by Retail, Employee, Corporate, etc. Because of that, Approach 3 (dropping Channel Group and averaging the values) doesn’t work for me.

Unfortunately, I can’t modify the source files (the data comes in as CSVs), so I don’t have the option of running SQL to apply DISTINCT or ARRAY_JOIN as in Approaches 1 and 2.

:backhand_index_pointing_right: Is there any way in Quick Sight itself (either in the dataset prep layer or with calculated fields) to achieve something similar to Approaches 1 or 2 — basically deduplicating the join so each Fact row only maps once, while still keeping Channel Group available for reporting?

Thanks again for your help!

The issue is resolved. Made a work around in the Spice layer on one of the Datasets..

Thanks

Praveen

Hi @Prav,
can you sorry for being late, but happy to see you solved the issue! :smiley:
Can you explain what you’ve done for future reference to the Community?

Andrea