Incorrect 90th Percentile Calculation Due to Duplicated Rows

Hi QuickSight Experts,

I’m encountering an issue while trying to calculate the 90th percentile of “Days to Deliver” in my dataset. The metric is being skewed by duplicate rows introduced by joins in the data model.

:bar_chart: Sample Dataset Structure:

| DeliveryUniqueKey | Days to Deliver |
| --- | --- |
| D1 | 2 |
| D1 | 2 |
| D1 | 2 |
| D2 | 4.5 |
| D2 | 4.5 |
| D3 | 3.3 |
| D3 | 3.3 |
| D4 | 7 |
| D5 | 9 |
| D6 | 12 |
| D6 | 12 |
| D7 | 11 |
| D7 | 11 |
| D8 | 19 |
| D9 | 9 |
| D9 | 9 |

:abacus: What I’m Trying to Do:

Calculate the 90th percentile of Days to Deliver based on unique DeliveryUniqueKey values.

:warning: Issue:

Since each DeliveryUniqueKey appears multiple times (due to joins), my current calculations are giving me the wrong result:

  • Expected 90th percentile = 13.4
  • QuickSight returns = 12 or 19
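To sanity-check these numbers outside QuickSight, here is a small Python sketch (illustrative only, not QuickSight syntax) that computes a linearly interpolated 90th percentile — the same method as `percentileCont` — over the raw duplicated rows versus the deduplicated keys, using the sample data above:

```python
def percentile_cont(values, p):
    """Linearly interpolated percentile, like percentileCont."""
    s = sorted(values)
    k = (len(s) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)

# Sample data: (DeliveryUniqueKey, Days to Deliver), duplicates included.
rows = [
    ("D1", 2), ("D1", 2), ("D1", 2),
    ("D2", 4.5), ("D2", 4.5),
    ("D3", 3.3), ("D3", 3.3),
    ("D4", 7), ("D5", 9),
    ("D6", 12), ("D6", 12),
    ("D7", 11), ("D7", 11),
    ("D8", 19), ("D9", 9), ("D9", 9),
]

# Percentile over all 16 raw rows: the duplicates drag the result down.
raw = percentile_cont([days for _, days in rows], 90)

# Deduplicate by key first (one value per key), then take the percentile.
dedup = dict(rows)
expected = percentile_cont(list(dedup.values()), 90)

print(raw, round(expected, 2))  # → 12.0 13.4
```

This reproduces both results: 12 from the duplicated rows, and the expected 13.4 once each `DeliveryUniqueKey` is counted only once.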

:test_tube: What I’ve Tried:

  • Using raw measures:
percentileDisc({Days to Deliver}, 90)
percentileCont({Days to Deliver}, 90)
  • Calculating average per unique key:
sum({Days to Deliver}, [DeliveryUniqueKey]) / count({Days to Deliver}, [DeliveryUniqueKey])

But the duplicate rows are still being considered in the percentile calculation.

:red_question_mark: Question:

How can I correctly compute the 90th percentile in QuickSight by deduplicating DeliveryUniqueKey first (i.e., treating each key once with its correct delivery days)?

Appreciate any help or workaround you can suggest!

@duncan @ytakahr @Giridhar.Prabhu @Rahul_Easwar

Thank you!

Hello @Purushothaman

The main issue here is deduplicating the rows. I would recommend creating a calculated field at the dataset layer that marks each row as an original or a duplicate, and then filtering out the duplicates at that same layer.

Here’s an example:
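As a sketch of that mark-and-filter logic (plain Python standing in for the dataset-level calculated field — the "original"/"duplicate" flag values are illustrative, and the data is the sample from the question):

```python
# Flag the first occurrence of each DeliveryUniqueKey as "original" and
# every later occurrence as "duplicate", then filter before aggregating.
rows = [
    ("D1", 2), ("D1", 2), ("D1", 2),
    ("D2", 4.5), ("D2", 4.5),
    ("D3", 3.3), ("D3", 3.3),
    ("D4", 7), ("D5", 9),
    ("D6", 12), ("D6", 12),
    ("D7", 11), ("D7", 11),
    ("D8", 19), ("D9", 9), ("D9", 9),
]

seen = set()
flagged = []
for key, days in rows:
    flag = "original" if key not in seen else "duplicate"
    seen.add(key)
    flagged.append((key, days, flag))

# Keep only the originals — the filter you would apply at the dataset layer.
originals = [days for _, days, flag in flagged if flag == "original"]
print(sorted(originals))  # → [2, 3.3, 4.5, 7, 9, 9, 11, 12, 19]
```

With the duplicates filtered out, any percentile calculation in the analysis then runs over one row per `DeliveryUniqueKey` and returns the expected 13.4.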