Avoid double counting distincts in timelines visuals

In my Quick Sight dashboard on AWS Amazon Enterprise Edition, I’ve successfully counted the distinct values of a field named ‘Business’ for businesses that attended an event during the financial year. The count appears accurate when using the KPI visual, which correctly calculates the distinct values.

However, when attempting to represent this data over time using a Clustered bar combo chart, Quick Sight aggregates the distinct counts for each month independently, resulting in double-counting businesses that were previously counted in earlier months within the year. This leads to inflated totals when summing up the counts for each month.

To resolve this issue, I need to ensure that Quick Sight counts each business only once based on its first occurrence within each month. This way, I can avoid double-counting and accurately represent the distinct counts per month without inflating the totals.

Could you please provide guidance on how to achieve this within Quick Sight without resorting to external SQL aggregation or creating additional datasets?

Hi @VincDBT

did you try a group-level within the count?

distinct_count({business}, [{your_date_field}])

BR

Hi @VincDBT , @ErikG ,

If you plot Distinct Count of Business on a bar chart against date, it will give you the distinct count for each date partition. However, here Vin is plotting it on a clustered combo bar chart. ie - there is another dimension (lets say DimX) in play for the bars. In that case, default behavior is to count distinct businesses for each date and DimX partition. Of course, the overall count at the bar level will now have businesses counted multiple times.

Counting over just date partition will result in same customer count being repeated across all the vertical segments of the bar.

You face these issues as Distinct count is a non additive measure and hence can’t be rolled up to higher levels.

While using just the first customer record within each month can help fix the overall total for each month, it will be reporting the numbers incorrectly for each segment within the bar.
That being said, there can be cases wherein this is exactly what you want to do - Like may be a chart showing first orders that customers made each month partitioned over days of week.
In such cases, you can use something like the below calc

ifelse(
           {Order Date} = minOver({Order Date},[truncDate('MM',{Order Date}), {Customer}],PRE_FILTER), 
           {Customer}, 
           NULL)

Regards,
Arun Santhosh
Pr Quick Sight SA