Filter not applied to the distinct count of records

Hi all,

I just incurred in an issue with QS for something which I thought was straightforward and I’d like to understand if I am doing something wrong here.

TLDR: I would like to use a visual on top which simply return the count of hits I will get in the main table. This is done by a distinct count of the user IDs. The issue is that the filters based on measures do not get applied to this visual.

Long version.

Let’s say that my initial data is something like this:

UserID Date Gender Purchases Sales
123 01/01/2023 Male €0 €100
345 01/01/2023 Female €100 €200
123 02/01/2023 Male €50 €100
123 03/01/2023 Male €50 €50
345 03/01/2023 Female €200 €0
345 04/01/2023 Female €150 €500

Very straightforward: aggregated daily data with some user information.

My dashboard is just a table with a few filters on top.
I can select a date range and I will get the total amount of purchases and sales in that period, plus I can filter on gender and (this is the core of the issue), the minimum purchases or sales amount.
Say I can select the Date from 01/01/23 to 03/01/23 and I would get something like this:

UserID Gender Purchases Sales
123 Male €100 €250
345 Female €300 €200

I want then to create another visual with the distinct count of users returned here.
What I do is adding a KPI visual with the distinct count of userID applying the same filters. So I should simply get “2” in this case.
The issue now is with the filter based on the measures.
I create one more filter based on “Purchases”, and I select:

  • Aggregation: Sum
  • Filter Condition: Greater Than or Equal To -1 (this way by default it’ll always return all users)
  • NULL option: exclude

I add it to the sheet and pin it on top using the setting to use a text field here.
Keeping the previous filters, I then select “250”. My expectation is that it’ll show me only clients whose total purchases were at least €250 in the time window I specified.
My main table gets updated:

UserID Gender Purchases Sales
345 Female €300 €200

My new visual with the users count, does not.
The filter does not seem to have an effect.
Please note that:

  • The filter is correctly applied to the visual
  • When I use it, I can see the visual being recalculated, but then the result does not change

Am I doing something wrong here?
Any reason why the count is not being updated properly?
I am guessing that I am missing something really small and trivial because this is not a complicated task, but I cannot figure out what.

Thanks a lot.

So your KPI says 2 when it should say 1?

Can you duplicate your KPI and make it a table that list’s out the UserIDs?

Hello @Max,

First of all, thank you for helping me!
Yes, the KPI keeps showing the original number, in my example 2.

The filter has simply no effect so it shows all users, after the other filters have been applied. I also tried duplicating my original table, removing all fields, turning it into a KPI and adding the distinct count of users (to make sure the starting filters and conditions were the same), but still the same.

Thank you again!

Well, if you are removing the group then that might be the issue.

Can you do the opposite of what you did?

Go to the kpi, then duplicate that into a table? And then add the groups to the table from there?

Hi @Max,

Following your suggestion, I once again created a KPI visual, added the count of distinct users, then checked that all filters were applied.
I then duplicated it and changed the KPI into a table, moving the userID in the group by and saved.
When I apply any filters which are not based on aggregated measures, everything gets updated accordingly. But once I apply one filter based on these measures, this not get applied.
Not sure why the grouping here matters: I am selecting the distinct count of userIDs so even if in the time window I select the userID appears ten times, it should be counted only once.
I also tried another experiment: I created a bogus calculated field whose only value is “matches”, then used the table to group by this field and again count the distinct occurrences of userIDs, but nothing changed.
Am I misunderstanding something on how the logic works?

Thank you again for your time, it is much appreciated.

Don’t know if this is happening, wanted to see if you could see it with the excerise above.

But the example I thought of was that a userId is constant across groups.

UserId Gender Purchases
1 Male $100
1 Female $200

If you take away the gender it becomes summed to $300 which would then show a distinct count.

One thing you might also look at, is instead of summing the purchases averaging them.

But even so if you have an issue like the above ^ you should look into that logic first.

Hi @Max,

Sorry, I am not sure I am following you.
Each customer has a unique UserID.
So in your example, the amount would not get summed and become $300.
It would be still showing two users, 123 and 345, and the distinct count of UserIDs would be two. But if I also applied a filter on the amount, say “$200” to the aggregated sum, then while the table would only show the user matching this minimum, the distinct count would still return two.

I could look into the average but the point is that I need the distinct count of users whose total amount is X, and at the moment QuickSight is not providing that and I have no clue why.
And this seems such a standard scenario that I cannot wrap my head around it.

Thank you again for taking time to help me.

I would try filtering on a calculated field like this:

sumOver({Purchases}, [UserID], PRE_AGG)

I believe the filter on the regular Sum won’t work as you have it, because the aggregation is taking place before the filtering.

When you have it in a table, you are essentially using the calculated field above (By placing UserID in the group by)

You can read about the order of evaluation here:

Essentially, because your filter involves an aggregation method (Sum) It filters post aggregation.

Hi @MKuhns01,

Sorry, I am not sure I do understand.
As of now, my table with amount sum seems to be working fine.
I did not find any duplicates or anything, it only shows the individual users and the total purchases / sales amounts for that period.

Please note that I DO have sumOver functions in the table but I do not see how these should be related to my issue. My sumOver functions are basically like the one you mentioned, to calculate the lifetime purchases etc. per user. Something like this:

UserID Gender Purchases Sales Lifetime Purchases Lifetime Sales
123 Male €100 €250 €50 €250
345 Female €300 €200 €450 €700

So I can show both how much a user spends in a certain time window, but also how much they spent in their lifetime.

But again, the problem is with the distinct_count function. Not sure if this is because it is an aggregated function but it does not make sense to me that I do not have a way to calculate the distinct count of users whose amount sum hit my filters thresholds.
It is like an SQL function in which DISTINCT COUNT only takes into account the WHERE clauses but not HAVING.

At this point I am getting the feeling that I am not doing anything wrong but it’s a hard limit of QuickSight, something I was not expecting since, to me, it’s pretty basic asking the tools to answer questions such as “how many unique records do we have whose total amount sum in this timeframe was at least €X?”.
Apologies if I am misunderstanding and it’s something I am doing wrong.

And of course, thank you for the help!

You are not utilizing this correctly from what I can tell.

I recreated the data set you posted above, and was able to return a distinct count of 1 within a KPI visual utilizing the method I recommended.

Here is my result

Here is the calculated field I am filtering on:

image

And here is my dataset:

image

This will still work even if you are using a calculated field to aggregate UserID like this:

distinct_count(UserID)

and then placing that in the values field well. It will still return 1.

From my experience, the order of operations is very important to understand in order to be successful when building these visualizations.

The date filter applied on the visual will filter out the dates wanting to be excluded in the Sum. So instead of us wanting to see “Life Time Purchases” we are returning all Purchases made between 2023/01/01 and 2023/03/01 for each user, and then filtering out the users who don’t meet the threshold.

If you change the PRE_AGG to PRE_FILTER, the date filter won’t have an effect.

1 Like

Hi @MKuhns01,

Thank you so much for taking so much time educating me.
I now understand what you mean, but I am not sure I understand why it works this way.
It seems to me that the new sumOver calculation basically does the same exact as sum does in this case, am I correct? I ran a few experiments and I always got the same results. But the difference is that if I use it in my filter, then the distint_count returns what I am after, as long as I keep that in this case the aggregation is the minimum.
I agree that it is important to understand the order of operations here, it just eludes me why sum is not working and this is bugging me.

Thank you again for helping a stranger in need.

Ciao,
Massimo.

The reason is that sum by itself doesn’t differentiate between groups on its own. This is determined in part by which visual you select.

When you add it to the KPI, the filter by Sum({Purchases}) is adding all of the purchases together within the dates, in this case to $400. It does not distinguish this by UserID.

Then, the filter you applied was trying to filter where that sum was less than $250, but in all cases it was larger than that ($400 for User 123 and $400 for User 345), so it returned both users.

If you apply the method you used to something like a table and add UserID in the group by, Quicksight now will calculate the Sum({Purchases}) for each UserID, and filter out the ones not meeting $250. That is why you got a distinct count of 1 on the table, but 2 within the KPI.

The reason we needed to use the SumOver calculation is because there is no group by within the KPI, so Quicksight has no way to distinguish how to group the Sum. We had to specify Sum by UserID within SumOver.