Level aware sum / sumOver problem

I used your CSV file and still got the correct result. Your calculated field is not the same as mine.

Yours:
sumOver(total,[{rule_name},{event_date},{channel_id}],PRE_AGG)

Mine:
sum(min(total, [{rule_name}, {event_date}, {channel_id}]))

Did you try my calculated field? What was wrong with it? You don’t need to use sumOver.

1 Like

To understand why your calculated field doesn’t work, add it to a flat table visual. I highlighted all rows with ICD-10 below.

When you do min(custom_total_3), for ICD-10 QuickSight is taking the min from [1, 1, 81, 1, 1, 977, 2, 2, 8, 8, 6] which is 1.

If you change your min(custom_total_3) to sum(custom_total_3), it also doesn’t work because QuickSight will take the sum of [1, 1, 81, 1, 1, 977, 2, 2, 8, 8, 6]. The two 2’s will add up to 4 and the two 8’s will add up to 16 but you want those values to be 1 and 4 respectively.

1 Like

@David_Wong Thanks for the explanation.

sum(min(total, [{rule_name}, {event_date}, {channel_id}])) worked on the given example but didn’t work on other data.

here is the sample data where it didn’t work, not sure why.

Expected output
ICD-10 : 93, ICD-9 : 14, FDA Code : 20 total: 127

But, What I got is 86,14,19,119 respectively.

The Sample Data can be found here.

1 Like

Here’s your new dataset filtered by “FDA Code”.

I don’t understand why the expected result for FDA Code is 20 and not 19. All combinations of channel_id and rule_name are unique. If there aren’t duplicates, the result should be simply the sum of all the numbers which is 19.

1 Like

sorry @David_Wong, you are correct about FDA code…
Looks like a few data didn’t export when I exported from the QS dataset.
And I didn’t bother verifying the export data…

But according to the file I have shared, ICD-10 should be 89 but I am getting 86.
What do you get on your end?

highlighted the duplicate rows here

The reason why you’re not getting the expected result is because of these 2 rows:

The partition that we’re using contains the following fields:
rule_name
event_date
shared_with

This means that when QuickSight is doing the inner aggregation, it’s ignoring the owner_name field and taking the min of the two values which is 1.

To fix this, we have to update the calculated field by adding owner_name to the partition:
sum(min(total, [{rule_name}, {event_date}, {channel_id}, {owner_name}]))

image

Whenever you have to deal with aggregation problems like this one, I would suggest first putting all your fields in a flat table visual. That will help you see any duplicates in your data and the fields that are introducing those duplicates. You’ll then be able to identify the partition that you need to use in your level-aware calculation.

1 Like

Btw the only purpose of the min here is to exclude the duplicates. You can use max or avg, and it will give you the same result if your partition is correct.

2 Likes