Sum of measure for distinct records

hi there,

I have a dataset which includes duplicate records, including a few fields. I need sum of ‘weights’ measure for each option of Q1 however I need sum of weights for distinct records.

To give some details, 1dcount column hold distinct count (unweighted) and 1wcount should hold weighted counts, desired counts shown below -

Please use link below to access dashboard & dataset

Dummy 1

Hi @jitendral.yadav,

In regards to summing the weights; i would suggest incorporating a max or min function. Since you’re dealing with duplicate values per record, if you grab the max or min (will be the same value since you’re dealing with duplicates), that will narrow down each record to the weight value as opposed to multiplying it by the count of rows in your dataset.

Let me know if this could work for your scenario or if you have any additional questions

1 Like

Hi @Brett ,

Thanks for looking into this.

I tried using “sum(maxOver(WEIGHTS,[RECORD,Q1,GROUP],PRE_AGG))” but this gives me sum of duplicate weights.

@jitendral.yadav try using just max and then specifying the group by level. I took a copy of your Dummy analysis and when I updated the code to be this: sum(max(weight, [ResponseId])) the numbers seem to be much closer. Interestingly enough the Q2 numbers in the analysis match the H1 numbers in your screenshot so I don’t know if it entirely fixed the problem but these numbers seem to be trending in the right direction of getting closer to your provided numbers

3 Likes

Hi @mbradley ,

Thank you! This is exactly what I needed.

But just wondering what was the issue with calculation I was using (i.e. sum(maxOver(WEIGHTS,[RECORD,Q1,GROUP],PRE_AGG)).

1 Like

sum(maxover) won’t give the correct results in the pivot table because of 2 reasons. 1. It runs before the data is grouped for the visual since you specified PRE_AGG and takes the max of each row in the group you defined and then 2. Sums those numbers together for the number of rows present causing the inflated numbers. If you wanted to use maxover and truly get the ‘max’ value for that grouping you would actually need to do max on your maxover so it would be max(maxover….).

Here is a visual example that might help provide some detail. In the top table, you can see I have added all the columns in the dataset and for a single response id there are multiple rows. When sum(maxover) runs, this is how it is looking at the data since it is running PRE_AGG. So even though you identified the partition levels there are multiple rows for this Response/Wave/Q1 group so it is first identifying the max value per row of your group and then summing together the rows for the grouping you identified which caused the inflated numbers. If you just use max this function is not PRE_AGG so it only evaluates the max once per group. It essentially does it in a slightly different order and identifies the max across the group, groups the data and then sums it and since grouping the data by those 3 columns only results in one row it only ‘sums’ together one row rather than 5. I hope that helped provide a little clarity and didn’t make thinks more confusing!

2 Likes

Thank you so much @mbradley for clarifying my doubts. It makes sense now.

1 Like

Hi @mbradley ,

I have an additional request to show %s as well along with counts. So sum(max(weight,[ResponseId]) gives me correct counts, is there a way I can get a quarter wise total as denominator? i.e. For Happy I need 621.85/ 1407.47 for 25Q2 likewise for other options.

Thanks in advance.

You can actually do this without a second calculation. You want to add 1wcount to the value field well as a column again so you will essentially have the same column twice to start. Then on the second one you just added, hit the 3 buttons to expand your menu options and select ‘Add Table Calculation’ > ‘Percent of Total’ and this will do the calculation for you and then you can rename the value headers in the Properties formatting pane. One note is that the Percent of Total defaults to Percent of total ‘across’ or for the row so to change that you’ll want to open up the extra menu options again and go to ‘Calculate as’ and here you can change how the visual calculates the percent of total and select ‘Table Down’ to give you the Quarter Total as the denominator. Hope this helps!

Hi @mbradley ,

I tried above method you mentioned and it gives correct output for first table. But second tables isn’t correct. For example, if you look at below screenshot for Accuracy it should be approx. 30% but I am getting 11%.

I tried the calculation “sumOver(sum(max(weight,[ResponseId])),[WAVE])” but it gives me sum of all the rows which holds true for mutually exclusive rows but not for overlapping responses/ rows.