Take max value by partition and then sum the max value of each partition

Hi!

I have a data set with the following columns (these columns are just for illustrative purposes - my actual columns are different):

the_week,shape,color,count,customer_name,location

The count changes based on shape and color, but not based on customer_name or location.

What I want is a calculated field {Total count} that gives me the sum of the maximum value of count for each group (the_week, shape, color).

This calculated field must work even if all columns are not present in the visual. For instance, if the visual is a table with only the_week as grouping column, then {Total count} should be the sum of max counts within each group (the_week, shape, color) for that week.

Here’s some example tables showing what I’m looking for. The current Total count is way off.

I’ve tried many different things, none of which have been successful. My current calculated field has this:

sum(
    ifelse(
        rank([{the_week} DESC], [{the_week}, color, shape], PRE_AGG) = 1,
        maxOver(
            count,
            [{the_week}, color, shape],
            PRE_AGG
        ),
        0
    )
)

Thanks in advance!

Hi @jonathan1 - can you try this:

sum(max(count, [customer, the_week]))

Ill mark the post as resolved but feel free to re-open if it this didnt work for you and we will figure it out :slight_smile:

1 Like

That worked, thanks a ton @Jesse! I’m amazed at how simple it turned out to be :laughing:

I did run into another issue though - It works great, except for when I use it in a visual and that visual renders an ā€œOtherā€ category. In this case it seems to sum up all the separate results from the Other category. I need the ā€œOtherā€ category to be the maximum of any location (=1673 for Nov 5 in the below example), not the sum of them (=11 711).

1673 * 7 = 11 711

  • Total count for Nov 5: 1673
  • Amount of locations part of ā€œOtherā€: 7

Calculated field total_count:
sum(max(count, [{the_week}, shape, color]))

HI @jonathan1 That might be a little tricky. How are you showing just the 2 locations with ā€˜Other’ - is that another calculated field you wrote or are you using the formatting options to only show 2 colors? You can hide the other category if you dont want it, but if you do want it we can try this:

ifelse(distinct_count(location, [location])>1,
max(max(count, [customer, the_week])),
sum(max(count, [customer, the_week]))
)

I think we’re in 3 levels of aggregation territory once you want to max the items in the other, but we cannot next multiple LAC-A’s together at the moment.

@Jesse Ah, I see. I used the formatting option to only show two colors. In my actual data set the same thing happens, except that the threshold is much higher, say ~20-30 (it has > 100 distinct categories for the corresponding column).

In my actual dashboard I have an on-sheet filter that looks something like this:

The actual value I’m presenting is cost per unit by location, and if I hide the Others category, what happens is (of course) that only a subset of all selected locations are shown. Because of this, someone might, as an example:

  • Select 15 different locations
  • The visual might show only 10 of these
  • The 10 it shows might all be very low cost, whereas the omitted ones are very high. But by looking at the graph, it looks like there are no high cost per unit values.

I tried this, but it gives me

Mismatched aggregation. Custom aggregations can’t contain both aggregated and nonaggregated fields, in any combination.

ifelse(
    distinct_count(location, [location]) > 1,
    max(max(count, [{the_week}, shape, color])),
    sum(max(count, [{the_week}, shape, color]))
)

Do you have any other suggestions? I take it that I might have to settle with hiding Others and using a complementary table to list the cost per unit for all selected locations.

I think we’re in 3 levels of aggregation territory once you want to max the items in the other, but we cannot next multiple LAC-A’s together at the moment.

Do you know whether this is something that’s on the roadmap?

Thanks again!

Sorry for the error on the calc, try like this:

ifelse(
    distinct_count(location) > 1,
    max(max(count, [{the_week}, shape, color])),
    sum(max(count, [{the_week}, shape, color]))
)

Have you thought about using Small Multiples since the cost per unit can be so different between locations? You can choose an independent or shared Y axis for each location. This is using independent:

ifelse(
    distinct_count(location) > 1,
    max(max(count, [{the_week}, shape, color])),
    sum(max(count, [{the_week}, shape, color]))
)

The above lets me save the calculated field, but the visual complains:

The Level Aware Calculation (LAC) aggregate expressions doesn’t work with other category calculation at this moment. Please remove the LAC aggregation expression from metrics or visual dimensions, or turn off the default other category calculation in visual options

Small multiples doesn’t help me in this case. I think I’ll just hide the Others category and use a table to show all the values in addition to the line chart, unless you have any other suggestions I could try :sweat_smile:

Either way, thanks for all the help, I very much appreciate it!

I think Im out of ideas - apologies. I think your last suggestion is what I would go with :slight_smile:

1 Like