Sum of unique max?

We are using Quicksight to summarize time series data, and we’re running into an issue that seems conceptually straightforward, but is proving difficult to model.

For the dataset in question, some columns contain time series data values and some columns contain time-invariant metadata values:

  • Building name
  • Area (square feet)
  • Month (timestamp)
  • Expense category
  • Cost

Building name and area don’t change over time; operating expense does. The issue is that even though building size is a numeric value, it doesn’t sum up the same way that operating expense does. For example, a building that spends $1m per month spends $12m per year. But if the same building is 1m square feet in size, it isn’t 12m square feet per year. Size is invariant.

This causes problems when we mix these different numeric values together in equations. For example, we want to calculate operating cost per square foot. The simple way to do this is to create a calculated field:

{ cost } / { area }

This mostly works fine in a pivot table, except that the totals are wrong. The individual values get added together, when really what we want is a weighted average.

Alternatively, we can define the following calculated field:

sum({ cost }) / max({ area })

This also doesn’t work. Again, the individual values are fine, but the totals are wrong, because total costs are divided by the single largest building area. Again, what we really want is a weighted average. Something like this:

sum({ cost }) / sum({ area for each unique building in the set })

It’s that last bit that I can’t figure out how to calculate. I have tried various combinations of maxOver and countOver to get the unique set of building areas, but nothing has worked so far.

One nuance to highlight: there is more than one expense category for each building, which means, for example, that there will be multiple rows for Building A for January 2022. This seems to complicate the problem, because we want Building A’s area to be calculated once and only once for each calendar month, and so on for each building in the set.

Any advice?

To complicate matters, what if one adds an addition to one of your buildings 10 year after it was built. Do you need to keep track of the area before the addition and the area after you receive the occupancy permit? Do you need to keep track of the cost upto the date of the addition and a different cost after the addition was completed?
I too have a similar issues with a different use case involving time. Somewhat related it the balance of the account as of certain date or involving a date range (with a date range, you want the last value in the date range, not the sum of the account balances for each day in the date range).

That is an issue, but I’ve actually simplified the example case in a number of ways because for the time being I’m really hoping to better understand how to use aggregate functions in situations like this. Once I get the base case sorted out, hopefully I can elaborate it.

Sorry if you’ve tried this before but can you do

sum(sumOver(cost,[building name],PRE_AGG)) / sum(sumOver(area,[building name],PRE_AGG)) ?