Trouble with LAC, average of group average

I am getting wrong answers from a calculation and I’m not sure why. I have data at a user-transaction level, but it includes some information about the stores the users are visiting. Those store-level fields are then just repeated/the same for every store. So we might have 1000 people who visit store 1, and each of those 1000 records all have different attributes of the transaction, but there are a few fields that just describe store 1 - so like it’s opening date, for example.

I’m trying to calculate the average days between two of those dates about the stores, but I want to do the average at the store level. So I want to say “store 1 was approved on Monday, then opened on Wednesday… so it had a 2 day gap.” Then “store 2 was approved on Monday and opened on Friday, so it had a 4 day gap.” So then I want the “average store gap” to be 2, in this contrived case.

To do this, I’m able to make a calculation that gets the gap by store. That looks like this:

avgover(dateDiff({store_approved}, {store_opened}, "DD")), [{store_id}], PRE_FILTER)

I think this works because, again, the data are actually at the user-transaction level. So I am taking the average because all those “store approved” and “store opened” values are the same for any given “store id.”

But my problem is coming at the next step. When I go to take an average of this measure, it produces a wrong answer. I think the reason it’s wrong is that it’s not actually pumping the aggregated value into the next measure. So like if I were to do:

avg(avgover(dateDiff({store_approved}, {store_opened}, "DD")), [{store_id}], PRE_FILTER))

I get a wrong value. I can see why if I change that “avg” to “sum”… I get a huge number - basically, I think, the sum of all those “gap” calculations across all the user-transactions for that store. Instead, I want it to be, from my example, 6… the sum of 4 and 2. It seems like the “avgover” calculation isn’t outputting data at the “store_id” level, like I’d expect, and is instead keeping the data at the user-transaction level.

Is there some way around this or something I’m missing here?

Hello @toomanypuppies, if the first function is working properly for the approved to opened gap for each store, what you could do to get the total average is run the avgOver function with the partition by brackets empty. It would look something like this:
avgover(dateDiff({store_approved}, {store_opened}, "DD")), [], PRE_FILTER)

That would theoretically average per transaction, but since it is the same for all transactions in a single store, that should give you the average of all stores. I hope that helps!

Thanks for looking at this @DylanM. Maybe I’m missing something, but it seems like that doesn’t work.

Here’s an example:

image

I’ve got the formula you shared in the TEST column. I think that’s now reporting the average gap across all transactions.

Here, the expected value for TEST Gap is 549.2. That’s (1+10+38+826+1871) / 5. Note that I’ve filtered this example here so they’re all in the same month. But I am really looking for it to get the right average of those day gaps for each month.

Hello @toomanypuppies, ah I see. The issue here is that the number of transactions for each grouping is going to manipulate the average. Since there are significantly more rows of data for the gaps of 1, 10, and 38 it is greatly reducing the average.

I honestly think the simplest way to manage this would be to make a new dataset where your data is grouped by store_id instead of transaction_id. You would just need a row for each of your stores, with the Approved Date, Opened Date, and you could add a count of transactions to keep your table as it is. Then the 2 functions you are using now would give you the Day Gap and test gap how you are expecting.

1 Like