Hello everyone,
I’m working with a dataset containing transactions in a comercial sector, every row is a transaction with a userid, ticket_amount, transaction_date, etc. I’m trying to obtain the anual median expenditure by user.
For this, first I calculate the sum of ticket amounts grouped by userid and year:
1- anual_user _spend=sumOver({ticket_amount},[year, userid], PRE_AGG)
And then I either represent it in a visual against year
and aggregating it as median
Or I directly calculate it in a new field:
2-median_anual_spend=median(sumOver({ticket_amount}, [year, userid], PRE_AGG), [year])
I have data for 2022, 2023 and 2024, no matter how I calculate it (option 1 or 2). I always get values that are 3.46 times higher than I should (for each year) (based on working on the same dataset with pandas)
For example:
Quicksight
2022-----548,34€
2023-----618,17€
Calculated elsewhere
2022-----157.61€
2023-----178.68€
I have also made a table in quicksight with fields userid, year, anual_user _spend and median_anual_spend, exported it, and performed:
df.groupby(‘year’)[‘anual_user_spend’].median()
And obtained the correct value (quicksight result/3,46)
I don’t know what else to do to get the correct values