AVERAGE and MEDIAN of distinct count per user?

Hello,

I have the following table (4 users, 7 properties, 4 mortgages)

How can I calculate the AVERAGE and MEDIAN of distinct properties per a user?

For example I want to get the following results:

AVG = (2+1+1+3)/4 = 1.75
MEDIAN … 1/1/2/3 = 1.5

User_id | Property_id | Mortgage_id | Property_count
U123 | P1 | M1 | 2
U123 | P1 | M2 | 2
U123 | P2 | M3 | 2
U124 | P3 | M4 | 1
U125 | P4 | NULL | 1
U126 | P5 | NULL | 3
U126 | P6 | NULL | 3
U126 | P7 | NULL | 3

I was trying to have smth like the below but it doesn’t work (It says error in calculations)

avgOver
(
sum(distinct_count({Property_id})), [User_id]
)

Hi @Vadym, are you using amazonbi account or other public preview account? We are rolling out a feature called Level-Aware-Calculation should be able to solve your problem. The functions will be available to all users soon around end of June. If you are able to use public preview, you can try the following functions: distinct_count({property_id},[{user_id}]), and drag that calculated field to the visual and select “average” as visual aggreagation. I re-produced your problem using a different dataset below, FYI (replacing industry by user, and property by order)

Hi @emilyzhu , thanks for your response. Unfortunately it doesn’t work or i misunderstood the idea. Please see screenshots

This is the calculated formula i used

Hi @Vadym, can you share the calculation formula for “id[properties_PROD]”?

sorry, id[properties_PROD] is not the calculation formula, basically this is just a bad name in the real table. in my initial example this is just Property_Id, so just a unique ID per a property

so in my original table

id = User_Id
id[properties_PROD] = Property_Id

does it makes sense?

the best I can go is to use

distinctCountOver({id[properties_PROD]}, [id], PRE_AGG)

to count distinct PROPERTIES per USER

and then see MEDIAN and AVG here

But the issue is that this returns MEDIAN from all distinct properties not from distinct users :frowning:

I also was trying to Google and i found 2 similar unresolved issues from other QuickSight users

Interestingly i found a similar and resolved problem but from Power BI users :grin:

Hi @Vadym, sorry for the later reply.

We just launched the Level Aware Calculation feature for all users. Can you try the approach above again, using this function distinct_count({id[properties_PROD]},[{id}] )? I suspect the feature was not enabled for you at that time.

thanks @emilyzhu i will try this again and respond next week. meanwhile i have found a solution. for everyone who encounters this issue you can use this tutorial i drafted in Notion. also would be great if you can verify it @emilyzhu

@Vadym, That’s awesome! Your approach is correct, it uses the previous available functions- since distinctcountover creates duplication, you have to use rank and ifelse to de-dup the results. That is exactly why we are launching this new LAC feature. You can now directly use distinct_count(measure, [level]) to get the correct number at the [level] dimension, no need to de-dup anymore.
Please try the new approach and let us know how you think about it. Thanks!!

@emilyzhu i have just tried your solution, unfortunately this does not display any values at all when select MEDIAN

The below i did:

Solution 1 (the best) - ifelse(rank)

ifelse(

rank([{id[Properties]} ASC], [id], PRE_AGG) = 1,

{… Distinct Properties Per User},

NULL

)

Solution 2 (initial with duplicates) - distinctCountOver

distinctCountOver({id[Properties]}, [id], PRE_FILTER)

Solution 3 (yours) - distinct_count

distinct_count({id[Properties]}, [id])

You can see the result in the following picture

Hi @Vadym - Can you please help me calculate median of sales_amt in a year/week (ignoring duplicate entries for each sales_id and sales_amt.

image. I tried Notion – The all-in-one workspace for your notes, tasks, wikis, and databases. but looks like i am getting unexpected result. Not sure if i am missing something.

I am getting below unexpected result - I tried both avg/median and in both cases i am 1,1 instead of


125, 125

Calculated_Fields

  1. Distinct Count ==== distinctCountOver({sales_amt},[{sales_id}],PRE_FILTER)
  2. Ifelse Rank ====
    ifelse(
    rank([{sales_amt} ASC], [{sales_id}], PRE_AGG) = 1,
    {Distinct Count},
    NULL
    )