Pivot Table and Count_distinct error

Juan · October 3, 2023, 4:29pm

hello, i am dealing with a problem with the pivot table, basically the sum of the total usd is not working, when i apply count_distinct(company_id), the total does not take into account that distinction, and gives me the total of count(company_id), which contains duplicates of company_id, how would i fix this? i have a similar problem for a cohort calculation, and both have in common that the table does not have the same total as the company_id when i use count_distinct

here, for example, for September I have 2844 users with count_distinct, but the amount is equal to a common count, including duplicates, and the correct individual amount is 10.49 usd, calculation that if count_distinct works correctly, it should be 2844 * 10.49 = 29857.46, instead of the amount shown(47826.08)
please if anyone has an answer I hope you can comment, as I have checked up to central demo and have not found an answer.

duncan · October 4, 2023, 3:16pm

Hello @Juan !

What is the outcome that you are looking for?

To me it looks like the pivot table is working as expected. In your distinct count aggregation on the companyId field well, it is coalescing all orders for each Id and summing them, then adding that sum to the total of amount_in_local_currency.

For example, these two tables ultimately say the same thing:

CusotmerId - Sales
101 - 10
101 - 10
101- 10
301 -10
201 - 10

CustomerId -Sales
101 - 30
301 -10
201 -10

Count of Ids = 5
Distinct Count of Ids = 3
Total Sales = 50

Juan · October 4, 2023, 5:52pm

Hi @duncan! thanks for the reply
My goal is that for both September and August, the total dollars match the average price(por p1-1 in this case should be 10.49), because if you look at the tables, for the rest of the months, the amount in dollars decreases considerably, since you are correctly performing the operation distinct_count of companies * price of plan;
another thing, is that the total numbers, or total of rows, do not match with the total amount per plan and month, why is that?

duncan · October 20, 2023, 7:32pm

Hello @Juan , I’m sorry about the late reply!

Are you still running into this issue or were you able to find a work around? I’m still not sure i understand what the problem is.

For company_id are those unique values per row that you are running distinct_count on?

duncan · October 26, 2023, 2:48pm

Hello @Juan !

Are you still having trouble with this issue or were you able to find the solution?

Juan · October 26, 2023, 4:17pm

Hi @duncan, I was not able to solve the problem, I have the same problem in another analysis, and is that when I add fields, for this case plan and months_suscribed, when I see the total, or sum of rows, is not the same number as if I do it manually, I take the case of Sep 2023, if you add the field company_id(count_distinct) is not the same value as the one that delivers the image above in which I do not apply (count_distinct), and this affects also in amount_in_local_currency.
This problem I assume is because I have duplicate data for the company_id field? I don’t know how to solve so far

David_Wong · October 26, 2023, 4:28pm

Hi @Juan,

Instead of putting sum(amount_in_local_currency) in the Values field well, can you try creating a calculated field like this and use it instead? This should take care of the amounts introduced by
your duplicate company_ids.

sum(min(amount_in_local_currency, [company_id]))

If it doesn’t work, can you show a screenshot of your raw unaggregated data so that we can see where the duplication is coming from?

Juan · November 2, 2023, 2:23pm

Hi @duncan
Those company_id are not unique values per row, i had the same issue with a cohort analisis, both show diferent results when applying count vs count_distinct, and the problem is when you sum individually the count_distinct table, or applying filter, the values are diferent