Distinct Count not

Hi!

I have been stuck on this for a couple of days now, so I would appreciate any help. There several posts about distinct count but my problem seems unique.

I am analyzing sales for a two day sales event. I have 4 customer cohorts. All, Deal customers, new, and new + deal customers. And I want to use controls to cut the data in multiple ways. So I created the following parameters.

pg = biz unit and apb = brand.

I am using this formula to count customers:
Cust_count = distinct_count({customer_id})

I have a filter for each paramater. Then I have three additional filters for the cohorts. event participants, new custoemers and deal customers.

The calculation for total event and deal customer work fine, but it doesn’t work for counting the new customers:

event = distinct_countIf({customer_id}, {is_prime_day_cust} = 1)

deal customer = distinct_countIf({customer_id}, {is_event_cust} = 1 and {is_event_deal_cust} = 1)

new customers = distinct_countIf({customer_id}, {New Cust} = 1)

The result is about 3.5X higher than it should be.

Any thoughts?

Thanks,

Juan

Hi,

can you create a table with {New Cust} as the dimension and Cust_count as the metric?

If I understand your calculations correct, you should get two rows with 0 and 1 for {New Cust} and then the distinct counts of the customer-IDs.

Are these results too high, too? The sum of both the values should equal the count of all customers.

1 Like

Thanks!

This is what I see.

image

Left colum show the cust count adn the rigth colum shows the the unique count. If I run a query to check the numbers I should be 653,050 new customers.

I was trying different aggregation levels, so if I do this:

distinctCountOver({customer_id},[pg],PRE_AGG) I get 1.2M which is closer but still is twice as much as it should be. This tells me I might be missing something at the aggreation level.

Regards,

Juan

Hello @jjhernandezg - The calculation you are suing is going to give you the distinct count at the pg i.e. Business Unit level. Is this what you are looking for?

1 Like

Hi @sagmukhe !

I am doing an analysis on customer count looking at four cohorts all, deal customer, new and new deal customer. The query has customer ids by region, product group, category, sub_category, and so on.

I created parameters for each of the columns mentioned above to make the analysis dynamic. The filters are working because I see the counts go down, but they distinct count calculation is not accurate.

I know there are 653,050 new customers, so I don’t know why the distinct count is not working. If I use this formula distinct_count({customer_id}) and a new customer filter, I get 4.1M. This formula distinctCountOver({customer_id},[pg],PRE_AGG) returns 1.08M.
It seems that distinct count is double counting. It is not accurate even in a sheet with no parameters.

Regards,

Hi,
so in your raw data, you get the correct unique count?

Because for me it looks like, you have 1.6 million customer in your dataset.

Can you test another thing? create a new tab in your report, so we have no active filters. Create a visual ( KPI type) and insert {customer_id} as a value. As aggregation choose unique count.

without filters, another dimensions, etc. this shows us the complete unique count of the customers in your dataset.