Calculated field for repeat visitors who performed certain actions in the previous 90 days

Hi team,
I am working on a dashboard where I have a metric ‘Repeat visitors’ which follows the below conditions:

  1. If a vistor which has made a visit within our selected time range (suppose Jan-Feb 2024) AND
  2. If in a rolling 90 day period before the selected range (Oct-Dec 2023), the visitor has done the following:
    1. Made latest 2 visits to Page P of the website OR
    2. Made latest 3 visits to Page L of the website AND looked at at least 2 products

If 1 AND (2.1 OR 2.2) are satisfied then count the visitor as a repeat visitor otherwise not.

I know that we would need to utilize combinations of if else & distinct count over functions, however I am running into the Mismatched aggregation error. Additionally, the dataset I am working is 40M rows and is expected to grow over time. Using multiple count overs would massively increase the load time.
My current calculated fields:

date_filter_rolling90= ifelse({request_date} >= addDateTime(-90,'DD',{date_filter_start_date}) AND {request_date} <= addDateTime(-90,'DD',{date_filter_end_date}),1,0)

condition1: coalesce(distinct_countIf({visitor_id},{date_filter}=1),0)

condition 2.1:
coalesce(sumIf({number_of_visits},{date_filter_rolling90}=1 AND {page}='P'),0)

condition 2.2:

coalesce(sumIf({number_of_visits},{date_filter_rolling90}=1 AND {page}='L'),0)
coalesce(distinct_countIf({product_id},{date_filter_rolling90}=1),0)

Final condition is basically distinct count of visitor_id where condition1 is true AND either condition 2.1 or 2.2 is true. However, I am unable to put it together in the correct format. I wanted help in figuring out how (or if it is even feasible) to implement this in Quicksight?

Hi @Pankhuri,

In your condition 2.1, are you counting the number of visits to page P in the 90 day period? Why are you using sumIf to count? What does count_of_client_log_request_id represent? Is that the count per day?

If you’re also counting visits in your condition 2.2, why is the calculation based on a different field (number_of_visits instead of count_of_client_log_request_id)?

Would you be able to share a few rows of your dataset so that i can understand the granularity? I can’t tell if 1 row of your dataset represent 1 visit, 1 day or something else.

Hi David,
Apologies for the confusion. I have updated the field name to ‘number of visits’ in 2.1.
The dataset is pre-aggregated hence a sum operation is applied to the ‘number of visits’. The number of visits are aggregated at a date, visitor, product type, page type level.

This is a screenshot of a mockup of the data, I wasn’t able to figure out how to upload the file itself so adding a screenshot.

Hi @Pankhuri, besides using custom SQL – I’m out of ideas. Check out some of the functions. Also, try using the Amazon Q Assistant on the function documentation page.

Another approach is using Generative BI, you can use natural language prompts to create calculated fields within QuickSight, as shown in the following image. For more information about calculated fields in analyses, see Adding calculated fields.

In case you need further assistance with your problem, please create a sample dashboard with sample dataset showing your problem state using Arena and please create a new post, so we can look at it from a fresh perspective. (Details on using Arena can be found here - QuickSight Arena)