User lifecycle analysis recommended approach

Dears,

I want to know the recommended method of visualizing lifecycle analysis for a mobile application.

To decide active, returned, and churned users, I relied on performing the calculations in Athena using SQL similar to below (full article: SQL for calculating Churn, Retention & Re-Engagement).

with monthly_usage as (
  select 
    who_identifier, 
    datediff(month, '1970-01-01', when_timestamp) as time_period
  from events
  where event = 'login' group by 1,2 order by 1,2),
 
lag_lead as (
  select who_identifier, time_period,
    lag(time_period,1) over (partition by who_identifier order by who_identifier, time_period),
    lead(time_period,1) over (partition by who_identifier order by who_identifier, time_period)
  from monthly_usage),
 
lag_lead_with_diffs as (
  select who_identifier, time_period, uses_outlook, lag, lead, 
    time_period-lag lag_size, 
    lead-time_period lead_size 
  from lag_lead),
 
calculated as (select time_period,
  case when lag is null then 'NEW'
     when lag_size = 1 then 'ACTIVE'
     when lag_size > 1 then 'RETURN'
  end as this_month_value,
 
  case when (lead_size > 1 OR lead_size IS NULL) then 'CHURN'
     else NULL
  end as next_month_churn,
 
  count(distinct who_identifier)
   from lag_lead_with_diffs
  group by 1,2,3)
 
select time_period, this_month_value, sum(count) 
  from calculated group by 1,2
union
select time_period+1, 'CHURN', count 
  from calculated where next_month_churn is not null
order by 1

I actually married the query above with a date dimension generation field using unnest (sequence), since some time intervals might be missing.

Following this I visualized in Quicksight using : 1- pivot tables (retention rate cohort analyis), 2- stacked bar chart (check percentage of each segment across time).

Do you think this is a correct approach? Is there any simpler way to do such analysis in QuickSight?

As a best practice you should try and put as much logic as you can into your SQL.

The reason for that is because if you use calculated fields like lags, and leads you need to also reference any of the partitions in the visual. This makes it limited because now every one of your visuals will need to have a reference to those fields, and sometimes you don’t want that.

You also can’t do certain SQL operations like unions in QuickSight.

Personally, I would recommend you keep it how you have.

Hi @ray1990. Did Max’s feedback help? We hope so. Let us know if you need anything else.

yes. Thanks Max for your answer. very helpful.

1 Like