How to fill in zeroes for dates with no values

Hello,

I have an Athena dataset with users who have an email as their userid, and a lastSessionDate field for the last time they have signed in. I am plotting distinct_count(email) vs datediff(lastSessionDate, now()) to visualize how long users are spending between signing into our application.
I am able to get this visualization working, but I can’t find a way to have the visual display zero for the dates where nobody signs in. I think I might be able to generate a generic series of dates (either in Athena or Quicksight itself) and then join them into the dataset to do this. Does anyone have any experience with this type of problem and would this be the correct way to approach this?

Hi ruiponte1990,

Can you test if creating a calculated field like this resolves your issue?

ifelse(lastSessionDate = NULL, 0,datediff(lastSessionDate, now()))

Hi Peter,

Thanks for replying but no that doesn’t make a difference. The problem is not that there are users without sessions, every user will have a last session date, even if it is null they are just showing as null because they have never signed in. The problem is the x axis which shows the datediff skips right over days where no one has signed in. I am trying to find a way to have it automatically fill these days with the number zero.

I was able to create another dataset with custom SQL that solved this issue.

select ds.* FROM ( WITH all_sessions AS
( WITH unique_dates AS
(SELECT version FROM {MY_TABLE_NAME} WHERE version != ‘latest’)
, session_data AS (
SELECT email, lastSessionDate FROM {MY_TABLE_NAME}
WHERE version = ‘latest’
)
SELECT
a.version,
b.email
FROM unique_dates a
LEFT OUTER JOIN session_data b on a.version = b.lastSessionDate
GROUP BY a.version, b.email
)
SELECT version, COUNT(DISTINCT email) AS userCount FROM all_sessions GROUP BY version ) ds

The version is updated to a new date every day (unless it is the latest). So by using this as the input to the datediff I was able to solve this problem. The left join will be null for version dates with no session dates, and a distinct count of NULL will be zero.

This is not ideal however because there are fields in the original dataset I still would like to pull in. So if anyone has a better idea please share!

you should be able to add the missing fields to your session_data select clause (add missing tables as needed). We have done that bringing both attribute (group by) and measures to the right side of the join.
You’ve actually done the hard part, just keep pushing the envelope, and you’ll get there! :money_mouth_face:

1 Like