Date filter not always including nulls

Hello,

Experiencing some strange behaviour with a full join and filtering by date.

I’m trying to show % active users for the period selected. E.g. if there are 100 users, and 50 of them have logged in within the selected date range, the % active users would be 50%. There are two datasets, “users-anon” and “login records” that are joined with a full join (this is so that we can get the total count of all users).

This is the calculated field for % active users (created in the analysis):

(
    distinct_countIf({id[users-anon]}, isNotNull({login date})) / distinct_count({id[users-anon]})
)

On the analysis I have a filter on login date with “null options” set to include nulls. I’m assuming this means users with no login will still be included in the filter?

Here is a dashboard to demonstrate the issue: Active user

There are 348 users in the dataset and 75k logins. Some of those logins will be from users not existing in the dataset, and some of those users will have no logins at all.

If you clear the filter, total users shows 348. If you set a filter, the total users goes down. However, it should remain 348, as the filter has “include nulls” selected, so all users should be included. Unless I have misunderstood this null option functionality?

If working as expected, is there any workaround, or any other way of getting the active user calculation required here?

I’ve added all the data to a table in the analysis and exported it to Excel. I can see that some users are included with null dates (no logins), but some aren’t, and I can’t figure out why…

Thanks in advance.
Sam

Hi @samellis, besides using custom SQL – I’m out of ideas. Sorry, currently this is not possible but I’m marking this for feature request. At AWS, our roadmap is primarily driven by our customers. Your feedback helps us build a better service. I have tagged this as a feature request. More features are being added on a regular basis, so please keep an eye on the What’s New / Blog . You can set up a Watching Alert by clicking on the bell icon.

Potential Workaround

  • Pre-aggregate the Data:
    • Create a pre-aggregated dataset where you count the number of logins per user before joining it with the users dataset. This ensures that all users are accounted for.

Example SQL for Pre-aggregation

Pseudocode (Syntax may vary)

SELECT
    u.id AS user_id,
    COUNT(l.login_date) AS login_count
FROM
    users_anon u
LEFT JOIN
    login_records l
ON
    u.id = l.user_id
GROUP BY
    u.id