Placing a filter on a visual has the effect of excluding non-matching records in an full outer joined data set

I have a joined data set from two SPICE data sets, which I’ll call DelayedTrips and AllTrips. AllTrips is the larger data set, with a grain of one row per trip. DelayedTrips includes a subset of those same trips, for which one or more events during the trip caused a delay. The grain of this data is one row per event. These two data sets are joined via a single primary key, TransportID.

One of the calculations I need to do is “percent of delayed trips by all trips” which requires counting the unique number of delayed trips as the numerator, and the number of trips overall as the denominator. This works fine until I apply a filter on number of events per delayed trip. The purpose of this filter is to answer questions like, “What is the percentage of trips that specifically had 3 or more events over all trips?”

The challenge is that this filter apparently has the effect of removing all the unmatched rows from the AllTrips side of the data set, and calculating the denominator ONLY on the subset of rows that have a delay event.

I have attempted to use level aware calculations to safeguard my denominator from that change, but it appears that the filter overrides those calculations.

Is this the function as designed for filters, and is there any way to overcome it?

Thanks

With Level Aware Calculations you have the option of using the Aggregate functions or the Windows functions. With the Window functions you can instruct to do the calculation with the PRE_FILTER option, which does all computation of the metric before applying any filter. With the metric already calculated, then the Analysis applies all the filters showing only the results you are seeking.

You can see more about the LAC-W (Level Aware Calculation Window) functions in this link: Using level-aware calculations in Amazon QuickSight - Amazon QuickSight

And you can see how the order of evaluation of PRE_FILTER and PRE_AGG options affect the flow of how calculations are done, in which moment, and where the filters are applied: Order of evaluation in Amazon QuickSight - Amazon QuickSight

1 Like

@SantiagoA I have tried that (as noted in the original description). Essentially I need to resolve the denominator (count of all trips) in order to compare with the numerator (count of delayed trips). Because this data set is joined, and the set of delayed trips is far smaller than the set of all trips, there are a lot of the “all-trips” rows that have null fields in the “delayed trips” columns. So, for the denominator I used the following syntax at your suggestion:

sum(distinctCountOver({Transport Request ID},[{Arrival Time Combined}],PRE_FILTER))

“Transport Request ID” is the unique identifier for “all-trips”.

What happens is that once I apply the filter on the visual that uses this numerator / denominator calculation, the effect is that every trip record that has null “delayed-trips” column values is ignored. Therefore the denominator is now exactly the size of the numerator, so every calculated value is 100%.

Have you see this happen before?

I would be very interested in demoing the issue I’m seeing if someone can be available.

Thanks!

Hi - that sounds great. Please contact me at paul.t.fields@kp.org