Dynamic percentile calculation using parameter which will remove top and bottom outliers

Xclipse · August 4, 2024, 12:32pm

Welcome to the community! Thanks for your question!

In Amazon QuickSight, dealing with percentile calculations and filtering outliers while avoiding mismatched aggregation errors can be challenging, especially when working with a pivot table with hierarchical data. Let’s break down the solution step by step.

Step 1: Calculate Percentiles

First, let’s calculate the percentiles for both top and bottom outliers using percentileDiscOver. This approach will ensure the correct aggregation within the hierarchy.

Bottom Outliers Percentile Calculation: bottom_percentile = percentileDiscOver(measure, ${Param1}, [${visualDimensions}])
Top Outliers Percentile Calculation: top_percentile = percentileDiscOver(measure, ${param2}, [${visualDimensions}])

Step 2: Filter Condition

Next, we need to filter the data based on the calculated percentiles without causing aggregation issues.

Filter Condition Calculation: Create a calculated field for the filter condition: filter_condition = ifelse(measure > {top_percentile} AND measure < {bottom_percentile}, 1, 0)

Step 3: Apply Filter

To apply this filter to your data, follow these steps:

Add a new calculated field to your dataset with the above filter condition logic.
Use this calculated field to filter your pivot table:

Go to the Filters pane.
Add a filter for the newly created “filter_condition” field.
Set the filter to include only the rows where “filter_condition” equals 1.

Step 4: Adjusting for Hierarchical Data

If you have a pivot table with a hierarchy and you notice constant percentile values following across the table, you may need to ensure that your percentile calculations respect the hierarchy levels.

Ensure Correct Aggregation Level: When creating the calculated fields for percentiles, make sure you include all necessary hierarchical dimensions in the [${visualDimensions}] parameter.

Example:

Here is an example putting it all together:

Percentile Calculation:

bottom_percentile = percentileDiscOver(measure, ${Param1}, [dimension1, dimension2, dimension3])
top_percentile = percentileDiscOver(measure, ${param2}, [dimension1, dimension2, dimension3])

Filter Condition:

filter_condition = ifelse(measure > top_percentile AND measure < bottom_percentile, 1, 0)

Apply Filter in Pivot Table:

Create a calculated field for “filter_condition”.
Add a filter in the analysis to include only rows where “filter_condition” equals 1.

By following these steps, you should be able to dynamically calculate percentiles and filter outliers in Amazon QuickSight without running into mismatched aggregation errors and ensuring correct values across your hierarchical pivot table.