Boxplot, calculation of outliers in Boxplot

Hello all,

We are creating Boxplots in QuickSight, and it is also possible to visualize "Outliers" in the Boxplots.
But when I calculate the outlier limits myself (by the usual definition, the MIN/MAX at 1.5 * IQR), the calculated values do not match what the Boxplot shows.

I could not find any information on how the outliers are calculated in the Boxplot.

Can someone help me with this?



Hi @Turbat1,

Understanding Boxplot Components in QuickSight:

  1. Median (Q2): The middle value of the dataset.
  2. First Quartile (Q1): The median of the lower half of the dataset (25th percentile).
  3. Third Quartile (Q3): The median of the upper half of the dataset (75th percentile).
  4. Interquartile Range (IQR): The range between Q1 and Q3 (IQR = Q3 - Q1).

Outlier Calculation:

Outliers in a boxplot are generally defined as data points that fall below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR. This means:

  • Lower Outlier Threshold: Q1 − 1.5 × IQR
  • Upper Outlier Threshold: Q3 + 1.5 × IQR
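
As a minimal sketch of the rule above, the following Python computes the two thresholds and flags the outliers. It assumes linear interpolation between data points for the quartiles (Python's `statistics.quantiles` with `method="inclusive"`); whether QuickSight uses the same quartile method is not confirmed here. The function name `tukey_fences` and the sample data are made up for illustration.

```python
from statistics import quantiles

def tukey_fences(values, k=1.5):
    """Return (q1, q3, lower, upper) for the k * IQR outlier rule.

    Assumption: quartiles use linear interpolation ("inclusive" method);
    QuickSight may compute its quartiles differently.
    """
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1, q3, q1 - k * iqr, q3 + k * iqr

# Made-up sample data with one obviously extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
q1, q3, lower, upper = tukey_fences(data)
outliers = [x for x in data if x < lower or x > upper]
print(q1, q3, lower, upper, outliers)
```

For this sample, Q1 = 3.25 and Q3 = 7.75, so the thresholds are -3.5 and 14.5, and only the value 50 is flagged.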

Discrepancy in Visualization and Calculation:

If you notice discrepancies between the calculated values and the visualized values in QuickSight, it could be due to several factors:

  1. Data Preparation: Ensure that the data used for manual calculations and the data used in QuickSight are identical. Any preprocessing steps might affect the results.
  2. Aggregation and Filters: QuickSight might apply certain filters or aggregations to the data before rendering the boxplot. Verify that the same filters and aggregations are used in your calculations.
  3. Outlier Handling: QuickSight might use a slightly different method or additional criteria for determining outliers, which might not be explicitly documented. This could include handling tied values or specific rules for small datasets.
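
To illustrate the third point, the sketch below computes the thresholds for the same made-up data under three common quartile conventions. Which convention the QuickSight boxplot uses is an assumption to test, not something confirmed here, and the helper names are hypothetical. As far as I know, the `percentile` calculated-field function is an alias for the discrete `percentileDisc`, while `percentileCont` interpolates, so the two can disagree on small datasets.

```python
import math
from statistics import quantiles

def nearest_rank(values, pct):
    """Discrete percentile: smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def fences(q1, q3, k=1.5):
    """Lower and upper outlier thresholds for the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Made-up sample data, not taken from the original question.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]

inc = quantiles(data, n=4, method="inclusive")  # interpolates, endpoints included
exc = quantiles(data, n=4, method="exclusive")  # interpolates, endpoints excluded

results = {
    "interpolated (inclusive)": fences(inc[0], inc[2]),
    "interpolated (exclusive)": fences(exc[0], exc[2]),
    "discrete (nearest rank)": fences(nearest_rank(data, 25), nearest_rank(data, 75)),
}
for name, (lower, upper) in results.items():
    print(f"{name}: lower={lower}, upper={upper}")
```

The three conventions give (-3.5, 14.5), (-5.5, 16.5), and (-4.5, 15.5) respectively. If the manual calculation and the chart use different conventions, the thresholds will not match even on identical data.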

Steps to Verify and Align Calculations:

  1. Check Data Consistency: Ensure the dataset in QuickSight matches the dataset used for manual calculations.
  2. Recreate Calculations in QuickSight:
     • Use calculated fields in QuickSight to compute Q1, Q3, IQR, and outlier thresholds.
     • Compare these calculated fields with the visualized boxplot values.
  3. Calculated Field Example:
     • To compute Q1:
         percentile({your_field}, 25)
     • To compute Q3:
         percentile({your_field}, 75)
     • To compute IQR:
         percentile({your_field}, 75) - percentile({your_field}, 25)
     • To compute lower and upper outlier thresholds:
         Q1 - 1.5 * IQR
         Q3 + 1.5 * IQR
  4. Validation:
     • Compare these computed values against the boxplot visualization to identify any differences.
     • Ensure that your calculations take into account the same dataset transformations as QuickSight (e.g., filtering, grouping).
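
One more thing worth checking during validation: in the conventional (Tukey) boxplot, the whisker ends are the most extreme data points that still lie inside the thresholds, not the thresholds themselves. The sketch below assumes QuickSight follows this convention (not confirmed here) and uses a hypothetical helper, `whisker_ends`, with made-up data to show how the two can differ.

```python
def whisker_ends(values, lower, upper):
    """Return the smallest and largest data points inside [lower, upper]."""
    inside = [x for x in values if lower <= x <= upper]
    return min(inside), max(inside)

# Made-up sample data; thresholds assume interpolated quartiles
# Q1 = 3.25 and Q3 = 7.75, i.e. IQR = 4.5.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
lower, upper = -3.5, 14.5
low_whisker, high_whisker = whisker_ends(data, lower, upper)
print(low_whisker, high_whisker)
```

Here the thresholds are -3.5 and 14.5, but the whiskers would end at 1 and 9, so the minimum and maximum read off the chart need not equal Q1 - 1.5 * IQR or Q3 + 1.5 * IQR.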

Additional Considerations:

  • QuickSight Documentation: Refer to QuickSight’s documentation and support resources for any specific details on their boxplot implementation.
  • Support: If discrepancies persist, consider reaching out to AWS Support. I would recommend filing a case with them so they can dive into the details and help you further. Here are the steps to open a support case. If your company has someone who manages your AWS account, you might not have direct access to AWS Support and will need to raise an internal ticket with your IT team or whoever manages your AWS account. They should be able to open an AWS Support case on your behalf. Hope this helps!

Did this solution work for you? I am marking this reply as "Solution," but let us know if this is not resolved. Thanks for posting your questions on the QuickSight Community! Here is also another good explanation from another post.