Finding solutions for mismatched aggregation errors in QuickSight calculations

Finding solutions for mismatched aggregation errors in QuickSight calculations

Mismatched aggregation errors are common QuickSight calculation errors reported by users. This error occurs when the calculated field contains both aggregated and non-aggregated (row-level) values. For users new to QuickSight, it’s not always apparent what causes this error. It is important to understand what is aggregated vs non-aggregated from a QuickSight perspective. This article explores common cases where these errors are found, why they occur in the given situation and how to successfully create the desired calculation.

The examples in this article utilize the dataset used in the QuickSight Author Workshop: SAAS-sales.csv. (You can download the dataset and complete the examples shown below.) This data set represents sales data from a fictitious Software as a Service (SaaS) company that sells sales and marketing software to other businesses. Each row of data is one transaction/order. A number of attributes are included in the file such as industry, segment, region, subregion, country and product.

Common Case - #1: Mismatched aggregation error using ifelse()

The goal of this example is to create a calculated field that has the sum of sales for each customer for a month after their initial order. The following calculations are assumed to be present in the analysis. Level-aware calculations allow the level of granularity for the computation or aggregation to be specified. Please see this documentation for more detail on level-aware calculations.

First Transaction Date = minOver({Order Date}, [Customer], PRE_AGG)
First Trans Month End = addDateTime(1,'MM',{First Transaction Date})

A first attempt at creating the calculation is shown below.

Sales First Month as Customer =
ifelse({Order Date}>{First Transaction Date} and {Order Date} <={First Trans Month End}, sum({Sales}),0)

However, this expression returns a mismatched aggregation error because the ifelse() is evaluating at row level (not aggregated) and you are asking the result to be aggregated (sum across rows of data). By moving the ifelse() inside the sum aggregation, the ifelse() evaluation takes place at the row level and then the result is aggregated using the sum() function.

raw data sales prior to creating calc sales month after

The screenshot highlights in green the rows that meet the Sales First Month as Customer criteria for AnyCompany Insurance. The first step is to evaluate OrderDate for the first month date range. Secondly, for the records that return true, sum the Sales value. Therefore, you need to place the ifelse() inside the sum function.

For AnyCompany Insurance, the correct Sales First Month as Customer value of $3873.64.

The correct way to write the expression is:

Sales First Month as Customer =
sum(ifelse({Order Date}>{First Transaction Date} and {Order Date} <={First Trans Month End}, {Sales},0))

The screenshot shows the Sales First Month after Customer calculation. The result shows many customers do not make purchases in the 30 days after their initial order.

Common Case - #2 – Mismatched Aggregation Error in Division when using LAC

Now, it’s time to explore profit. Calculate the profit % of total sales for a given row of data. You need to take the sum of profit on a given row divided by the overall total sales.

A first attempt at creating the calculation is shown below:

Profit % of Total Sales = sum(Profit)/sum({Sales},[])

This results in a mismatched aggregation error again. When you put the square brackets inside the aggregation it turns it into a level-aware calculation (LAC), which results in a field that is not aggregated from a QuickSight calculation perspective. Therefore, you need aggregate the denominator again to match the numerator. For more information on level-aware calculations, please see this article.

The correct way to write the expression is:

Profit % of Total Sales = sum(Profit)/sum(sum({Sales},[]))

In the screenshot, the correct Profit % calculation is displayed based upon total overall sales.

Common Case #3 – Mismatched Aggregation Error using Dimension Fields in Expression

The result of this example will evaluate an expression to use a different aggregated calculation based upon a dimension in the data. The following calculations are assumed to be present in the analysis.

AMER Sales % Enterprise = sumIf(Sales, Region='AMER' and Segment='Enterprise')/sumIf(Sales,Segment='Enterprise')

APJ Sales % Enterprise = sumIf(Sales, Region='APJ' and Segment='Enterprise')/sumIf(Sales,Segment='Enterprise')

EMEA Sales % Enterprise = sumIf(Sales, Region='EMEA' and Segment='Enterprise')/sumIf(Sales,Segment='Enterprise')

Next, use the appropriate % Enterprise calculation in the expression based upon a particular industry. If industry is Communications you want to see the % Enterprise calculation for APJ, all others you want to see the AMER calculation.

A first attempt at creating the calculation is shown below:

Industry % Enterprise Region= ifelse({Industry}='Communications',{APJ Sales % Enterprise},{AMER Sales % Enterprise})

But, the mismatched aggregation error occurs because both the calculations used in the expression are aggregated and you are trying to evaluate industry at the row level (not aggregated). In order to workaround this, one option is to create a calculated field that can be used as a measure for the industry dimension.

Industry Measure = ifelse(Industry='Communications',1,Industry='Consumer Products',2,Industry='Finance',3,Industry='Healthcare',4,5)

Now, use Industry Measure in the Industry % Enterprise Region expression for evaluation and alleviate the mismatched aggregation error.

Industry % Enterprise Region= ifelse(min({Industry Measure})=1,{APJ Sales % Enterprise},{AMER Sales % Enterprise})

In the screenshot above, the table on the right shows the Industry % Enterprise Region value for Communications is .09 (the value for APJ). The other values represent AMER as directed by the calculation expression.

Conclusion

This article explored common cases for mismatched aggregation errors in QuickSight calculations, why they occur and the solutions. By understanding these common cases, QuickSight analysts should be better equipped to successfully create complex calculations in their environment.

Author Kellie Burton

Article Disclaimer: Any code shared in Community articles is to be considered a sample and is not endorsed by AWS.

1 Like