Error in Creating Box Plot with Calculated Field

Hi team,

I was wondering if anyone happened to know why I am getting an error when I try to use a calculated field to create a box plot. The calculated field created in QuickSight is SUM(volume) / SUM(hours). I have posted a screenshot of the error.

Hey @yipk !

Is the volume field a calculated field that you created during data preparation when you loaded your dataset into Quicksight?

If so that might be causing the error. What is the calculation that you ran to create volume?

Hi Duncan, thanks for the response!

Volume is not a calculated field, it is just a column in the data table.

If it helps, I’m posting my SQL code snippet below:

SELECT
a.balance_date
, a.sc
, b.benchmark_type
, a.volume
, a.hours
FROM table_name a
LEFT JOIN table_name2 b ON a.sc = b.sc
WHERE a.category = 'filtering_here'
1 Like

Hey @yipk, the syntax seems correct to me and your field hours looks like its the correct data type, so I’m not sure why it is throwing that error.

Do you have any filters on the box plot chart? Also, there are some limitations to box charts for LAC functions and MYSQL:

Hey @duncan , I don’t have any filters on the box plot chart, so I’m not sure what is causing the issue. I don’t really understand the concept of Level-Aware Calculations (LAC), could you explain this to me?

@yipk , no problem!

A LAC function allows you to run a calculation on your table at a specific level or partition. For example, if you didn’t want total sum of sales but sum of sales by country you could do the following:

sumOver(sum{sales}, [{country}])

*Important note: you have to do a partition by fields in your dataset.

I’m going to assume that those don’t exist in your analysis though. What is your datasource for this dataset?

1 Like

@duncan

I see, yes that is correct. In the backend query I am not doing any partitioning by fields. The datasource of this dataset is the a table that contains volume, hours, benchmark type, and sort center. I don’t believe there is any partitioning that is occurring in that data table.

It is weird that when I modify the query to do the calculated function, QuickSight is able to process it:

SELECT
a.balance_date
, a.sc
, b.benchmark_type
, SUM(a.volume) / NULLIF(SUM(a.hours),0) as TPH_adjusted
FROM table_name a
LEFT JOIN table_name2 b ON a.sc = b.sc
WHERE a.category = 'filtering_here'
GROUP BY 1, 2, 3

Is there any benefit to separating the volume and hours column, and then computing the calculated field of TPH in QuickSight vs. in the query?

1 Like

Hello @yipk !

I’m glad that worked for you! If your hours field was an integer rather than datetime data type then I’m not sure why it didn’t allow you to sum properly.

That being said there are ways to work around those issues with Ifelse statements and nested aggregations. I recommend checking out this post:

Hello everyone, thank you for all this information. My case is a bit different, but I would like to see if you can help me. I have a dataset that includes Year, Product Category, Product, and Value as columns. I want to use a Box Plot to show the total value (Product Category) by Year. I know I could try to use SumOver, but the point here is that the Box Plot does not accept aggregation functions in the metric—it considers the existing value in the column. I would solve this with a View in the database, but that is not an option. I need to resolve this with the dataset that is at the Product level.

image