Possible rounding issue between QS-Athena

Hello,

I created a new dataset in QS, using a view from Athena as data source. What i notice is that the results in Athena, do not match the results of a QS (if a table is created).

Athena query:

select year_string as Year, (sum(coalesce(cast(cost as double),0))) as cost
from “xxxxxx”.“xxxxx”
group by 1
order by 1

cast has been used since the cost in the view is a string

Athena results

2022 = 2.1878670829298444E7
2023 = 3.1750393504131293E8
2024 = 8.339033675704077E7

QS results when creating a table

2022 = €21.878.655,26
2023 = €317.503.774,60
2024 = €83.390.252,64

Number are close but not exact. QS field is decimal float.

Any idea why, and how this can be handled?

Regards,
Nikolas

Hi @nikolasrad ,

It looks like you’re applying some formatting to the number to display it as a currency. How many decimals are you allowing?

To verify the accuracy of the two systems, I would start with the number in the dataset. What do you get when you see the numbers in the dataset? Any differences?

Then look at the aggregation in a table in the visual (the published dataset), but without any formatting and with maximum decimals. Are the numbers the same between systems?

That should convince you that Quicksight is only formatting based on your selections.

Hope that helps.

ws

Hello,

Thanks for the response.

The currency formating is applied for the UI (… → Show as → Currency) and does not really affect the rounding. Even if the decimal number set to custom and max there are still differences that i cannot explain.

below an another example from another level of the data:

Athena

query used with sum aggregation:

select period_string as month, sum((coalesce(cast(co.cost as double),0))) as cost
from “xxx”.“xxx” co
where co.year_string = ‘2024’
group by 1
order by 1

2024-01 = 423977.9101184828
2024-02 = 380954.24447858304
2024-03 = 434978.89085955115
2024-04 = 45261.07349052245

QS table (no formatting,max decimals)

table with values group by Year and field Cost(sum)

2024-01 = 423,977.61930000000
2024-02 = 380,953.97020000000
2024-03 = 434,978.54280000000
2024-04 = 45,260.99430000000

Regards,
Nikolas

Thanks @nikolasrad!

Those are good results and do show a difference! But it’s tough to say that one system is rounding. It looks more like there are missing values.

How much data is in a month? Is it feasible to take a month like 2/24 and display all of the values? That way you can check each value in 3 different spots, 1) Athena tables, 2) Quicksight Dataset and 3) Quicksight visual. And the first thing that I’d check is counts. It almost looks like some values are not being included; in all cases the values in the Quicksight table are LESS than the values from the Athena query.

I’m super curious where this is occurring.

ws

Hello @wstevens0,

Thanks for your responces and your interest to this.

Counts are exctly the same in Athena and QS dataset. I tried to dig deeper and starting checking individual accounts (rows) for a specific month. The weird finding is that there are records that match exactly to the latest decimal (between QS table graph and Athena) and some that diverge and make no sense to be rounding.

For example:

For a specific customer and account, for March 2023:

Athena result = 116375.66701950782
QS table = 116,375.667019508
… acceptable rounding.

For another customer and account, for March 2023:

Athena result = 107512.33690838634
QS table = 107,512.353921840
… not so close rounding.

Any thoughts on this?

Regards,
Nikolas

@nikolasrad - Thanks for posting your query. Can you please tell us if the underlying dataset is a SPICE based dataset or Direct Query Dataset? I know that for SPICE based dataset, QuickSight introduced decimal float datatype at the later part of last year (QuickSight launches FLOAT data type support for SPICE datasets).

Also, can you please tell if the field that you are looking in QuickSight is a direct field coming from the Dataset or it’s a calculated field?

Hello @sagmukhe,

The underlying dataset is spice based and the field is a decimal float type, coming directly from the dataset, not calculated field.

BR
Nikolas

@nikolasrad - Thanks for sharing the details. In that case, as @wstevens01 mentioned, this should not ideally happened given that the datatype is decimal float, which is being used. This anyway allows precision of 16 digits in total. Please try to follow step by step as @wstevens01 suggested and see where it is breaking. If everything looks in order (which seems to be the case), I would request you to open a support case with AWS. Here are the steps to open the ticket → Creating support cases and case management - AWS Support. Hope this helps!

Hi @nikolasrad ,

Now, I’m really curious!! This case definitely looks like a rounding issue, unlike the earlier aggregation that looked more like missing values.

Are you comfortable sharing just those values from the Athena side? No need for any dimensions, just the list of March 2023 values with full precision in a csv file. I’ll load these up and duplicate the issue and then see my peers have any thoughts.

Meanwhile, @sagmukhe is right. This is worth opening a support ticket on. The support team will dig into as well.

thanks!

ws

1 Like