Resolving Issues with Scientific Notation in QuickSight When Introducing Glue and Athena as Intermediaries

I’m facing a challenging issue with our AWS data pipeline that affects data visualization in Amazon QuickSight. Initially, our setup involved directly importing CSV data from S3 into QuickSight, where the data was treated as strings and then transformed into DECIMAL for analysis. This configuration handled scientific notation (e.g., 4.0954924E7) correctly, displaying the data accurately in our visualizations.

Recently, we revised our pipeline to incorporate AWS Glue and Amazon Athena between S3 and QuickSight. Here’s the current flow:

    CSV files are stored in S3.
    Data is ingested into a Glue Table with columns set as strings.
    Data is queried through an Athena view, maintaining the string data type.
    Data is then loaded into QuickSight, where it is transformed from string to DECIMAL using the same conversion method as our previous setup.

Post-modification, we are experiencing issues where QuickSight fails to convert values in scientific notation correctly from the Athena output, resulting in these values being displayed as null in the visualizations. This problem did not occur when QuickSight directly accessed data from S3.

Has anyone else encountered this issue when introducing Glue and Athena as intermediary data processing stages? I’m looking for insights or suggestions on configurations or adjustments needed in Glue, Athena, or QuickSight to handle scientific notation as smoothly as in the direct S3 data source setup.

Thank you for any advice or guidance you can provide!

1 Like

Hello @m0ltar, is the string literally formatted like this 4.0954924E7 or is it 40954924? If it is the former, QuickSight likely can’t determine the value out of the box, and you would probably need to calculate the value and convert it within Athena. I am assuming the E7 portion is where QuickSight is throwing the error and doesn’t know to convert it to a decimal.

The string is literal that, yes.

And my point is that this string is being interpreted by QS correctly when the data source is CSV from S3. The scientific notation is being evaluated correctly and the full number is represented.

But when using Athena data source, all else being equal, the same format no longer works. And Athena reads and returns the same CSV files as strings too. We do all recasting at the transformation phase in the dataset.

Hello @m0ltar, that is really strange. I know QuickSight handles the ingestion of the 2 datasources differently, but there must be something happening between QS and Athena causing the error. Are you opposed to handling the transformation for the scientific notation within Athena? I found a Stack Overflow page that has information on converting it there. That might be your best bet because this seems like a weird limitation on the data ingestion phase.

Thank you for letting us know. I would recommend filing a case with AWS Support where we can dive into the details so that we can help you further. Here are the steps to open a support case. If your company has someone who manages your AWS account, you might not have direct access to AWS Support and will need to raise an internal ticket to your IT team or whomever manages your AWS account. They should be able to open an AWS Support case on your behalf. Hope this helps!

Hey @m0ltar I have opened this back up. Please feel free to add any comments to @DylanM’s post. Thanks!

Hello @m0ltar, I wanted to check in. I know you mentioned you had further questions on this. I am happy to assist further with more information.

1 Like

Yes, sorry for the delay. Had to switch context to something else.

Are you opposed to handling the transformation for the scientific notation within Athena?

Yes, unfortunately, this is not possible for us at the moment because we programmatically generate the whole data pipeline. The data comes from a third-party system (owned by Amazon.com), which is always in CSV. We don’t use Athena for any complex queries, mostly to just compact the data and union all of the CSVs.

Changing that to do transformations would be a big lift.

1 Like

Hello @m0ltar, I understand. I genuinely don’t think there is an good alternative option in regards to doing the task within QuickSight rather than Athena.

I suppose you could send the CSVs directly into QuickSight, and rely on joins within the QuickSight console, which I honestly wouldn’t recommend if this needs to be scalable. Unless there is a process to parition the CSVs in S3 through something like Lambda and Firehose where changes could be made prior to being joined in Athena queries, handling this in Athena is likely the only option.

1 Like

Btw, we have a support ticket open for 5 days already. No reply.


Also, to add more context to this matter, I think it might be a little unclear.

The dataset preview screen shows the numbers correctly, i.e. a correct number, in scientific notation. The data type is set to decimal. All is well there.

But once the dataset is loaded thru the analysis screen, the scientific notation number becomes null.

So, I can’t think of a case where this would not be a QS bug. The fact that the number is visible in the dataset preview means that it has already gone past all connection strategies (JDBC/ODBC) into the QS territory, and it is failing somewhere there.

1 Like

Hello @m0ltar, unfortunately, I have no control over how the support tickets are managed, but I can at least reach out to someone on my end to see if they are able to assist in moving this forward.

I also agree with you, I am pretty confident that this is a bug in QuickSight. If support doesn’t have a solution or a work-around they can implement right away, I believe managing this outside of QuickSight would be necessary if this is a time sensitive requirement.

I have no further ideas on my end regarding this issue. Do you mind if I archive this topic and we can manage the remaining communication on this through direct messages? I can also send you an update there if I get any information regarding your support ticket. Hopefully we can figure out a better solution to this bug. Thanks for your patience and for providing detailed information on the issue!

1 Like

Hello @m0ltar, since we are connected via direct message, I will archive this topic. Keep me posted on updates regarding your support ticket and if you are able to work out a solution. Thank you!