DocumentDBConnector schema inference and Glue tables with camelCase column names

anunez-nexben · April 6, 2023, 4:31pm

I was able to use the AthenaDocumentDBConnector successfully to visualize MongoDB data in Quicksight.

However, in some cases, the connector’s schema inference is not able to find all the existing columns. I believe it’s because it only scans 10 documents.

So my next step was to supplement the schema with a Glue Table, since the connector says it supports that.

The problem I have now, is that I’m struggling to make it work with the camelCase field names in our MongoDB. Glue seems to prefer all lowercase column names. So in almost every way I create the Glue table, all column names get converted to lowercase. For example, if one of the field names is updatedAt, Glue converts it to updatedat. And ultimately, when trying to execute queries, it results in “Column not found” kind of errors or it just doesn’t pull in the data.

Glue does respect the camelCase if I create the Table by using an existing schema from the Glue Schema Registry. And the Glue Table looks great. But querying through Athena or QuickSight results in “No columns found” kind of errors.

Sorry that was a bit long winded - it’s been an interesting adventure. Any suggestions or anything I might be missing to make this general setup work? At this point, my next step would be to pivot to a heavier setup with ETL jobs, transforms, and duplicating data to S3 or something.

Sanjeeb2022 · April 6, 2023, 6:14pm

Hi @anunez-nexben - First of all “Welcome to AWS QuickSight Community” and this is a very interesting question for me No SQL DB data is always a painful in terms of etl and presenting the data in Visualization tool. The issue is all about Glue is not able to identify the column name due to camecase ( or it is case sensitive). If you are able to query the data via Athena ( using Glue catalog), i believe the issue will be resolved from QuickSight level as well.

Can you please submit a AWS case in Glue category ( to raise a ticket to AWS use the link-Creating support cases and case management - AWS Support) and Glue experts can help you in fixing the issue. If you do not have direct access to raise a ticket, you can request your aws admin to submit a ticket for you.

In worst case if the above issue is not fixed, you are correct, you have to unload the data to S3, create an athena table and point athena table to quicksight.

Tagging QuickSight experts for their feedback.
@Max @Kristin @Bhasi_Mehta @thomask @Thomas @Tatyana_Yakushev @Biswajit_1993 @David_Wong

Regards - San

anunez-nexben · April 11, 2023, 4:29am

Hi @Sanjeeb2022, thank you for the reply. I’ll try to connect with the AWS Glue team and see what I can learn.