I was able to use the AthenaDocumentDBConnector successfully to visualize MongoDB data in Quicksight.
However, in some cases, the connector’s schema inference is not able to find all the existing columns. I believe it’s because it only scans 10 documents.
So my next step was to supplement the schema with a Glue Table, since the connector says it supports that.
The problem I have now, is that I’m struggling to make it work with the camelCase field names in our MongoDB. Glue seems to prefer all lowercase column names. So in almost every way I create the Glue table, all column names get converted to lowercase. For example, if one of the field names is updatedAt
, Glue converts it to updatedat
. And ultimately, when trying to execute queries, it results in “Column not found” kind of errors or it just doesn’t pull in the data.
Glue does respect the camelCase if I create the Table by using an existing schema from the Glue Schema Registry. And the Glue Table looks great. But querying through Athena or QuickSight results in “No columns found” kind of errors.
Sorry that was a bit long winded - it’s been an interesting adventure. Any suggestions or anything I might be missing to make this general setup work? At this point, my next step would be to pivot to a heavier setup with ETL jobs, transforms, and duplicating data to S3 or something.