CSV file with quotations around the fields

Stephen.Abbott-Capps · March 19, 2024, 8:42am

Hi,

I have some data sources I would like to ingest into QuickSight (via AWS Glue). They are in CSV format, but the field values are wrapped in quotes.

I have followed the "Best practices when using Athena with AWS Glue" page in the Amazon Athena documentation, but the results are the same: it recognises some fields as other data types (e.g. bigint) but does not recognise others, particularly date fields.

Is there a reliable way to have the system process the files as they are, or would it be better to pre-process them before ingestion to remove the quotation marks?

There will be new files every month, so I want it to be as automated as possible. (The data source is from a third party and cannot be changed.)

awsvig · March 19, 2024, 9:03am

Hi @Stephen.Abbott-Capps - There are a couple of options for managing the data types of the fields:

Pre-process the ingested CSV file before refreshing the QuickSight dataset.
Use a Lambda function to trigger the data refresh through the UpdateDataSet API, which lets you specify the data type of each column.
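
For the pre-processing option, here is a minimal Python sketch of what the clean-up step might look like. It only uses the standard library. The column name order_date and the incoming date format dd/mm/yyyy are assumptions for illustration, since the thread does not say what the third-party files contain; adjust both to match the real data.

```python
import csv
import io
from datetime import datetime

# Hypothetical values: replace with the real date columns and incoming format.
DATE_COLUMNS = {"order_date"}
SOURCE_DATE_FORMAT = "%d/%m/%Y"


def clean_csv(text: str) -> str:
    """Return CSV text with unnecessary quotes removed and dates in ISO format."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=reader.fieldnames,
        # Quote a field only when it contains a comma, a quote or a newline.
        quoting=csv.QUOTE_MINIMAL,
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        for col in DATE_COLUMNS:
            if row.get(col):
                # Rewrite the date as yyyy-mm-dd; blank values are left as-is.
                parsed = datetime.strptime(row[col], SOURCE_DATE_FORMAT)
                row[col] = parsed.date().isoformat()
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = '"id","order_date","note"\n"1","19/03/2024","a, b"\n'
    print(clean_csv(sample))
```

A function like this could run in a Lambda triggered when each monthly file lands in S3, writing the cleaned copy to the location Glue reads from. Fields that genuinely contain commas keep their quotes, so the output remains valid CSV.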

Stephen.Abbott-Capps · March 19, 2024, 10:29am

Thank you for your advice, appreciate the feedback.