Import failing

I have an S3 folder with data in it, and am accessing it in quicksight using Athena. It is a large amount of data, and I keep getting this SQL timeout message.

"SQL_EXCEPTION Learn More

This is a general SQL error. This can be caused by query timeouts, resource constraints, unexpected DDL alterations before or during a query, and other database errors. Check your database settings and your query, and try again.

Error Details: Query timeout"

I’m generating the data using AWS glue, and importing it into S3. How can I go about getting this data into quicksight to avoid this error?

Hi @ineedqshelp

Partition Your Data: Ensure your Glue ETL job is partitioning the data effectively (e.g., by date, category, or another relevant key). You can check your partitions using:

SHOW PARTITIONS your_table_name;

Use Parquet/ORC Instead of CSV/JSON: Parquet and ORC are columnar formats that speed up queries. Convert your Glue job output to Parquet.

Please refer to the below documentations this might be helpful for you.

Thank you for the quick response, I’ll look at this and give an update.

Does the compression type matter, is snappy okay?

Hi @ineedqshelp

Yes, you can enable compression. Use Snappy or Zstd compression.

Snappy and Zstd (Zstandard) are commonly used compression formats for optimizing query performance in Amazon Athena.

Compression reduces the size of the data stored in S3, which helps Athena read less data, improving query performance and reducing costs.

Snappy: Fast compression/decompression, widely used with Parquet and ORC.
Zstd (Zstandard): Higher compression ratio than Snappy, but slightly slower decompression. Recommended for large datasets.

Please refer to the below documentation this might be helpful for you.

I tried using snappy with Parquet and snappy, and I’m still getting the import failed message, so I will try with zstd.