SPICE incremental refresh on int partition column in Athena

Is there anyway to get the incremental refresh for SPICE to make use of an integer partition column rather than have to use a date column which could trigger a full table scan?

We have multiple tables defined in Athena which are partitioned by a column of datatype int (which is derived from the date in the format YYYYMMDD).

I don’t remember the exact reasons we used an int for the partition column, I know the Athena documentation states the following:

“For performance reasons, it is preferable to use STRING as the data type for partition keys. Even though Athena recognizes partition values in the format YYYY-MM-DD as dates when you use the DATE type, this can lead to poor performance. For this reason, we recommend that you use the STRING data type for partition keys instead.”.

And I think there was some other tool such as Impala or Hive that had recommended int.

The current tables work well for us and have a few years worth of data.

We are new to QuickSight so this wasn’t an issue for us before.

1 Like

Hello @NiallF, welcome to the QuickSight community!

Unfortunately, with the current functionality available in QuickSight in regards to refreshing datasets stored in SPICE, the only way to run an incremental refresh is by utilizing a date field. Would it be possible for you to add that field type into your Athena table? If so, you should be able to set up incremental refreshes accordingly.

Let me know if this helps!

Hi Dylan, thanks for your reply.
I might have found a solution that works for me.
I now created the Athena table but specified the partition column as a string instead of an int (the values are not in the correct format to specify it as a date).
Then in QuickSight, after creating the dataset, I changed the partition column datatype in QuickSight to be a date.
QuickSight gave an error/warning that the value was not in a standard supported date format, but it gave me the option to specify the format, for which I entered YYYYMMDD.
This then appeared to work correctly- the values in the preview are correct and I was able to use the column in the settings for the incremental refresh.
I will do some more testing/verification on it.

(As a side note, when I tried to change the datatype in QuickSight to date back when the under lying Athena table has it set to int, the preview displayed the wrong date values so I guess it was assuming it was a unix timestamp or some thing.)

Thanks again for your reply.

1 Like

Hello @NiallF, that is great! I am glad you were able to fix this issue. I will mark your response as the solution, but if you run into a problem while testing, please let me know. Thank you!