SPICE vs Athena cost considerations

I'm trying to optimize the cost of a QuickSight solution.
I need to create some complex datasets from certain input datasets.

Option 1: Create the output dataset using a direct query against the Athena data source.
Pros:

  • Can easily write complex queries with aggregations, and the dataset will contain the entire business logic.

Cons:

  • Athena charges per query, based on the amount of data scanned.
  • If the input datasets need to be in SPICE anyway, the data is duplicated.
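To put a rough number on the Athena con, here is a minimal sketch of a per-query and per-month estimate. It assumes Athena bills on bytes scanned with a 10 MB minimum per query; the $5 per TB default is only a placeholder, so check the current price for your region. The function names and parameters are hypothetical, not part of any AWS SDK.

```python
TB = 1024 ** 4          # bytes in one terabyte
MIN_BILLED = 10 * 1024 ** 2  # assumed 10 MB minimum billed per query


def athena_query_cost(bytes_scanned: int, price_per_tb: float = 5.0) -> float:
    """Estimated cost in dollars of one Athena query.

    price_per_tb is a placeholder; substitute your region's actual rate.
    """
    billed = max(bytes_scanned, MIN_BILLED)
    return billed / TB * price_per_tb


def monthly_direct_query_cost(bytes_per_query: int, queries_per_day: int,
                              days: int = 30, price_per_tb: float = 5.0) -> float:
    """Estimated monthly cost if every query goes straight to Athena."""
    return athena_query_cost(bytes_per_query, price_per_tb) * queries_per_day * days
```

The bytes scanned per query are reported by Athena for each query run, so real figures from a few representative runs can be plugged in instead of guesses.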

Option 2: Bring the input datasets into SPICE and join the SPICE datasets to create the output dataset.
Pros:

  • No additional cost, since the input datasets are already in SPICE.

Cons:

  • Due to limitations with SPICE joins, the data will need additional processing in the analysis to enrich it.

Anything else I'm missing here? Also, what else should I keep in mind to get a more accurate picture of costs?

Hi @tlm234 - Option 1 is a good one. The only thing I would add: select just the columns your analysis requires and store that result in SPICE, so that you make the most of your SPICE capacity. Also, if the same dataset is used in several analyses, they will always hit SPICE instead of going to Athena every time, which saves Athena cost.
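A rough way to compare the two modes is sketched below. It is a simplification under stated assumptions: in direct-query mode each dashboard load is treated as one run of the output query, while in SPICE mode Athena is hit only on each refresh and the stored dataset is charged at a per-GB monthly rate. Both prices are passed in as parameters because they vary by region and edition, and all names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    scan_gb_per_query: float      # GB Athena scans for one run of the output query
    dashboard_loads_per_day: int  # direct-query mode: each load re-runs the query
    refreshes_per_day: int        # SPICE mode: Athena is queried only on refresh
    spice_gb: float               # size of the output dataset once stored in SPICE


def monthly_costs(w: Workload, athena_price_per_tb: float,
                  spice_price_per_gb_month: float, days: int = 30) -> dict:
    """Return estimated monthly cost in dollars for both modes."""
    per_query = w.scan_gb_per_query / 1024 * athena_price_per_tb
    direct = per_query * w.dashboard_loads_per_day * days
    spice = (per_query * w.refreshes_per_day * days
             + w.spice_gb * spice_price_per_gb_month)
    return {"direct_query": direct, "spice": spice}
```

The SPICE storage term only applies to capacity you actually pay for, so if the dataset fits in capacity you already have, that term can be set to zero. Selecting fewer columns lowers both `scan_gb_per_query` and `spice_gb`.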

Regards - Sanjeeb
