JOIN using Athena table and RDS table

I have a dataset in Quicksight that uses a join between Athena table and an RDS table. Unfortunately, this seems to force it to a SPICE data store rather than a Direct Query which results in the data not being responsive and accurate in the Visualization.

I am looking for a method where I can create a join in Athena, perhaps a View Table to reduce latency and have a more responsive Visualization. I tried to add my RDS database as a datasource to Athena but it requires a lambda be configured. What alternatives are available to accomplish this joining of the 2 databases which will eliminate latency ?

Hi @alQemay - When you are joining 2 different data sources, QuickSight is always make the refresh as SPICE as it needs to bring the data to memory and then do the joining.

To understand the use case more detail.

  1. What is the frequency of data refresh for your athena and RDS table.
  2. Are you looking for real time reporting?
  3. What is the data volume of each of the sources.

Please provide the details so that we can guide you the efficient solution.

Tagging some of the experts @David_Wong @Koushik_Muthanna @Kellie_Burton for their advise as well.

Regards - Sanjeeb

Hello Sanjeeb,

  1. Frequency of the data refresh should be at least daily.
  2. Would prefer real-time
  3. data volume in minimal < 1000 records

So is Athena even capable of doing joins between difference data sources ? or is it possible to query and RDS table from Athena ?

1 Like

Hi @alQemay - Using federated query approach you can query RDS data from Athena ( link - Query any data source with Amazon Athena’s new federated query | AWS Big Data Blog).

Since the data refresh is happening on daily basis, why you are not refresh the data in SPICE and do the reporting and refresh the SPICE on daily basis. If it is a near real time, SPICE can be refreshed in 15 mins. If you will use capacity planning, the spice can be refresh in 1 min as well.

If your RDS data is refreshed on daily basis, the best way to bring this data and create an athena table and refresh it on daily basis, then you will have both data sets in athena an dusing “direct query” you can do real time reporting.

Regards - Sanjeeb