Using Dataset A to filter Dataset B

jackattack6800 · March 15, 2024, 12:35pm

I have two datasets: one with a few hundred rows (dataset A) and one with 49+ million rows (dataset B). My goal is to create a new dataset where these two datasets are joined on a common field.

Dataset B comes from an Athena query and is stored in SPICE. Every attempt to join against this dataset has failed with a timeout.

My thought was to filter dataset B first, so that it is much smaller before the join.

Any suggestions on how to do this?
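
For illustration only, here is a rough sketch of the pre-filtering idea. It assumes the join keys from dataset A are strings and that the Athena query behind dataset B can be edited to include a WHERE clause. The names `dataset_b` and `customer_id`, the sample keys, and both helper functions are hypothetical placeholders, not anything from my actual setup.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_filtered_query(keys, table, key_column):
    """Build a SELECT that keeps only rows of `table` whose key is in `keys`.

    `table` and `key_column` are inserted as-is, so they must be trusted
    identifiers (hypothetical names in this sketch).
    """
    unique_keys = sorted(set(keys))  # de-duplicate and order the keys
    if not unique_keys:
        raise ValueError("no keys to filter on")
    in_list = ", ".join(sql_literal(k) for k in unique_keys)
    return f"SELECT * FROM {table} WHERE {key_column} IN ({in_list})"


# Hypothetical join keys taken from dataset A (a few hundred in practice).
keys_from_a = ["c-102", "c-007", "c-102"]
query = build_filtered_query(keys_from_a, "dataset_b", "customer_id")
print(query)
```

With only a few hundred keys in dataset A, the IN list stays short, and the reduced result could then be joined to dataset A. Whether this actually avoids the timeout is something I have not verified.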

ErikG · March 15, 2024, 12:42pm

Hi @jackattack6800
I guess dataset A isn't from Athena, right?
Which join type are you using?
You could try setting a filter on B, doing the join, and then removing the filter on B.
BR

ErikG · March 31, 2024, 7:29pm

Hi @jackattack6800
Any updates on your side?
BR

jackattack6800 · April 1, 2024, 3:57pm

The joins fail due to timeout.