Dataset Join with Blanks

Hi, I am trying to perform a left join between two datasets where the cells in the joining clauses contain blanks. The join is not matching the blanks from one dataset to the other dataset so the those are coming back as null. The join does work when none of the cells in joining clauses contain blanks. Since we cannot join on calculated fields, I do not have the option to use formulas to make the cells contain “[BLANK]” to help with the joins.

Dataset 1: custom SQL from a database, main dataset, both joining clauses can contain blanks
Dataset 2: uploaded excel file, joins to dataset 1 on Distributor Retailer Partner and Transaction type columns to bring in the Sales Type column.

Joining clauses: Distributor Retailer Partner & Transaction Type

Below is a picture of dataset 2:

The join works for when Distributor Retailer Partner is not blank but when Distributor Retailer Partner is blank in both datasets and the Transaction Type is the same in both datasets, it will not match. Below is a preview of the data showing an example of the join with the fields coming in null due to distributor retailer partner being blank (note that ECOM/IN-STORE is a row in the second dataset with a blank for the distributor retailer partner).

Hi @jgalli,

From what I can tell, it is treating the blank field as a NULL so there is no value for it to match between the 2 datasets. This will cause the value from the right table to return NULL values rather than the desired outcome.

What if instead of returning blanks in the dataset, you did a case when statement to return some alternate string value. Your previous suggestion of “[BLANK]” or something like “N/A” may work. With an actual string value to compare against, I think the join would link the rows.

Ideally though, you should join based on unique identifiers that always contain a value if you want the best results. I would recommend this suggestion as a work-around but not necessarily as a best practice alternative.

Thanks Brett.

From my understanding and testing, it seems like QuickSight joins will not recognize nulls/blanks in both datasets as a match. Not sure if that is true.

We cannot join on a calculated field so we can’t replace the nulls/blanks with a placeholder (“[BLANK]”). We could potentially create that column in the custom SQL that brings in the dataset to QuickSight but the custom SQL is already complex, so I was trying to avoid that. I also don’t have the capacity to change the underlying database to make sure the joining fields had unique identifiers.

Open to alternative ideas or a way to make the join recognize the nulls/blanks in each dataset as a match.

Hi @jgalli,
Sorry I did not see that there was a response to this. Are you still working on this or were you able to find a work around? Unfortunately, I’m not familiar with another work around that could potentially work for this case outside the of using custom SQL.

If we do not hear back within the next 3 business days, I’ll close out this topic.

Thank you!

Hi @jgalli - Is it possible to filter the blanks at data set level and do the joining. This may help you.

Regards - Sanjeeb

Hi @jgalli,
Since we haven’t heard back, I’ll go ahead and close out this topic. However, if you have any additional questions, feel free to create a new post in the community and link this discussion for relevant information if needed.

Thank you!