Data missing from a full join

I’m working with a dataset (dataset 1) that contains 2022 and 2023 data (1000+ row), I did a full join with another data set (dataset 2) using a “week” as a key to join them. There is data for 2023 week 2 for dataset 1, but there is no data for week 2 for dataset 2, so in the visual I created, 2023 week 2 in dataset 1 did not show up. Isn’t a full join suppose to include week 2 although there is no corresponding value to match in dataset 2? Ultimately, I want 2023 week 2 line graph to show up as: dataset 1 show the 2023 week 2 point, and make dataset 2 show as 0 on 2023 week 2 point.

Can you create a table visual with “2023 week” and “2022 week” side by side to see if there’s any pattern in the missing data?

I’ve done full outer joins with date keys (integers) like that before and didn’t have issues.

Hi David. Yes I have, no missing data from 2022. 2023 week 2 is the only week that doesn’t even exist on the x-axis.

Raw data: image

data preview within quicksight (after I full joined with another table that doesn’t have week 2 data):

Isn’t full join suppose to show all data from both tables regardless if it’s matching or not? Why is the 2023 week 2 data not there when I imported?

It works fine for me.
Dataset1 has weeks 1, 2 and 3.
Dataset2 has weeks 1, 3 and 4.

Here’s the result of my full outer join.
image

Hi David. I have double confirmed that I did a full join, since there are only 4 join options (inner, left, right, and full). Seems like you are a QS expert… can we possibly chat offline? my alias is @roglinr.

Hi @Roger,
In data preparation , the preview retrieves first few records , so you might not see week2 . Once save and publish the dataset, can you test the following

My example dataset is as follows : Full join with join on run_date .

1/Build 2 visuals (columns from both dataset ) and get a count of records
2/Add a table and validate join by checking for null values or by filtering the data.
3/In my example : Total count of records (join1+join2)

Finally I don’t know what is your datasource. If you are using SPICE , validate that all records have been imported.

Kind regards,
Koushik

Hi @Koushik_Muthanna,

I am not exactly sure where I could get a column called “run_date”, but I did attempt to test by uploading only the 2023 data from the two datasets, and I made the 2023 week 2 data show up on my visualization and the data preview.

However, when I went back to my original dashboard, made sure the 2023 week 2 data is in there, including my 2022 data as well, I still cannot have 2023 week 2 data show up in the bigger full join dataset.

Hi @Roger Thx for posting, sorry it seems you weren’t able to solve your issue here. You can schedule time with me and we can troubleshoot on a call. My alias is @marakagi

Thanks.

1 Like

Hi @Roger,
Here is the summary from our today’s call: Due to the filters that you have applied at the dataset level, 2023 Week 2 data are being filtered out from the final dataset (after the full join). As we saw, if you remove those dataset level filter, you will get your desired result.

Thanks.

2 Likes

Thanks for posting the summary here @imaitri! :slight_smile:

Thank you for sharing the summary… Good to see filter is the blocker here :slight_smile: