Full Outer Join Bug

Hi team,

The full outer join is not working properly when I add data manually from a file.

I am adding a smaller dataset for one metric to my S3 data using a full outer join. Some values appear to be missing even though both datasets are joined at the same level of granularity.

At first I thought this was because coalesce had not been applied to the previously joined columns, so I created new columns that coalesce the matching columns from both datasets. I am still getting the same values.
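As a minimal sketch of what the coalesce step is meant to achieve, here is a plain-Python full outer join (not QuickSight itself). The row values and the helper names `coalesce`, `full_outer_join`, and `total_by` are hypothetical, and the join key is assumed to be unique in each dataset. It shows that grouping by only the S3-side key drops the file-only rows, while grouping by the coalesced key keeps them.

```python
def coalesce(*values):
    """Return the first value that is not None, similar to SQL COALESCE."""
    for value in values:
        if value is not None:
            return value
    return None


def full_outer_join(left, right, key):
    """Full outer join of two lists of dicts, assuming `key` is unique per side."""
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}
    joined = []
    for k in sorted(set(left_by_key) | set(right_by_key)):
        l_row = left_by_key.get(k)
        r_row = right_by_key.get(k)
        joined.append({
            "s3_asin": l_row[key] if l_row is not None else None,
            "file_asin": r_row[key] if r_row is not None else None,
            "file_yield": r_row["yield"] if r_row is not None else None,
        })
    return joined


def total_by(rows, key_fn):
    """Sum file_yield per group key, skipping rows whose group key is None."""
    totals = {}
    for row in rows:
        k = key_fn(row)
        if k is None:
            continue
        totals[k] = totals.get(k, 0) + (row["file_yield"] or 0)
    return totals


# Hypothetical sample data: A3 exists only in the manually added file.
s3_rows = [{"asin": "A1"}, {"asin": "A2"}]
file_rows = [{"asin": "A2", "yield": 5}, {"asin": "A3", "yield": 7}]
joined = full_outer_join(s3_rows, file_rows, "asin")

# Grouping by the S3-side key loses A3; the coalesced key keeps it.
by_s3_key = total_by(joined, lambda r: r["s3_asin"])
by_coalesced = total_by(joined, lambda r: coalesce(r["s3_asin"], r["file_asin"]))
```

If a visual still groups or filters on the original S3-side column instead of the coalesced one, the totals would stay unchanged, which is one possible explanation worth ruling out.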

Let me know if I’m missing any step for full outer join.

Thanks
Nitesh (nichhabr@amazon.com)

@Nitesh1 ,

A full outer join returns all rows from both tables. The dataset editor shows only a preview; your analysis should show all records. You can validate the row counts of both tables and check whether they match the data in SPICE. A few screenshots would be helpful.
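One way to do that count validation outside QuickSight is sketched below in plain Python. The function name `join_breakdown` and the sample keys are hypothetical, and the sketch assumes the join key is unique within each table.

```python
def join_breakdown(left_keys, right_keys):
    """Count how many join keys fall into each part of a full outer join.

    Assumes each key appears at most once per table.
    """
    left, right = set(left_keys), set(right_keys)
    return {
        "both": len(left & right),        # matched rows
        "left_only": len(left - right),   # rows only in the main dataset
        "right_only": len(right - left),  # rows only in the added file
    }


# Hypothetical keys for illustration.
breakdown = join_breakdown(["A1", "A2"], ["A2", "A3"])

# Under the uniqueness assumption, the full outer join has one row per
# distinct key, so its row count is the sum of the three parts.
expected_joined_rows = sum(breakdown.values())
```

If the row count in SPICE differs from `expected_joined_rows` computed on the real keys, the difference points to either duplicate keys or rows dropped during ingestion.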

Kind regards,
Koushik

Hi Koushik,
[Screenshot: SS_1_VM]

I am validating the numbers in the analysis only.
I have attached screenshots for your reference.
In the first screenshot you can see that the total Sellable yield in the CSV is 48M, while the dashboard shows 45M (screenshot 2). Drilling down to the ASIN level, I found that the numbers for some ASINs are missing in the dashboard (screenshot 3).

[Screenshot: ss_2_vm]

Thanks for the details, @Nitesh1. After the data was ingested into SPICE, does the row count in your Excel file match the row count in SPICE?

The row count in the main dataset increases with the full outer join, from 67 lakh to 68 lakh. The Excel file has 16 lakh rows. If possible, can we connect on Chime, Koushik? My ID is @nichhabr.
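As a rough check on those figures, the arithmetic below is a sketch that assumes the join key is unique in both datasets and that the quoted counts are rounded. Under that assumption, the joined row count equals main rows plus file rows minus matched rows.

```python
LAKH = 100_000

main_rows = 67 * LAKH    # main dataset before the join (from the thread)
file_rows = 16 * LAKH    # rows in the Excel file (from the thread)
joined_rows = 68 * LAKH  # main dataset after the full outer join

# With unique keys: joined = main + file - matched
matched_rows = main_rows + file_rows - joined_rows

# File rows that found no match and were appended as new rows
file_only_rows = file_rows - matched_rows
```

On these rounded numbers, about 15 lakh file rows matched existing rows and about 1 lakh were added as file-only rows. If the join key is not unique, this breakdown does not hold.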

Here is an example where I have explained the full join output and how you can test it. This should help you test it on your data: Data missing from a full join - #7 by Roger