Quicksight Joins do not work

Hi, I am creating a new dataset by joining one csv file with another existing data source that is a table from a redshift cluster. The location file has location ID, and the custom SQL has location details, and I created a left join on the location ID.


However, the join did not quite work. Here is what I mean:
Although location id 1 and 3 exist in my SQL table, it does not bring in its value. My new dataset looks like the graph below. However, if I search for location_id in (‘1’, ‘3’) in my SQL table, I can find the row.
image

If I modified my SQL table to only have location id 1 and 3, and left joined to location table, it will actually show location description for 1 and 3.
image

Therefore, the join succeeded this time when I limited the location Id, but I really do not understand why it did not work when the whole table is joined. also, both location ID’s data type are string.

Hi @yuchen1,

Can you try this workaround? Instead of joining a custom SQL query directly to your Location dataset, can you use your SQL query to create a separate dataset? Then, join the two datasets using Location ID. Let me know if you get different results.

Hi David, thanks for replying! I tried it, but i do not see any difference in the result. I wonder if it could be the number of rows my SQL is bringing in. It has 10K records, and I wonder if the table could have been truncated. I am not aware of any row number limit. Is there any?

The row limit is way higher than 10k. Can you use SPICE for the 2nd dataset (the one with missing records) and see if the refresh gives you any errors about skipped rows?

Hi @yuchen1 - I agree with @David_Wong suggestion, can we make the custom sql as SPICE so that data can be bring from Redshift and then do the joining. Or another approach, put the csv file in S3 and create an external table in Redshift and do the joining via custom sql . However if the csv file is user driven, then this may not be a good approach.

Regards - Sanjeeb

Hi @yuchen1

It is quite odd making - have you tried an inner join to see if there is different behavior?
Is your fact on the left and dimension on the right?

Thanks

1 Like

Hi @yuchen1 ,

Are you checking the join results in data set preview screen or in an analysis after publishing the dataset?

The preview within dataset isn’t always accurate, when working with multiple data sources, as it is put together based on partial data pulled from these sources. If your data in both tables, the keys and join conditions are all accurate, try publishing and check from analysis layer.

Marking this as solution for now.
In case this doesn’t help, please respond here and you can remove the solution flag to keep the discussion going.

Regards,
Arun Santhosh
Pr QuickSight SA

1 Like

Thank you for letting us know. I would recommend filing a case with AWS Support where we can dive into the details so that we can help you further. Here are the steps to open a support case. If your company has someone who manages your AWS account, you might not have direct access to AWS Support and will need to raise an internal ticket to your IT team or whomever manages your AWS account. They should be able to open an AWS Support case on your behalf. Hope this helps!