Dataset Dimension - Fact table join

I encountered an issue trying to report an exception in a dashboard.

The fact table has passenger data by reported location in an airport. Location table is the dimension table where I get the location name, whether the location is a Gate location etc. The pax table is joined with the location table on the location_id column using a right outer join.

I want to report locations where the location is a Gate location but did not have any passenger reporting at the location. I have 79 locations which are Gate Locations. Only 65 are available in the dataset, 14 are missing which are the the exceptions I want to reported in the dashboard.

image

How do you typically achieve this in a QuickSight dataset. I have worked with Power BI before and it was a simple setting of a Many to One relationship between the dimension and Fact able that would achieve this.

Any hints/tips would be appreciated.

Regards,
Giri

I assume you mean a left join. You wrote right join in the descriotion but have left join in the picture.

Regardless, you can create a calculated field using isNull(location_id). Then create a table listing reported_location_od and filering in this field being TRUE.

Hi @Giridhar.Prabhu ,

I believe you will have 2 location id field in the dataset, one from the left table and one from the right table. Can you check if you are using location_id from left table or right table?
For your use case, you should use the field from the right table, keeping the right outer join configuration.

Thanks,
Prantika

Hi @Giridhar.Prabhu - Before doing the KPI, please show the data in a tabular form and see whether you are able to see all 79 locations for GT or not, also select the field as suggested by @prantika_sinha .

Regards - Sanjeeb

Thanks @prantika_sinha. I am using the location id from the Location table. I did list the location id in a table form and can only see 65 out of the 79 locations

Can you check by just ingesting location as a dataset only (without any joins), what is the count that appears? the intent is to judge out the ingestion failure of records.

Hi @Andy_Brand the join is right outer with the pax_journey being the left side of the join. My intention is to get the location even though there is no data in pax_journey table.

Since the join does not bring in all the rows from the location table the calculation field would not serve any purpose.

I did that first and have 313 total rows. Out of this 79 rows have the location_display_group_code = GT that I am interested in.

Once I bring in the pax_journey and join the two table the result is 1338 rows. The count distinct of Location_id (from location table) reduces to 98 rows. count disintct of location_id filtered for GT locations reduces to 65

Ok, lets try to debug this way:

  1. create a tabular view which has location_display_group_code as dimension
  2. Add count of location_display_group_code in measure
  3. add count of distinct location_id in measure

Do not apply any filter. Confirm if both the measure values match for the location_display_group_code value ‘GT’

This is the result when I only have Location table in the dataset

image

This is the result once I have a join with the pax_journey table

image

@Giridhar.Prabhu Then just do the opposite. Regardless, you can create a calculated field using isNull(relorted_location_id). Then create a table listing reported_location_id and filering the calculated field to TRUE.

Just to confirm, the location display group code field used in the second visual is also from the right table. Apologies for asking so many questions.

As a last option you may try creating the dataset with location table left join pax_journey. I checked with a sample dataset with right join, and it worked fine for me fine.

Tried both ways; but can’t get the results I need. Is there a way for me to see what the underlying SQL QS is issuing to the database so I can debug the same?

By underlying SQL do you mean sql generated post dataset join? I am afraid that we can check.
But I am curious to know what was the result when you used location display group from right table.

@Giridhar.Prabhu ,

You could also write the SQL in the database to validate the results (the joins ) and then use it as a CustomSQL in QuickSight. If you do the joins via QuickSight, your backend database should log the SQL it received which should allow you to debug further.

Hi Kaushik,

Thanks for following up on this.

I ended up creating the dataset as follows as the filter on the date column on my fact table was causing the outer join to skip some rows

image

By encapsulating the where clause in a Custom SQL I pre-filter data and hence the outer join works now.

I understand the mechanics of the SQL that caused the problem. In the absence of an ETL mechanism, when I want to query tables from a relational source I may end up with a mix of Custom SQL and relational table based datasets. I was aiming for a pure relational table based joins and filters in quicksight without going to Custom SQL which may not be possible; unless I have some options in the quicksight dataset where I can specify the filters a “Pre-Filter” which is apply filter on the underlying relational table before doing any joins and filtering within QuickSight

Regards,
Giri