Table Relationship and Data Modeling

Hey,

I am new to the QuickSight development environment and am currently trying to figure out how I prepare my data in terms of modeling.

I saw that there is an option to connect tables in the data set, but when we were in a QuickSight workshop at the AWS offices, they said that a flat table is better.

I come from a power bi background and the modeling there works with dim and fact tables in a star schema.

Thank you for your help in understanding the issue.

Hi @ChenG89,

Welcome to the QuickSight community and thanks for your question.

You can join datasets in QuickSight or you can join at source, which approach you take can depend on your datasources, for example this community article discusses the use-case for using Athena as a datasource to access S3. What kind of datasources are you using?

In this example, you can also specify the joins in QuickSight using the Direct Query mode with a custom SQL query.

Let me know if these references help.

Many Thanks,
Andrew

Hi @ChenG89

Welcome to the Community!

I had the same questions when I started with QuickSight as I had worked on Power BI before.

If QuickSight you need to add tables and use joins similar like in a relational database.

It works differently when compared to Power BI where you define the One-to-Many, Many-to-One sort of relationship.

There have been feature requests to allow the logical definition as in Power BI. If you were to build a dataset like a star-schema model then you need to have left-outer joins with your dimensions on the left side of the join with the fact table. The dataset size will be large due to the outer joins. So you need to create your dataset keeping this in mind.

Typically you may want your time dimension to be present even when there is no data in the fact.

Here are some resources for a high level overview of SPICE vs Direct-Query datasets. Hope they help you determine how to setup datasets in QuickSight

Regards,
Giri

Hey @abacon,

Thanks for the quick reply.

My current flow is S3 > Glue > Athena > QuickSight.
Assuming this is the correct route in terms of architecture, this is currently the situation.

In terms of my sources of information, at this point for POC they are Excel files.
Later on the sources of information will be the operational systems in API connection or directly to the DB.

In the end, I want to learn to work correctly in the AWS work environment and go in the direction of best practice.

Thank you,
Chen.

Hey @Giridhar.Prabhu ,

Thanks for the quick reply.

According to this link joining-data I don’t see that there is a left outer join.

I was thinking of developing according to star schema since this is what I know from before, but I would love to hear what is the most correct way.

I’m still preparing and organizing my data and therefore trying to figure out the right way to do it.

Thank you,
Chen.

Hi @ChenG89,

Welcome to the QuickSight Community!

I wanted to share some insights on how joins work in QuickSight compared to Power BI, as this can be quite different and may affect your data modeling approach.

Data Joins in QuickSight vs. Power BI

  • In traditional star-schema modeling, such as in Power BI, the relationships between tables are defined to help organize and analyze data while keeping the original row counts intact. For instance, if you have a dimension table called departments (with 10 rows) and a fact table called dept_Students (with 20 rows), a 1-to-many join in Power BI maintains each table’s row count. So, counting department[id] would yield 10.
  • On the other hand, QuickSight simplifies the data modeling process. When you join tables in QuickSight, it creates a single dataset by combining data from different sources. This means the resulting dataset’s row count reflects the relationship between the tables. For example, if you perform a left join on the departments and dept_Students tables, you would see a combined count, counting department[id] would yield 20.

This approach offers flexibility and speed, making QuickSight an effective tool for visual analytics. While it may not provide the same level of logical data modeling as Power BI or Tableau, QuickSight’s ability to create unified datasets quickly can be advantageous for generating insights and dashboards efficiently.

Key Considerations:

It’s important to keep in mind that the resulting dataset may be non-normalized, so be cautious with calculations to avoid issues such as repeated rows in dimension tables.

Regards,
Zerry

Hi @ChenG89

It’s been a while since we last heard from you. If you have any further questions, please let us know how we can assist you.

If we don’t hear back within the next 3 business days, we’ll proceed with close/archive this topic.

Thank you!

Hi @ChenG89

Since we have not heard back from you, I’ll go ahead and close/archive this topic. However, if you have any additional questions, feel free to create a new topic in the community and link this discussion for relevant information.

Thank you!