Merge multiple DB sources in the same dataset?

Our product has multiple DBs with the same schema, e.g. Customer1DB, Customer2DB, etc.

What’s the best way to add multiple DBs to the same dataset?

PS: I posted this on re:Post initially but got nothing.

UPDATE: Just use FULL JOIN. By design, it includes all records from both sources, regardless of the matching values.


One way would be to add the different tables in the same data set and do a full join.

But you can’t have an always true condition like 1=1. The UI only allows selecting existing fields to match them.

Hi,

full join will load the full data matched and unmatched(null in the join key too). Mean if your dataset id match in both the table and in case of id is null of table this join will include those rows too.

Most of the this type of join not required, if your business need you can do it.

The best approach in data prep to chose the left join. A big table in the left side and rest with rights. Again depend on requirement.

you can join multiple data source too.

This video may help you.

regards,
Naveed

This almost works, but the result have a duplicate set of columns for each source table, like this:

So I’m back to where I started :-[

I do need a FULL JOIN, I have multiple DBs and tables with identical structure but different data. Think CustomerDB1, CustomerDB2, CustomerDB3, etc.

Hi,

you can avoid these column in you data set by using the filters. I mean you can control required column using the filter in dataset.

regards,
Naveed Ali

These can’t be excluded, the value from the second set of fields must be present in the main set,
i.e. created_atcreated_at[mc_clearings (2)

Hi @Gabriel_R

Could you try to write custom sql and combine the data from different DB hoping underlying database supports database links / data wrappers .

Currently we cannot combine (UNION) across datasets / datasource objects in Quicksight , best workaround for now is to make use of database links / data wrappers .

Thanks for the suggestion, I’ll pass it over to my Infra / Dev colleagues who know more about this.
Can you please tell me in what system we should “make use of database links / data wrappers”? In a QuickSight SQL dataset, in a separate dedicated DWH-like DB or in each separate DB then in a QuickSight dataset?