Join on sources

Hi,

I would like to have a join between 2 data sources based.
However the “on” fields are of different data type, one is a date type and the other is a string containing timestamp value.

I understood that it is not possible to base the join on calculated field.
I tried to defined the string field as a date one but it returns an error.

How can those 2 datasets be joined?

@Annab
can you convert the date to a string and then join them? (make sure both columns are type string)

@neelay

It won’t help because one has a timestamp values and the other has date values.

@Annab join make tableA.column1 == tableB.column2 in where condition, if your values are not same and not going to be same your join will never work, not in quicksight not in any db.

Depending upon your data-source, you have to use diff approaches to create a new field where that field holds just the date value for a timestamp, and then you can use that for joining.

@neelay This is not accurate, as in SQL I may cast and apply different functions within the join “on” clause.

So, based on your answer, does it mean that the only way for me to overcome this issue is by making the format change on the source data before connecting it to Quicksight?

Hi @Annab
why not bring both strings to the smallest common string?
E.g.
10/10/2023 and 10/10/2023 10:23:30 to 10/10/2023
BR

@ErikG That’s what I was hopping to achieve by using calculated fields (because the formating didn’t work), but it is not applicable on the join fields.

Do you have an alternative suggestion?

What datasource are you using?

Do you have two datasets and would like to join them in e third one or all at once?

@ErikG I have 2 csv files in S3 bucket that are updated daily.
I would like to join the 2 files

But how do you create the dataset?
One for file 1 and one for file 2 and join into dataset 3?
Because that way you could create a calculated field to get the timestamp into string or date in the base dataset and join in the new dataset.

@ErikG I do not have access to source files, they are pushed to the bucket by the source which is out of reach for me.

@Annab The ‘cast’ function is an operation applied to a value prior to matching values. It’s important to note that QuickSight SPICE differs from an RDBMS and may not offer all functionalities present in TSQL. Several solutions have been outlined previously, allowing flexibility to choose the most suitable one for your requirements.

Hi @Annab - Is it possible to share sample data and data types on which you are planning to perform join operation. This will help community replicate the issue and guide you the right solution. The steps provided by @neelay and @ErikG are valid cases and we should use cast or make sure both the field in the QS are of same format before applying the join condition.

Please share sample data so that community can help you.

Regards - Sanjeeb

Can you do the following

  • Create a dataset (A) with the file where you have a date field
  • Create a dataset (B) with the file where you have the timestamp field
  • Create a calculated field in dataset (B) to get a date field based on the timestamp field
  • Create a dataset (C) to join (A) and (B) on the date fields

To sum up my understanding:

I need to change the join fields in source and only then I will be able to join on them in Quicksight

At the end you need three datasets.
The two base datasets with the calculated fields and one where you join the two base datasets.

Hi @Annab
could you solve your issue?
BR

1 Like