Full Join: Duplicated Data

I want to combine two data sources: the data from 2024 (in S3) and the data from 2023 (from a .csv file). Both sources have the same columns and data types. When I do a full join, the columns are duplicated. What I want instead is for the rows from the .csv to be appended to the first dataset (2024).

Hi @julia_Faraudello
It sounds to me like a join is the wrong way to do this. A join places matching rows side by side, so every column shows up twice. To keep the original columns, you would need to merge/append the files instead.
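
To illustrate what an append does, here is a minimal Python sketch that only uses the standard library. The sample data and the names `DATA_2024`, `DATA_2023`, `read_rows` and `append_rows` are made up for this example; they are not part of Quick Sight or of your data.

```python
import csv
import io

# Hypothetical sample data; both sources share the same columns.
DATA_2024 = "order_id,amount\n1,10\n2,20\n"
DATA_2023 = "order_id,amount\n3,30\n4,40\n"


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def append_rows(first, second):
    """Append (union): stack the rows and keep a single set of columns."""
    if first and second and first[0].keys() != second[0].keys():
        raise ValueError("both sources must have the same columns")
    return first + second


rows_2024 = read_rows(DATA_2024)
rows_2023 = read_rows(DATA_2023)
combined = append_rows(rows_2024, rows_2023)

# The column list is unchanged; only the number of rows grows.
print(list(combined[0].keys()))
print(len(combined))
```

The result has the same two columns as each input and the rows of both years, which is the outcome you describe wanting.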
BR

@julia_Faraudello ,

As you already have the 2024 data in S3, if you can upload the 2023 data to S3 as well, you should be able to create a single dataset that combines the data from both files (Datasets based on multiple Amazon S3 files - Amazon QuickSight).

Kind regards,
Koushik

Hi @julia_Faraudello

As mentioned above, joining these two datasets will not solve your current use case. Quick Sight currently does not support unioning data, but this can be achieved with a separate data prep tool such as AWS Glue or AWS Glue DataBrew. Alternatively, as @Koushik_Muthanna mentions above, adding the CSV to the same bucket as your first data source and referencing this S3 bucket in your Quick Sight manifest file will automatically union these datasets for you.
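
As a rough sketch of the manifest approach, the Python snippet below builds a manifest that points at an S3 prefix containing both CSV files. The bucket name and prefix (`s3://my-example-bucket/sales/`) and the helper `build_manifest` are placeholders I made up; the upload settings assume comma-delimited files with a header row, so adjust them to match your data.

```python
import json


def build_manifest(prefix):
    """Build a manifest dict that loads every file under one S3 prefix."""
    return {
        # All files under this prefix are loaded into one dataset.
        "fileLocations": [{"URIPrefixes": [prefix]}],
        # Assumed settings: comma-delimited CSV with a header row.
        "globalUploadSettings": {
            "format": "CSV",
            "delimiter": ",",
            "textqualifier": "\"",
            "containsHeader": "true",
        },
    }


# Placeholder location; replace with the bucket and folder holding both files.
manifest = build_manifest("s3://my-example-bucket/sales/")
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

You would save the printed JSON as a manifest file and supply it when creating the S3 dataset. Both files need the same columns and format for the rows to be combined cleanly.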

Please reach out if you need any guidance on the steps to do so.