I am trying to join two datasets based on a unique key (SPMS ID). For some of the records the value in the unique key has a comma (,). In the aggregated table by revenue, it is not aggregating the revenue corresponding to the record with the comma (,). How can we achieve that?
Eg. Datadog 3294, 3294 is not getting accounted for.
Hi @ErikG The join is working for the rows that do not have comma. Eg. It is populating the revenue against 3294, but it is not populating the revenue against 3294, 3294. My raw file is structured as below.
Name ID. Rev
Datadog 3294 100
Datadog 3294, 3294. 50
I need to show in results
Name Rev
Datadog 150
Today it is showing in results
Name Rev
Datadog 100
The ID is in string datatype since it has comma (,) in its values.
Hi @debanjana - Welcome to AWS QuickSight community and thanks for posting the question. Couple of observation on the file -
If the file is comma separated, I believe the record 3294,3294 should be double quoted embedded and that is why when you have imported to QuickSight, it is considered as one field.
Looking at the such sample record, it is the same record after comma(,). if you are OK to take one value, then @ErikG suggestion is good and you can use split function for those records contain a comma(,) for SPMS ID and take only the first record and ignore the second record and possibly do a cast to int as well. After that the join should work.
If you do not want to ignore and make this as 2 records, possibly you need put some intelligence spilt the same row into multiple row. This may be a tricky part and need some detail analysis.
Hi @Sanjeeb2022 and @ErikG Thanks for your responses. The SPMS ID field is string so it does consider in the dataset both the records before joining but after joining I dont get the revenue attributed to the record with the comma. The raw dataset has different revenues associated with “3294” record and “3294, 3294” record. So I dont want to ignore the record(s) associated with “3294, 3294”. Is there any office hour where I can join and someone can help me real time? Thanks!
Hi @debanjana
my suggestion is not to ignore the record it is to cut the string that you will get a field to join.
e.g.
Name | SPMS ID | Revenue | ID
Datadog | 3294 | 100 | 3294
Datadog | 3294, 3294 | 50 | 3294
The join can be done by the ID column but you will keep the SPMS ID if you need.
BR