Hi, I have a sample test string value below. The string is patterned like an array of hashes or dictionaries. I want to get the name values with type type02.
Expected output: id = “test03, test04, test05”
Sample test string:
data = "[{'testdata': 'has', 'name': 'test01', 'type': 'type01'}, {'testdata': 'does not have', 'name': 'test02', 'type': 'type03'}, {'testdata': 'has', 'id': 'test03', 'type': 'type02'}, {'testdata': 'has', 'id': 'test04', 'type': 'type02'}, {'testdata': 'has', 'id': 'test05', 'type': 'type02'}]"
At first I used calculated field but I can only retrieve the first type02 info in the string.
ifelse(
contains(split(data, ',' , 2),'type02'),
replace(replace(split(data, ',' , 2), "'id':", ''), "'", ''),
NULL
)
output: id = “test03”
Since the value datatype is a string and there are a few ids that I need, I am trying to use custom SQL and create a new table field.
In Athena, I trimmed the data value then tried to use json_extract_scalar() to get the key-value data inside but was unsuccessful.
trim('[]' from data) as trimmed_data
json_extract_scalar(replace(replace(trimmed_data, '''', '"'), ': False', ': false'),'$.id') AS id
Do you have any suggestions on how to parse this type of data? Thank you.