String operations in quicksight to work on "key:value" pairs

I have a column in one of the athena tables that has data in format such as given below

“key1:a”, “key2:a”, “key3:c”
“key1:a”
“key1:a”, “key3:c”
“key4:d”

some of the rows even have up to 35 “key:value” pairs in them

is there a way to create an option with a dropdown (Filter preferrably)

that only contains the unique “keys” from the “key:value” pairs.

so the above output will be

As a drop down filter

“key1”
“key2”
“key3”
“key4”
.
.
.
“keyn”

Yes, there is a way to achieve this using Athena and AWS Glue. Here’s a step-by-step approach:

  1. Create a UDF (User-Defined Function) to extract the unique keys from the column:

    • You can use the SPLIT function in Athena to split the column by the delimiter (:) and then use ARRAY_DISTINCT to get the unique keys.
    • Here’s an example UDF definition:
    CREATE FUNCTION extract_unique_keys(input_string STRING)
    RETURNS ARRAY<STRING>
    COMMENT 'Extracts unique keys from a string of key-value pairs'
    RETURNS NULL ON NULL INPUT
    LANGUAGE python
    RUNTIME_VERSION 3.9
    HANDLER 'extract_unique_keys.handler'
    AS """
    import json
    
    def handler(input_string):
        if input_string is None:
            return []
        keys = set()
        for pair in input_string.split(','):
            key, _ = pair.strip('"').split(':')
            keys.add(key)
        return list(keys)
    """
    
  2. Create a new table or view that uses the UDF to extract the unique keys:

    • Here’s an example of how you can create a new table or view:
    CREATE OR REPLACE VIEW my_table_with_unique_keys AS
    SELECT
        my_original_table.*,
        extract_unique_keys(my_column_with_key_value_pairs) AS unique_keys
    FROM
        my_original_table;
    

    This will create a new table or view that has all the columns from the original table, plus a new column unique_keys that contains the list of unique keys extracted from the column with key-value pairs.

  3. Use the unique_keys column in your filter or dropdown:

    • In your application or reporting tool, you can now use the unique_keys column as a filter or dropdown, allowing users to select the desired keys and filter the data accordingly.

This approach allows you to pre-process the data and extract the unique keys, making it easier to use in your application or reporting tool. The UDF can be optimized further if you need to handle large amounts of data or very long key-value pairs.

2 Likes

Hi Murili,

the thing is i want to do this only in the quicksight front end. Because i worked around a query that is separates it (in a seperate data set).

here my if query helps if it

note(the VALUES (1), (2), (3), (4), …, (100) where … are actual numbers upto 100)

WITH split_keys AS (
  SELECT ID,
    TRIM(BOTH '"' FROM split_part(split_part(key_values, ',', n), ':', 1)) AS key
  FROM 
    table_name
  CROSS JOIN 
    UNNEST(SPLIT(key_values, ',')) AS t(split_column)
  CROSS JOIN 
    (VALUES (1), (2), (3), (4), ..., (100)) AS numbers_table(n)
  WHERE 
    n <= CARDINALITY(SPLIT(key_values, ','))  
)
SELECT distinct(ID),
  key
FROM 
  split_keys;

but the bigger problem comes when i try to visualize it

reason i work with cost data: some sample examples below of current situation.
image

now I was trying to link these in quicksight with the actions by having a common mapping

I would like to eliminate the creation of the second dataset all together.

But I was wondering if it was possible to do this directly in quicksight front end?

the table separation as shown above but directly inside a filter value with a calculated field.

Did this solution work for you? I am marking this reply as, “Solution,” but let us know if this is not resolved. Thanks for posting your questions on the Quick Sight Community!

Hi Xclipse,

yes that worked.

1 Like