How to get the count of individual items in a list

HarveyB-B · November 30, 2023, 11:07am

I have around 25 flags, each flag is coded A-Y.
I also have loads of flaglist. They will be a combination of the flags, based on what the customer selects.
To create the flaglist, we add all flags triggered when a user adds it to their list. They can only be selected once. So the list will never have multiple A’s, B’s etc. Only one flag per item.

Example:
I can see that flagA has been triggered 120 times. This is because i have it set to a KPI (count). However we can also see this is true as appeared in flaglist:
flaglist ‘ABCDE’ occurred 43 times
flaglist ‘AKLNO’ occurred 40 times
flaglist ‘AGI’ occurred 27 times
flaglist ‘AJ’ occurred 6 times
flaglist ‘A’ occurred 4 times
This adds up to 120

So i now have two tables, only where it shows the unique list with a count, and another where it shows the individual flag with a count.

The flaglist counts should always add up to the amount of times the flag has been triggered… However, i have a problem when creating one field which looks at all the hundreds of flaglists, and finding the count of each flag in one field. I have the calculated field as this (think of DeclineList as flaglist):

But when i have a look at the counts, they are incorrect:

Top image is the field i created to group the flags into one field - showing count of ‘X4’. The bottom is the actual count of ‘X4’

I think the reason it doesn’t work as it looks in DeclineList, and will look for ‘A’, if found it will +1 to ‘A’, and then look at the next list. So it skips any letters after the first one is found. I need it to be to work like this. If it contains ‘A’, add 1 to ‘A’, and check if it contains ‘B’, if so add 1 to ‘B’, then check for ‘C’, if so add 1 to ‘C’. Instead of - If it contains ‘A’, add 1 to ‘A’, check next list.

If confusing please respond and ill try make it make sense, but basically i want to group all my flags into one field so i can see the count of each flag in one field, instead of adding 25 flags to a bar chart.

wstevens01 · November 30, 2023, 3:31pm

Hi @HarveyB-B ,

Let me see if I understand. Given a few ‘flaglist’ strings like these:

ABCD
EA
FHI
FA
DF

You’re looking for the letter count, like this:

A: 3
F: 3
D: 2
B: 1
C: 1
E: 1
H: 1
I: 1

And it’s not just the letter count, there may be combinations of letters, like:
X1, X2, …, X10.

If that’s a description of the problem that we’re solving, a few clarifying questions, if I may:

What is the frequency of the data? How often does it change and how is it updated?
What is the ‘grain’ of the reporting? Other than the bar chart that shows the counts, is there a date/time dimension to the report? Are there other dimensions?
What filters are part of the display

There are techniques that you can use in Quicksight to count ‘known’ characters / character combinations. The posting here shows how you can count occurrences of a string from a parameter. That’s similar to the SQL command described in a youtube, here.

But, to solve this generically, for any arbitrary character combination, I think that you have to iterate through each combination. That likely points to solving this in the data source; e.g. you’d use a SQL technique and then have a table with every combination of characters and their counts for each dimension. At that point, Quicksight would be able visualize the count across all the dimensions.

ws

HarveyB-B · November 30, 2023, 3:59pm

First off, the only different between A, B, C, D, and X1, X2, X3, X4, is that the X’s are things we will not be changing. So it’s triggering something we won’t or can’t legally change. The alphabetic ones are things we can change.

The data doesn’t change, as in the flags will always be there. It’s whether or not that we adjust these flags will they change. They are fields in quicksight which i made, so they will most likely never change or be updated, unless a legal requirement comes in place and we have to change an alphabetic to an X. Even then it’s used to distinguish between legal requirements and things we can change, so the flag code doesn’t make a huge different. If you mean how often the count is updated is (earliest) every hour. This depends on whether or not a flag is triggered that hour.
each caseID we have will have a declinelist (based on filters applied), so yes it can be put against a date - whatever the date of the case is, is the date of the flag.
and for the filters, i have it showing data from oct 1st onwards (as more reliable). And i have a calculated field which groups cases into 3 (not exactly these but have the same format/idea) - pass, warning, decline.
I have it filtered to only showing decline.

Hopes this helps!

wstevens01 · December 1, 2023, 3:39pm

Hi @HarveyB-B ,

After a bit of thought, here’s a solution. I don’t love it (see reasons below), but it works.

Create a separate calculated fields for each flag value that you want to count in the flaglist string (or any other string). Something like these:

A_count:
(strlen(flaglist)-strlen(replace(toLower(flaglist), toLower('A'),''))) / strlen('A')

X1_count:
(strlen(flaglist)-strlen(replace(toLower(flaglist), toLower('X1'),''))) / strlen('X1')

This allows you to create correct aggregations across any number of dimensions (I added a date dimension and multiple, arbitrary flaglist strings for these visuals):

The reason I don’t like it, is that it’s not generic and you have to hard-code every flag value that you’re looking for. But it’s no worse than the big ifelse statement, in that respect.

I looked around for how others have solved it and it comes down to how do you perform the double iteration. You want to iterate over all of the flaglist strings, but you also want to iterate over every character in the string. SQL (and calculated fields) are not the best technique for this.

There are several examples of this problem on Stack Overflow, but this one asks the question, ‘Do you really think this is a job for MySQL?’ and I agree with that sentiment.

To solve this generically, for any combination of flag values, consider pushing this back to the data source. Then create a data design that works for you. Maybe a mini-fact table that holds the flag counts and a dimension table with all of the flag values (that’s just one of several viable design options). Next create a stored procedure or a stand alone program (like a Glue job in python) that performs the two levels of iteration and populates the fact table.

Hope that helps. Let me know if any of that works for you.

ws

HarveyB-B · December 4, 2023, 9:15am

Thanks for the response. I already have all flags which gets the correct flag, however i want it in one field so i don’t have to drag all 25 flags into a visual. I think this is possible, however i will have to do this outside of QuickSight. Yet again, thanks for the response, and will check out the stack overflow