How to make condition with CountOver

I have a dataset that records the status of each step of a data pipeline as an example below:

tableName dataSource stage status
Apple A ingestion COMPLETE
Apple A preprocessor COMPLETE
Apple A data_lake COMPLETE
Banana A ingestion COMPLETE
Banana A preprocessor COMPLETE
Banana A preprocessor COMPLETE
Banana A preprocessor COMPLETE
Banana A preprocessor COMPLETE
Banana A preprocessor FAIL
Banana A preprocessor FAIL
Mango B ingestion COMPLETE
Cherry B preprocessor COMPLETE
Raspberry C ingestion COMPLETE

I want to create a table on QuickSight Analysis to see the number of COMPLETE and FAIL for each dataSource and stage.

dataSource stage COMPLETE FAIL
A ingestion 2 0
A preprocessor 1 1
A data_lake 1 0
B ingestion 1 0
B preprocessor 1 0
C ingestion 1 0

However, some of the tables in some stages have more than one data points and are counted toward the statuses when they’re summed in the calculated fields I created [COMPLETE asifelse(status = "COMPLETE",1,0) and FAIL as ifelse(status = "FAIL",1,0)] which make the table look like:

dataSource stage COMPLETE FAIL
A ingestion 2 0
A preprocessor 5 2
A data_lake 1 0
B ingestion 1 0
B preprocessor 1 0
C ingestion 1 0

I am using CountOver to sum the number of COMPLETE and FAIL in each stage for each table then use a condition to check whether there is a FAIL
COMPLETE > ifelse({status}="FAIL" AND countOver({status},[{status},{tableName},{stage}])=0, 1, 0)
FAIL> ifelse({status}="FAIL" AND countOver({status},[{status},{tableName},{stage}])>1, 1, 0)

But this doesn’t seem to work. Does any have any idea how to countOver with condition?

Hello, You could achieve your desired output alternatively by distinct Count on tableName. I am attaching the screenshot for you to try this approach. Please let me know if this helped you solve the problem.

2 Likes

Hi @pattdf

in your example:
A/ingestion/Complete appears twice once for Apple, and once for Banana → and you want to count it as 2 under Complete
A/preprocessor/Complete appears 6 times, once for Apple and 5 times for Banana: however, you want to count it as 1 under Complete.

Why?

If you are trying to count the ‘status’ by dataSource and stage only ONCE for each tablename then use

distinctCountOver(status, [status,dataSource,ingestion], PRE_FILTER)

If this is what you’re looking for, please mark this as a solution.
:slight_smile:
GL

2 Likes