Lag function isn't giving me a desired result

Hi there,

I Have a calculation where i am trying to spot two month continous drop of a bunch of clients but the following calculation not providing me what i expected :innocent:, instead some are right while the others are wrong. can i have some solution to solve this pls, i would very much appreciate that.

droppercent - difference percentage compared to last month(calculated in sql)

ifelse((lag(sum(droppercent), [host ASC], 1) < 0) AND (sum(droppercent) < 0), ‘2month continous drop’, ‘1monthdrop’)

Hi, It will be helpful if you can share dummy data and explain how lag function is not working as expected. Also have you checked if your calculation for lag needs any data partitioning?

Hi karthik,

As you can see here in the pic, after applying the calculation, few are giving the correct values whereas the crossed ones are giving wrong values. basically i am comparing the current drop percent with the previous month drop. May i know why is it giving wrong values?

I tried the following calculation as well but same result

ifelse((lag(sum(droppercent), [host ASC], 1, [month, year]) < 0) AND (sum(droppercent) < 0), ‘2month continous drop’, ‘1monthdrop’)

Hi, I think for your requirement you will need to sort the data by date field and partition by host column cause you need to see per host what are the prior month values.

I created a sample dataset based on your screenshot and add two calculations (Test and Test2)
Test is based on your logic
ifelse(lag(sum({Var%}),[host ASC],1,[Date])<0 and sum({Var%})<0,‘2month continous drop’,‘1monthdrop’)

Test2 is the new logic that I create for validating this scenario
ifelse(lag(sum({Var%}),[Date ASC],1,[host])<0 and sum({Var%})<0,‘2month continous drop’,‘1monthdrop’)

Please let me know if Test2 logic helps with your requirement.

Hi Karthik,
first of all, I appreciate your effort.

I am not sure how to achieve this still, is it necessary to use date rather than month? i have to use month coz the droppercent is not for invidual days but for the whole month. i used the following but i got 1month drop for all the host

ifelse((lag(sum(droppercent), [month ASC], 1, [host]) < 0) AND (sum(droppercent) < 0), ‘2month continous drop’, ‘1monthdrop’)

btw, i calulated the drop percentage for each hosts by month in the sql itself, can u figure out what could be the issue here? Do i have to use only the date to get what i wanted?

Hi, I just created a sample data set to simulate your use case.
Depending on the granularity of your data you should be able to get desired results. At a high level you are partitioning the data at host level and sorting the data using a specific field (in your case month) to get the prior value. [month ASC] is only used for sorting your data.
Try testing your logic by just adding a column with logic lag(sum(droppercent), [month ASC], 1, [host]) and see what values you get.

Hi karthik
i created two calculated fields(test2 and test3)

In test2, i am applying the following calulation
lag(sum(droppercent), [host ASC], 1, [month])

In test3, i am applying the following calulation
lag(sum(droppercent), [month ASC], 1, [host])

as you can see in the following screenshot when i tried to use test3 like u said, it spiting blank values for all of the rows whereas with test2 it gives me values.

I am not really sure, do you think the blank values are because i am already using lag funtion in the query to calculate the drop percent?

Hi, I don’t think having test2 column with lag function is causing test3 to display null. Can you share a screenshot of the entire report (showing all columns that are part of the report) provided there is no sensitive data? Also is your ‘month’ column a date data type or is it a string?

Hi karthik,

Here is my query and it gives me the result in the below screenshot, variance is(online order-previousonllineorder/previousonlineorder). I did that caluclation in the sql, after then i am using the same field to compare it with the previous month in the quicksight. so i should use lag one more time either in quicksight or redshift. month is bigint

SELECT

host,

sum(online_order) as online_orders,

date_part(mon, sa.date) as Month1,

date_part(year, sa.date) as year1,

(cast(online_orders - LAG(online_orders) OVER(Partition by host ORDER BY year1, Month1) as NUMERIC(10,2))/NULLIF(LAG(online_orders) OVER(partition by host ORDER BY year1, Month1) * 1, 0)) AS droppercent

FROM stats_archive

GROUP BY Month1,year1, host, variance

Hi, Below is the sample data and calculations I built. Hope this helps.
If you are still facing issue then we would need dummy/sample dataset, sample report screenshots, any filters applied on the reports and calculated columns from your end to replicate this on our side to further assist you on this.

Sample Date :

host Year Month Orders droppercent
4 2021 12 1039 0.264
130 2021 12 3794 0.0657
131 2021 12 111 -0.112
132 2021 12 1232 0.209
136 2021 12 847 0.2135
137 2021 12 877 -0.244
145 2021 12 1016 -0.2758
146 2021 12 1097 0.351
150 2021 12 268 0.3073
152 2021 12 1407 0.1009
153 2021 12 226 -0.2391
157 2021 12 328 0.0933
161 2021 12 943 0.1922
164 2021 12 1533 0.1977
175 2021 12 743 -0.0523
4 2022 1 921 -0.1136
130 2022 1 3624 -0.0448
131 2022 1 102 -0.0811
132 2022 1 1271 0.0317
136 2022 1 777 -0.0826
137 2022 1 506 -0.423
145 2022 1 786 -0.2264
146 2022 1 1055 -0.0383
150 2022 1 247 -0.0784
152 2022 1 1298 -0.0775
153 2022 1 123 -0.4558
157 2022 1 272 -0.1707
161 2022 1 850 -0.0986
164 2022 1 1458 -0.0489
175 2022 1 641 -0.1373
4 2022 2 807 -0.1238
130 2022 2 3119 -0.1393
131 2022 2 125 0.2255
132 2022 2 896 -0.295
136 2022 2 490 -0.3694
137 2022 2 392 -0.2253
145 2022 2 637 -0.1896
146 2022 2 902 -0.145
150 2022 2 189 -0.2348
152 2022 2 1210 -0.0678
153 2022 2 43 -0.6504
157 2022 2 384 0.4118
161 2022 2 848 -0.0024
164 2022 2 1244 -0.1468
175 2022 2 535 -0.1654

Lag_Test : lag(sum(droppercent),[Year ASC, Month ASC],1,[host])
Drop Calc : ifelse((lag(sum(droppercent), [Year ASC ,Month ASC], 1, [host]) < 0) AND (sum(droppercent) < 0), ‘2month continuous drop’, ‘1monthdrop’)

Report Screenshot:
Table on left shows data for Jan 2021 and Table on right shows data for Feb 2022.
Lag_Test shows the value piked from prior month/year (in this case Jan 2022)

Hi Karthik

First of all sorry for the late response

Can we achieve the same using the following scenario? for instance, instead of calculating this drop percent in sql, can we calculate in the quicksight itself

drop percent : (sum({online_order}) - lag(sum({online_order}), [{order_date} ASC],1,[host]))/lag(sum({online_order}), [{order_date} ASC],1,[host])

Then i want find out the drop percent of current and previous month by applying it on your calculation right here

ifelse((lag(sum(droppercent), [{order_date} ASC], 1, [host]) < 0) AND (sum(droppercent) < 0), ‘2month continuous drop’, ‘1monthdrop’)

something like this

ifelse((sum({online_order}) - lag(sum({online_order}), [{order_date} ASC],1,[host]))/lag(sum({online_order}), [{order_date} ASC],1,[host]) < 0 AND lag((sum({online_order}) - lag(sum({online_order}), [{order_date} ASC],1,[host]))/lag(sum({online_order}), [{order_date} ASC],1,[host]), [{order_date} ASC],1,[host]) < 0,‘2monthdrop’,‘1month’)

is that even possible? again, appreciate your efforts karthik