Optimising PeriodOverPeriodLastValue Calculations

I have a Clustered Bar Chart to show a count of values for each day over the last 7 days alongside the 7 days before that, and the 7 days from 52 weeks ago, from data in a Redshift Cluster. For this I simply add the field to the Bars well, along with the following calculations:

periodOverPeriodLastValue(count(jobaccepteddatetime), firstseendatetime, DAY, 7)
and
periodOverPeriodLastValue(count(jobaccepteddatetime), firstseendatetime, WEEK, 52)

This results in the correct chart, except that it usually times out. What should be a simple count of Jobs over a total of three weeks is actually counting every Job for every day in the dataset, and doing so separately for each calculation. Instead of 21 counts, it’s doing 1,140.

I’m filtering to the last 53 weeks, because if it’s any lower than that I don’t see the LastYear counts. I’ll tweak it to omit the useless 50 weeks in the middle, which should alleviate the majority of the issue, but is there not a better way of doing this than running the same calculation three times and left joining it to itself?
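For that tweak, I’m thinking of a calculated field used as a Visual filter on value 1, roughly like the following (an untested sketch; the exact day and week cut-offs are assumptions I’d need to check against my data):

ifelse(
    dateDiff(firstseendatetime, now(), "DD") < 14
    OR (dateDiff(firstseendatetime, now(), "WK") >= 51 AND dateDiff(firstseendatetime, now(), "WK") <= 53),
    1,
    0
)

That should leave only the three weeks the chart actually needs, rather than the full 53.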

Hello @lgulliver, welcome to the QuickSight community!

When you say that you are running the calculation three times and left joining afterwards, do you mean that you are running the calculations in your data set? Or are you doing this in your analysis?

For the 52-week calculation, instead of counting across all 52 weeks, is it possible in your dataset to have the period be YEAR and the offset be 1?
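Something along these lines, reusing the fields from your existing calculations (an untested sketch on my part, and note the alignment would differ slightly from exactly 52 weeks):

periodOverPeriodLastValue(count(jobaccepteddatetime), firstseendatetime, YEAR, 1)

That way the lookback is a single one-year period rather than 52 weekly periods, which might be cheaper for the engine to evaluate.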


Hi Duncan,

I can’t change the dataset itself. One option may be a new table in Redshift containing the daily record count, but I know that will cause problems in the future, as we will probably want to add filters so users can see certain types of records in any number of combinations.

This is what the Visual looks like. Jobs Per Day is simply a straight COUNT, while the other two bars are the calculations described above. The idea is to get a sense of how the number of Jobs is changing over time.

I’ve only been messing around with QuickSight for a week or so, so forgive my inexperience, but as far as I’m aware I can’t run any of the calculations manually. It’s not me running the calculations, it’s QuickSight, and I can’t see any way to finely control that.

However, I’ve since done a bunch of optimisation work on this… The main part is applying a Filter to the Visual to exclude the unused middle 50 weeks, along with a few other bits. The Visual now loads in about 41 seconds, which still isn’t great, but it’s a lot better. If I just run the COUNT on the same data, without shuffling each week into its own column, it completes in under 24 seconds, and there shouldn’t be much overhead in adjusting a few dates from that.

So I still have some tweaks to make, but if you know a better way of calculating this data, I’d love to hear it. And thank you for the welcome =)

Hey @lgulliver,

If this is your first week with QuickSight, I highly recommend checking out the QuickSight Workshops!

The exercises are helpful for getting more familiar with QuickSight’s tools and features.

One thing you could try to help with your load times is adding your calculated fields to the dataset in QuickSight.

Let me know if that helps in cutting down the load time!


I did try adding the calculated fields at the dataset level (which is useful for putting them in folders), but it didn’t seem to make a difference.

But I now feel like a bit of an idiot, having figured out the main reason for the slow query: the dataset is linked to a View for some calculated fields, including a daylight savings offset used in some time-based Visuals. Filtering on that calculated column means the whole dataset has to go through the calculation just to work out which rows pass the filter. Swapping the filter to the stored datetime means I need to process an additional two days, but it brings the query time to under 7 seconds, which is much more reasonable. It does still run quicker if I write my own query, but the difference is less than a second, so I’m not too upset.

Thank you for your help and the suggestion on the Workshop, I will definitely check that out!