Time based sub-sampling of large datasets (stock calculations)

I’m trying to do something similar to what is being done on demo central, where a time based interval is used to sub sample large datasets here. I’m struggling to figure out why I can’t filter out some dates based on user selected time period intervals.

On the demo central example above they actually need to hack a bit to get QS to do the sub sample where the user submits a time based interval of DD, WK, MM, YYYY. The actual date field is not being directly used as the X axis but instead a calculated field called Date String, which just converts date to a string, which then makes the calculation work. I believe this is being done because toString skips rows containing null values..

I have finally figured out how to calculate some common transformations on stock price data in QS such as:

Performance

(
firstValue(
    avg(value),
    [date DESC],
    [symbol, truncDate("DD", date)]
) - 
lastValue(
    avg(value),
    [date DESC],
    [symbol]
) 
) / lastValue(
    avg(value),
    [date DESC],
    [symbol]
)

EDIT: the issue with return is that I end up filtering out the last value when I apply any flag, therefore the below formula just returns 0 for all non DD date aggregation.
Return (excluding Daily)

(
firstValue(
    avg(value),
    [date DESC],
    [symbol, truncDate(${aggregationDate}, date)]
) - 
lastValue(
    avg(value),
    [date DESC],
    [symbol, truncDate(${aggregationDate}, date)]
) 
) / lastValue(
    avg(value),
    [date DESC],
    [symbol, truncDate(${aggregationDate}, date)]
)

These formulas both work great, however when the user selects any granularity other than DD it returns a continuous series for all dates like below and creates a less than ideal step chart.

So with that I’m trying to figure out how I can flag based on the Date Aggregation selection to filter out some unneeded dates.

The demo central dashboard does it like this:

ifelse(Date=maxOver(Date,[truncDate(${DateInterval},Date)],PRE_AGG),'True','False')

However as I mentioned above they are using this proxy called Date String as the X axis in their visual which allows this to work. I can confirm that if I do this and just want to display value and not my return or performance calculation above that this works. However I would like to display performance and return which rely on me using my date field in both the calculations and including the field in the visual.

Fails

ifelse(
    date = maxOver(date, [truncDate(${aggregationDate}, date)], PRE_AGG),
    'True',
    'False'
)

Fails

ifelse(
    date = max(date, [truncDate(${aggregationDate}, date)], PRE_AGG),
    'True',
    'False'
)

Fails

ifelse(
    date = maxOver(left(toString({date}),10), [truncDate(${aggregationDate}, date)], PRE_AGG),
    "True",
    "False"
)

I have tried an ifelse with in and notin, but I’m passing a truncDate to it and in many instances no values exists for the returned truncDate (markets could be closed on the last day of the month, week…etc). So this needs to be dynamic and based off the data, some type of max date calculation.

Any idea how I could do this?

Possible Solution
I might be able to preprocess some additional columns from my backend such Day of the week, Week number, Month as a string, Quarter as a string, and Year. Then I can dynamically concat based on user selections and I think max might work.

ifelse(
    date = max(date, [concat(Month, "-" Year)]),
    'True',
    'False'
)

Would then hopefully flag properly for MM user selection.

Hello @Sean_B !

Did you the solution you mentioned at the bottom of your post work?

1 Like

@duncan for the performance calc yes, unfortunately not for the return calc as I need both first and last value for the given time period. I need to somehow adjust the filter to keep both the first and last date for a time period granularity, so my return calculation does not return 0.

Hello @Sean_B, my apologies for the delay! Are you still struggling with this issue in QuickSight? If so, have you uncovered any new information about the problem?

1 Like

@DylanM I’m still having an issue with the return calculation. In that calc I’m comparing the first and last value so when the filter is applied it filters out all values except the last value. By doing so it returns 0 for all aggregations other than DD (day) as I’m just comparing the last value against itself.

So I need to find a way to perform a filter that waits until after the return calculation is done somehow. Maybe some type of level aware calculation.

My calc for return:

(
firstValue(
    avg(value),
    [date DESC],
    [symbol, truncDate(${aggregationDate}, date)]
) - 
lastValue(
    avg(value),
    [date DESC],
    [symbol, truncDate(${aggregationDate}, date)]
) 
) / lastValue(
    avg(value),
    [date DESC],
    [symbol, truncDate(${aggregationDate}, date)]
)

Hello @Sean_B, this is a shot in the dark because I have a feeling it will throw an error but it might work. You could try creating either a seperate calculated field for value like this:
avgOver({Value}, [], PRE_AGG)
You should be able to leave the brackets empty. Then attempt to bring that calculated field into the above calculation instead of value:

(
firstValue(
    {ValueCalc},
    [date DESC],
    [symbol, truncDate(${aggregationDate}, date)]
) - 
lastValue(
    {ValueCalc},
    [date DESC],
    [symbol, truncDate(${aggregationDate}, date)]
) 
) / lastValue(
    {ValueCalc},
    [date DESC],
    [symbol, truncDate(${aggregationDate}, date)]
)

This will probably break it, but maybe it will surprise us.

Hello @Sean_B, did my response help you resolve the issue you were facing in QuickSight. If so feel free to mark it as a solution or follow-up with some more information you gathered in testing so we can find an answer. Thank you!