Hi Team, our use case is that we have a source data which update record for each Id, thus in that source table there is only one latest record rather than historical data.
source table on 5/21/2024 midnight:
id last_update_date
111 5/20/2024 00:00:00
222 5/20/2024 00:00:00
333 5/20/2024 00:00:00
source table on 5/22/2024 midnight(111,222 got update):
id last_update_date
111 5/21/2024 00:00:00
222 5/21/2024 00:00:00
333 5/20/2024 00:00:00
source table on 5/23/2024 midnight(333 got update):
id last_update_date
111 5/21/2024 00:00:00
222 5/21/2024 00:00:00
333 5/22/2024 00:00:00
Now, since we want to showcase historical metrics, we want to leverage incremental refresh to append data daily. We creat a new column : DATE_FORMAT(current_timestamp, â%Y-%m-%d %H:%i:%sâ) AS snapshot_date,
this is the timestamp when the refresh actually happens
Expected dataset:
|Id |last_update_date|Snapshot|
|111|5/20/2024|5/21/24 0:00:00|
|222|5/20/2024|5/21/24 0:00:00|
|333|5/20/2024|5/21/24 0:00:00|
|111|5/21/2024|5/22/24 0:00:00|
|222|5/21/2024|5/22/24 0:00:00|
|333|5/20/2024|5/22/24 0:00:00|
|111|5/21/2024|5/23/24 0:00:00|
|222|5/21/2024|5/23/24 0:00:00|
|333|5/22/2024|5/23/24 0:00:00|
However if we set incremental refresh on last_update_date, look back window 1 day, refresh daily, we get this:
|Id |last_update_date|Snapshot|
|111|5/20/2024|5/21/24 0:00:00|
|222|5/20/2024|5/21/24 0:00:00|
|333|5/20/2024|5/21/24 0:00:00|
|111|5/21/2024|5/22/24 0:00:00|
|222|5/21/2024|5/22/24 0:00:00|
|333|5/22/2024|5/23/24 0:00:00|
if we set incremental refresh on above new created snapshot_date column, look back window 1 day, refresh daily, we get whole data fully replaced.
if we set incremental refresh on above new created snapshot_date column, look back window 1 hr, refresh daily, we also get whole data fully replaced, not appending(not sure why in this case previous day data all get deleted, although look back window is just 1 hour).
Can you advise how can we achieve above expected dataset?
Thank you!