Incremental refresh issue (updated records and new records)

Hi team

I have scheduled an incremental refresh for my dataset. In my data I receive daily updated records and new records with the same date, but the total of both is being appended to the existing dataset.

Only the new records should be added to the existing table.

In the image below, 77 rows are ingested; of these, 37 are new records and 40 are updated records.

The total record count should be 55626 + 37 = 55683 only.

In Redshift I am adding new records and updating existing records based on tag_id (a unique field).

In Redshift I am getting exactly 55683 records.

Is there any way to do the same here?

Thanks,
B Vinod Kumar

Hi Vinod,

What is your look-back window for SPICE incremental refresh?

Since you are doing daily updates, increasing your look-back window to a larger time frame (up to the window during which records remain open for modification), say one week, will help accomplish this.
Please check and reply.

Excerpt from the QuickSight documentation explaining the look-back window:

With incremental refreshes, less data is queried and transferred for each refresh. For example, let’s say you have a dataset with 180,000 records that contains data from January 1 to June 30. On July 1, you run an incremental refresh on the data with a look-back window of seven days. QuickSight queries the database asking for all data since June 24 (7 days ago), which is 7,000 records. QuickSight then deletes the data currently in SPICE from June 24 and after, and appends the newly queried data. The next day (July 2), QuickSight does the same thing, but queries from June 25 (7,000 records again), and then deletes from the existing dataset from the same date. Rather than having to ingest 180,000 records every day, it only has to ingest 7,000 records.
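The delete-and-append behaviour described in that excerpt can be sketched in a few lines of Python. This is a minimal illustration only, not QuickSight's actual implementation; the dataset shape, dates, and function name are invented:

```python
from datetime import date, timedelta

def incremental_refresh(spice_rows, source_rows, today, lookback_days=7):
    """Simulate a look-back-window refresh: delete everything in SPICE on or
    after the window start, then append the freshly queried source rows."""
    window_start = today - timedelta(days=lookback_days)
    kept = [r for r in spice_rows if r["date"] < window_start]         # untouched history
    requeried = [r for r in source_rows if r["date"] >= window_start]  # re-queried window
    return kept + requeried

# SPICE holds one row per day for Jan 1-10, with a stale value of 0.
spice = [{"date": date(2024, 1, d), "value": 0} for d in range(1, 11)]
# The source has corrected values (1) and one extra day, Jan 11.
source = [{"date": date(2024, 1, d), "value": 1} for d in range(1, 12)]

refreshed = incremental_refresh(spice, source, today=date(2024, 1, 11))
```

Rows older than the window keep their old values; everything inside the window is replaced wholesale. That is why an update to a record that falls outside the look-back window never lands in SPICE.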

Regards,
Anwar Ali

Hi,

Can you please try importing the complete data at once and check that the count matches Redshift? After that, enable the incremental refresh. :slight_smile:

I faced the same issue today with one of my datasets.

Regards,
Naveed Ali

Hi @Naveed
That is exactly what I have already done.

First I loaded all the data into SPICE, and only then scheduled the incremental refresh.

Now I have found the solution.

I will explain.

In Redshift I have written the query like this:

BEGIN;

-- Insert new records from qa_rims_data.public.Staging into qa_rims_data.public.Master
INSERT INTO qa_rims_data.public.Master
SELECT *
FROM qa_rims_data.public.Staging
WHERE tag_number NOT IN (SELECT tag_number FROM qa_rims_data.public.Master);

-- Update existing records in qa_rims_data.public.Master from qa_rims_data.public.Staging
UPDATE qa_rims_data.public.Master m
SET
tag_number = s.tag_number,
"date" = s."date",
quantity = s.quantity,
Disposition = s.Disposition,
Grade = s.Grade,
"Return Reason" = s."Return Reason",
Unitprice = s.Unitprice,
Recovered = s.Recovered,
Status = s.Status,
"Process Status" = s."Process Status",
Processed = s.Processed,
Generated = s.Generated,
Channel = s.Channel,
Brand = s.Brand,
own_label = s.own_label,
"Site Name" = s."Site Name",
Category_l1 = s.Category_l1,
Category_l2 = s.Category_l2,
Category_l3 = s.Category_l3,
"Source Code" = s."Source Code",
latitude = s.latitude,
longitude = s.longitude,
State = s.State,
City = s.City,
dispatch_date = s.dispatch_date,
inward_user_id = s.inward_user_id,
"Destination Code" = s."Destination Code",
grader_name = s.grader_name,
inwarding_disposition = s.inwarding_disposition,
"Actual Tat" = s."Actual Tat",
"Tat Status" = s."Tat Status",
"Expected Tat" = s."Expected Tat",
"Return Type" = s."Return Type"
FROM qa_rims_data.public.Staging s
WHERE m.tag_number = s.tag_number;

COMMIT;
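The behaviour of that insert-plus-update transaction can be checked on a toy table. The sketch below uses SQLite purely for illustration; the two-column `master`/`staging` tables are invented for the demo, and the update uses a correlated subquery because SQLite's `UPDATE ... FROM` syntax differs from Redshift's:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE master (tag_number INTEGER PRIMARY KEY, quantity INTEGER)")
cur.execute("CREATE TABLE staging (tag_number INTEGER PRIMARY KEY, quantity INTEGER)")
# Master holds two rows; staging carries one update (tag 1) and one new row (tag 3).
cur.executemany("INSERT INTO master VALUES (?, ?)", [(1, 10), (2, 20)])
cur.executemany("INSERT INTO staging VALUES (?, ?)", [(1, 99), (3, 30)])

# Step 1: insert only the staging rows whose tag_number is new to master.
cur.execute("""INSERT INTO master
               SELECT * FROM staging
               WHERE tag_number NOT IN (SELECT tag_number FROM master)""")
# Step 2: overwrite matching rows with the staging values.
cur.execute("""UPDATE master
               SET quantity = (SELECT s.quantity FROM staging s
                               WHERE s.tag_number = master.tag_number)
               WHERE tag_number IN (SELECT tag_number FROM staging)""")
conn.commit()

rows = sorted(cur.execute("SELECT tag_number, quantity FROM master").fetchall())
print(rows)  # [(1, 99), (2, 20), (3, 30)]
```

The end state is one row per tag_number, with the staging values winning on every match, which is exactly the 55626 + 37 = 55683 count behaviour described above.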

I have also written the same logic another way, like this:

BEGIN;

-- Delete records from qa_rims_data.public.Master where tag_number matches
DELETE FROM qa_rims_data.public.Master
WHERE tag_number IN (SELECT tag_number FROM qa_rims_data.public.Staging);

-- Insert all records from qa_rims_data.public.Staging into qa_rims_data.public.Master
INSERT INTO qa_rims_data.public.Master
SELECT *
FROM qa_rims_data.public.Staging;

COMMIT;
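The delete-then-insert variant can be sanity-checked the same way (again a SQLite sketch with made-up minimal tables, not the real Redshift schema); any master row that also appears in staging is dropped first, so appending all of staging produces one row per tag_number:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE master (tag_number INTEGER PRIMARY KEY, quantity INTEGER)")
cur.execute("CREATE TABLE staging (tag_number INTEGER PRIMARY KEY, quantity INTEGER)")
cur.executemany("INSERT INTO master VALUES (?, ?)", [(1, 10), (2, 20)])
cur.executemany("INSERT INTO staging VALUES (?, ?)", [(1, 99), (3, 30)])

# Delete every master row that staging also has, then append all of staging.
cur.execute("DELETE FROM master WHERE tag_number IN (SELECT tag_number FROM staging)")
cur.execute("INSERT INTO master SELECT * FROM staging")
conn.commit()

rows = sorted(cur.execute("SELECT tag_number, quantity FROM master").fetchall())
print(rows)  # [(1, 99), (2, 20), (3, 30)]
```

This variant is simpler to maintain than the column-by-column UPDATE, since it does not need to list every column.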

I think this will work for my case.