Remove old data from SPICE dataset without full refresh

My question is pretty similar to the one posted in the link below, so understanding that removing old data is not possible at this moment, but it is something faced by different people, is it a feature that can be requested to Quicksight developers? In my case, the problem is that I am extracting information from Athena, but loading only the last 7 days is taking more than 15 min, I was planning to extract only 1 or 2 days and make an incremental load, but I would like to remove records older than 30 days without affecting the existing information. If I run a full refresh, it will erase everything and load only the last 2 days.

did you consider to setup 2 schedule?
one schedule to use incremental load in weekday
another schedule to do a full refresh in non-office hour or weekend?

Hi, thanks for the reply.

Yes, but I would be removing previous information. So, trying to give a little context, I have a dataset that pulls information out of athena (with information from CloudWatch). This dataset last around 15 min to load the previous 7 days (I had to set a limit condition in the query to this period cause the load was failing due to policies set on athena/lambda for long runs if I’m not mistaken). If I make a full refresh, I would leave only the last 7 days in SPICE, what I was asking is, since we set an incremental load, Quicksight will keep increasing its dataset, I was concern if there was any way to remove information older than X amount of days from the dataset without running the full refresh just to avoid leaving this SPICE dataset growing and growing.

As an alternative solution, what we are now thinking is to instead of querying athena/cloudwatch, set a process that sends information from cloudwath to s3, so it would be easier for us to extract information out of s3 instead.

Hope it makes sense.

Hello @griverar22 and @royyung !

@griverar22 were you able to try the alternative solution that you mentioned at the end of your last post, and if so was it successful?

If that measure did not work I can mark this as a feature request for the Quicksight team.

Hello @griverar22 and @royyung !

@griverar22 have you attempted the potential solution you mentioned in your last post, and if it was successful could you share the steps you took to help the community?

It has been some time since we have heard from you but we would still like to help you find a solution. If we do not hear from you in the next 3 business days this topic will be archived.

Hi! thanks for following up this topic. Actually yes, we had to go through S3 to avoid the issue of requesting a lot of information to CloudWatch, what we did was just to send the raw data from CloudWatch into S3, and Quicksight retrieves this data from S3 instead.

However, the main concern of Quicksight not having a way to restrict or delete information older than X amount of days is still a topic that I feel would be beneficial for everyone. At this moment, you can only do this through a full-refresh, but assuming that full-refresh pulls the last 12 months of data, and you set another incremental-run to refresh the last day, after another 12 months you will have 2 years of data in SPICE, but if you want to manage the capacity of SPICE for your account and keep only the necessary information on it without consuming and paying that extra space, there is no way to have a dataset of 18 months for example, if you run the full-refresh here, it will remove everything and leave you with 12 months (and you are loading information that its already there because the incremental-run put it there). You can even put the query to extract only previous day, which makes the query faster and consume less resources of the database and let the incremental-run save the data into the SPICE dataset, now once you have a big dataset of years of data, remove old data to maintain the same amount of SPICE in your account.

Not sure if I could be wrong and there is a way to maintain this or load this data into SPICE in another way, in such case I would love to hear from your expertise.

I appreciate your time and the support you all give to this community.

Regards.

1 Like

Hey @griverar22 !

Thank you for your detailed response and breaking down how you were able to work around this issue. I have run into my own issues with incremental refresh in the past so I will definitely consider your workaround in the future.

For now I will mark this as a feature request for the Quicksight team to take a look at.

Thank you again!