I have a SPICE dataset in QuickSight with Athena as the data source, and I have configured incremental refresh on it. The refresh previously worked without any issues, but I recently noticed that some entries are missing.
I checked the Athena query execution result (taken from the query history, based on the timestamp of the SPICE refresh run). It returned 'x' records and contains the entries I need. The same number ('x') of records was ingested into SPICE during the incremental refresh run, but I cannot see the required entries when I check in a visual.
What could be the reason for this? Any ideas?
Hi
Have you configured both Full refresh and incremental refresh or just incremental refresh on your dataset?
Hi @Shahid_Muhammad,
I have configured only incremental refresh.
Hi
Please configure a full refresh either daily or weekly, based on your requirements. The missing data may be due to the fact that the window set for incremental refresh is too short.
Hi @Shahid_Muhammad ,
The fact is that we never had this issue earlier. We also cannot configure a full refresh on this dataset: we need at least the past 30 days of data in SPICE, and a query that fetches 30 days of data fails with QueryTimeOutException because the data volume is huge and the query runs longer than 20 minutes. For that reason, we designed the query to fetch only the past 2 hours of data and configured incremental refresh so that the data keeps accumulating.
Hi @Nikhitha
If that is the case, you could try increasing the incremental refresh window from 2 hours to, say, 6 hours. In that case, though, you need to check your data thoroughly for duplicates as well.
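As a rough sketch of such a duplicate check, assuming the rows can be exported (for example from an Athena query) and loaded as dictionaries. The field name `event_id` and the sample rows are hypothetical, not taken from the actual dataset:

```python
from collections import Counter


def find_duplicate_keys(rows, key_fields):
    """Return {key_tuple: count} for every key that appears more than once."""
    counts = Counter(tuple(row[field] for field in key_fields) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical sample: the same event appears twice, as could happen if
# overlapping windows reloaded it.
rows = [
    {"event_id": "a1", "event_time": "2024-01-01T10:05:00"},
    {"event_id": "a2", "event_time": "2024-01-01T10:40:00"},
    {"event_id": "a1", "event_time": "2024-01-01T10:05:00"},
]

duplicates = find_duplicate_keys(rows, ["event_id"])
print(duplicates)
```

An empty result means no key is repeated for the chosen fields; pick the fields that should uniquely identify a record in your data.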
@Shahid_Muhammad,
The number of records ingested is correct, but records are missing when I view the data in a table visual. I would also like to know why this happened.
It could also be row-level security (RLS), if that has been implemented. Otherwise, validate what was loaded by the 2-hour incremental refresh and check whether it exists in the source by querying Athena.
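A minimal sketch of that validation, assuming both the Athena result and the rows shown in the visual can be exported and loaded as dictionaries. The `event_id` field and the sample values are hypothetical. It also illustrates why comparing row counts alone is not enough: two sets of rows can have the same count and still differ in content.

```python
def missing_keys(source_rows, spice_rows, key_field):
    """Return the sorted keys present in the source but absent from SPICE."""
    source_keys = {row[key_field] for row in source_rows}
    spice_keys = {row[key_field] for row in spice_rows}
    return sorted(source_keys - spice_keys)


# Hypothetical exports: both contain three rows, yet "a3" never reached SPICE.
athena_rows = [{"event_id": "a1"}, {"event_id": "a2"}, {"event_id": "a3"}]
spice_rows = [{"event_id": "a1"}, {"event_id": "a2"}, {"event_id": "a2"}]

print(len(athena_rows) == len(spice_rows))
print(missing_keys(athena_rows, spice_rows, "event_id"))
```

Comparing by key rather than by count narrows the problem down to the specific records that are missing.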
Hi @Koushik_Muthanna,
I haven't implemented RLS. The number of rows ingested during the incremental refresh is the same as the number returned when we checked in Athena, and the Athena query result contains the missing records.
Is there any dataset-level filter whose condition is no longer satisfied? Or any sheet filter that may be affecting the visual?
You can try to identify the date from which this started occurring and then check the values against the applied filters, if any are implemented. Since only a few entries are missing rather than all of them, this can be a starting point for establishing which ones are missing.
Hi @Nikhitha,
It's been a while since we last heard from you. Did you have any additional questions regarding your initial post?
If we do not hear back within the next 3 business days, I’ll go ahead and close out this topic.
Thank you!
Hi @prantika_sinha,
There is no dataset-level filter applied. I created a plain visual and tried to view the rows without any filters on it, but I cannot see the missing records there. Yes, only a few records are missing, and we have observed that this happens intermittently. I also noticed it happening for a different incremental refresh that ran recently.
In the incremental refresh status, do you see any skipped row count?
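If it is easier than reading the console, the same counts can be pulled from the QuickSight `DescribeIngestion` API. The sketch below assumes boto3 is installed and credentials are configured. The helper names and the sample response values are hypothetical, and the response is trimmed to the `RowInfo` part only:

```python
def summarize_row_info(response):
    """Extract the row counts from a DescribeIngestion-style response dict."""
    row_info = response.get("Ingestion", {}).get("RowInfo", {})
    return {
        "ingested": row_info.get("RowsIngested", 0),
        "dropped": row_info.get("RowsDropped", 0),
        "total_in_dataset": row_info.get("TotalRowsInDataset", 0),
    }


def fetch_ingestion(account_id, data_set_id, ingestion_id):
    """Call QuickSight for one ingestion run (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the summary helper works without it

    client = boto3.client("quicksight")
    return client.describe_ingestion(
        AwsAccountId=account_id,
        DataSetId=data_set_id,
        IngestionId=ingestion_id,
    )


# Made-up response, trimmed to the fields used above.
sample_response = {
    "Ingestion": {
        "IngestionStatus": "COMPLETED",
        "RowInfo": {"RowsIngested": 120, "RowsDropped": 3, "TotalRowsInDataset": 50000},
    }
}

print(summarize_row_info(sample_response))
```

A non-zero dropped count for the affected refresh run would point to rows that were skipped during ingestion.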
Hi @Nikhitha
Do the ingested rows match the data from the main source or not?
My assumption is that when the records appear in the QuickSight query editor, it is showing sample records based on the query as written. However, your refresh window decides which records are ingested, so such records would not be captured in the skipped rows either.
From the skipped records, can you verify the values in the date field that is used to configure the refresh window? Do they fall within the 2-hour window?
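One way to run that check is sketched below, assuming the affected rows are exported with their date field parsed into `datetime` values. The field name `event_time`, the sample rows, and the inclusive window boundaries are assumptions for illustration and may not match exactly how QuickSight evaluates the window:

```python
from datetime import datetime, timedelta


def rows_outside_window(rows, time_field, refresh_time, window):
    """Return rows whose timestamp is not within [refresh_time - window, refresh_time]."""
    window_start = refresh_time - window
    return [
        row for row in rows
        if not (window_start <= row[time_field] <= refresh_time)
    ]


# Hypothetical refresh at 12:00 with a 2-hour window.
refresh_time = datetime(2024, 1, 1, 12, 0)
rows = [
    {"event_id": "a1", "event_time": datetime(2024, 1, 1, 11, 30)},
    {"event_id": "a2", "event_time": datetime(2024, 1, 1, 9, 45)},  # older than the window
]

late_rows = rows_outside_window(rows, "event_time", refresh_time, timedelta(hours=2))
print([row["event_id"] for row in late_rows])
```

Rows returned here carry a timestamp older than the window at refresh time, which would be consistent with late-arriving data.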