Since 22/10, the full refresh has not been working reliably.
The dataset is refreshed from Athena (same AWS account). On 21/10 the scheduled refresh ingested 31,080,386 rows, but on 22/10 the scheduled refresh ingested only 1,667,679 rows. When I refreshed the data manually, it worked fine and all of the rows were loaded.
A new scheduled refresh that I created (about two and a half hours after the previous one) had no problems, and all of the data was loaded. But then, after a manual refresh, the data was partial again.
The Glue crawlers are scheduled to run 2 hours before the refresh, so the data should be complete by the time the refresh starts.
QS shows no errors: the dataset refresh completes with no errors and no skipped rows.
Please help solve this instability. I opened a case in the support portal (Case ID 11091723501) but have not received an answer yet. This affects data integrity in QS.
Are you doing an incremental refresh?
Can you share a screenshot of your refresh history?
Lastly, do you have a WHERE clause that might be affecting the number of rows returned?
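
If a screenshot is awkward to share, the refresh history can also be pulled as text. Below is a minimal sketch, assuming boto3 is installed and your credentials are allowed to call the QuickSight `ListIngestions` API. The helper names `fetch_ingestions` and `summarize` and the `drop_ratio` threshold are made up for illustration, and the account ID and dataset ID are placeholders you would need to fill in. It lists each refresh with its source (manual or scheduled), type (full or incremental) and row count, and flags any refresh that ingested far fewer rows than the largest refresh seen before it.

```python
from typing import Any


def fetch_ingestions(account_id: str, dataset_id: str) -> list[dict[str, Any]]:
    """Return all SPICE ingestions for a dataset (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so summarize() can be used without boto3

    client = boto3.client("quicksight")
    ingestions: list[dict[str, Any]] = []
    token = None
    while True:
        kwargs = {"AwsAccountId": account_id, "DataSetId": dataset_id}
        if token:
            kwargs["NextToken"] = token
        page = client.list_ingestions(**kwargs)
        ingestions.extend(page.get("Ingestions", []))
        token = page.get("NextToken")
        if not token:
            return ingestions


def summarize(ingestions: list[dict[str, Any]], drop_ratio: float = 0.5) -> list[dict[str, Any]]:
    """Order refreshes by creation time and flag sharp drops in ingested rows.

    A refresh is marked suspicious when it ingested fewer than
    drop_ratio * (largest row count seen in any earlier refresh).
    The 0.5 default is an arbitrary illustrative threshold.
    """
    summary = []
    peak = None  # largest RowsIngested seen so far
    for ing in sorted(ingestions, key=lambda i: i["CreatedTime"]):
        ingested = ing.get("RowInfo", {}).get("RowsIngested")
        suspicious = (
            peak is not None
            and ingested is not None
            and ingested < peak * drop_ratio
        )
        summary.append(
            {
                "id": ing.get("IngestionId"),
                "created": ing["CreatedTime"],
                "status": ing.get("IngestionStatus"),
                "source": ing.get("RequestSource"),  # MANUAL or SCHEDULED
                "type": ing.get("RequestType"),      # e.g. FULL_REFRESH
                "rows_ingested": ingested,
                "suspicious": suspicious,
            }
        )
        if ingested is not None:
            peak = ingested if peak is None else max(peak, ingested)
    return summary


# Example call (placeholders, not real IDs):
# for row in summarize(fetch_ingestions("<account-id>", "<dataset-id>")):
#     print(row)
```

Pasting that output here would answer the incremental-refresh question too, since the `type` column shows whether each run was a full or an incremental refresh, and the `source` column shows whether the partial loads line up with manual or scheduled runs.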