Joined data sources skip rows

stuart · August 4, 2022, 2:17pm

I have been using Quick Sight successfully where I have datasets that are a single source. I am getting issues when I join multiple sources into one dataset.

I have a json file in S3 which is updated hourly. This is my call data. I have a dynamodb-athena table which is my customer data, which is also updated hourly.

I have these sources as individual datasets, and I have another dataset where I join the two sources together.

The individual datasets are reliable. Despite the odd occassion when the hourly refresh fails, the datasets work fine. I have an issue with the joined dataset, where a refresh unexpectedly ‘skips’ data. This is not considered a failure and I am not alerted. I only find out there is problem when my dashboard produces incorrect results, and I check the datasets.

The only information provided on the cause of the issue is a note in the dataset which says “Some of the rows scanned from the source tables were skipped”.

The really strange thing is the dataset never recovers the ‘lost’ data. Full refreshes continue to produce the problem. However, if I recreate the dataset from scratch, there is no problem, and all of the data is present.

For this reason, it appears to me like some sort of permanently corrupted SPICE cache. Or some sort of cache that is preventing the data from appearing during the next full refresh.

I have now experienced this issue 3 times, and each time I have had to recreate the dataset to fix the problem.

Has anyone experienced anything like this? I have been looking into whether I can force a deletion or clear down of the SPICE cache, but have not found a way to do this. It wouldn’t solve the problem anyway but would at least indicate if the cache is causing the error.

I have also been wondering if the joined dataset has any relation to the individual datasets. For example, could a RACE condition occur, where one of the underlying datasets is still refreshing when the joined dataset is trying to do so?

Feng_Lin · August 30, 2022, 4:32pm

Hi Stuart,

Could you email the relevant information (a link to your post, screenshots, the affected dataset URL) to quicksight-help@amazon.com? We will troubleshoot your issue over email.

Thank you.

Liam

WLS-P · October 14, 2022, 6:54pm

Hi, @stuart. Did @Feng_Lin’s solution work for you? I am marking their reply as, “Solution,” but let us know if this is not resolved. If this has been resolved, please add a post to this thread explaining the solution to help out other Community members that may have a similar issue.

Thanks for posting your questions on the Quick Sight Community Q&A Forum!

Topic		Replies	Views
Dataset used in another dataset - Keys not recognized Q&A spice , error , quick-sight , data-preparation	8	1376	October 24, 2022
Configure refersh interval for joined dataset Q&A data-source , database , quick-sight , Business-Intelligence-Engineer	3	51	August 6, 2024
Add Dataset left join, undefined exception Q&A admin , error , quick-sight	9	127	February 27, 2025
Quick Sight ran out of join memory while processing your request Q&A data-source , quick-sight , dataset , Business-Intelligence-Engineer	8	139	August 19, 2024
SPICE DataSet created from a left join is not ingesting any row Q&A quick-sight , dataset , Business-Intelligence-Engineer	5	190	June 1, 2024

Joined data sources skip rows

Related topics