I’ve recently been testing ingestions and found some inconsistent behavior. Below are the steps I took, followed by my thoughts.
Start a SPICE dataset ingestion (from the UI or CLI/API)
Start another ingestion (from the UI or CLI/API); it will be queued
Cancel the 1st ingestion
Start a 3rd ingestion
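The steps above can be sketched with the SDK roughly as follows. This is only a minimal sketch: it assumes `client` is a boto3 QuickSight client (for example `boto3.client("quicksight")`), the function names and the `repro-` ID prefix are made up for illustration, and whether the second ingestion actually ends up queued depends on the first one still running when it is submitted.

```python
import uuid


def reproduce_stuck_queue(client, account_id, dataset_id):
    """Start two ingestions, cancel the first, then start a third.

    `client` is assumed to behave like a boto3 QuickSight client.
    Returns the three ingestion IDs in the order they were submitted.
    """
    # Ingestion IDs are chosen by the caller; these are hypothetical names.
    ids = [f"repro-{uuid.uuid4()}" for _ in range(3)]

    def start(ingestion_id):
        client.create_ingestion(
            AwsAccountId=account_id,
            DataSetId=dataset_id,
            IngestionId=ingestion_id,
        )

    start(ids[0])  # 1st ingestion starts
    start(ids[1])  # 2nd ingestion is expected to be queued behind it
    client.cancel_ingestion(
        AwsAccountId=account_id,
        DataSetId=dataset_id,
        IngestionId=ids[0],
    )  # cancel the 1st ingestion
    start(ids[2])  # 3rd ingestion
    return ids


def ingestion_statuses(client, account_id, dataset_id, ingestion_ids):
    """Return {ingestion_id: status} for the given ingestions."""
    statuses = {}
    for ingestion_id in ingestion_ids:
        response = client.describe_ingestion(
            AwsAccountId=account_id,
            DataSetId=dataset_id,
            IngestionId=ingestion_id,
        )
        statuses[ingestion_id] = response["Ingestion"]["IngestionStatus"]
    return statuses
```

Calling `ingestion_statuses` a few minutes after `reproduce_stuck_queue` should show whether the second ingestion is still QUEUED.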
Result:
The 1st ingestion is in status FAILED with error ‘INGESTION_CANCELED’
The 2nd ingestion is QUEUED and can be neither canceled nor forced to run; it is stuck in the queue forever
The 3rd ingestion is COMPLETED
What I think should happen:
The 1st ingestion should be in status CANCELED (that status presumably exists for a reason), not FAILED
The 2nd ingestion should run after the first one finishes, regardless of the first one’s final status
I should be able to cancel the 2nd ingestion, which should set its status to CANCELED
I’m also missing a single definitive way to determine the current status of a dataset from the CLI/SDK: whether it’s refreshing, failed, and so on. Should I always take the latest ingestion?
That is definitely some interesting behavior. Since I believe this could be a bug, I will leave this post marked as an error so that someone from the AWS Team may address it.
As for your other question, I would generally take the latest ingestion, since it is the most recent one. In the meantime, I would recommend creating a support ticket with AWS Support, as they may be able to help with this issue further.
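As a rough illustration of the "take the latest ingestion" approach, the sketch below picks the most recently created ingestion from the `list_ingestions` results and reports its status. It assumes `client` is a boto3 QuickSight client; the helper names and the `NEVER_INGESTED` label are made up for this example, and treating FAILED with error type `INGESTION_CANCELED` as a cancellation is an assumption based on the behavior described in this thread, not documented behavior. Note that the API reference spells the status `CANCELLED`, as far as I know.

```python
def list_all_ingestions(client, account_id, dataset_id):
    """Collect every ingestion for a dataset, following pagination."""
    ingestions = []
    token = None
    while True:
        kwargs = {"AwsAccountId": account_id, "DataSetId": dataset_id}
        if token:
            kwargs["NextToken"] = token
        page = client.list_ingestions(**kwargs)
        ingestions.extend(page.get("Ingestions", []))
        token = page.get("NextToken")
        if not token:
            return ingestions


def latest_ingestion(ingestions):
    """Return the ingestion with the newest CreatedTime, or None if empty."""
    if not ingestions:
        return None
    return max(ingestions, key=lambda item: item["CreatedTime"])


def dataset_refresh_state(ingestions):
    """Summarize a dataset's state from its latest ingestion (a sketch)."""
    latest = latest_ingestion(ingestions)
    if latest is None:
        return "NEVER_INGESTED"  # hypothetical label, not an API value
    status = latest["IngestionStatus"]
    error_type = latest.get("ErrorInfo", {}).get("Type")
    # Assumption from this thread: a canceled ingestion is reported as
    # FAILED with error type INGESTION_CANCELED.
    if status == "FAILED" and error_type == "INGESTION_CANCELED":
        return "CANCELLED"
    return status
```

With this approach, an older ingestion that is stuck in QUEUED does not affect the result as long as a newer ingestion exists, which matches the scenario above where the 3rd ingestion COMPLETED.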
Since we have not heard back from you, I’ll go ahead and close/archive this topic. However, if you have any additional questions, feel free to create a new topic in the community and link this discussion for relevant information.