Dataset scheduled refresh fails due to schema change

I have the following problem. I have a number of datasets that I share with our customers, and they are regularly enriched with new fields. To keep the data up to date, an ETL job runs and a scheduled refresh is configured for every dataset. My problem is that every time I add a new field to a dataset, the scheduled refresh fails and I have to manually edit the dataset and click “Save and Publish”. I suppose this happens because the schema has changed and QuickSight cannot pick up the change on its own. As a result, the datasets fail for some time and customers do not see data. How can I handle this? Is there a way to modify a dataset’s schema beforehand and have the new schema applied only during the scheduled refresh? My end goal is for the datasets to always be up to date and refreshed with the appropriate field names and data types, without any failures.

Hey, I think that if the new field is the last column you would not get this error. Not sure if you can add it at the end. It’s just an idea to see how it goes.

Should be the same principle as a CSV or Excel upload. If you add the new column between existing columns it fails, but if you add it at the end it gets picked up.

Hi @Fotis_flex,

Hope this message finds you well! Was the suggestion from carlsman helpful, or were you able to find a workaround in general? If not, please feel free to let us know if you are facing any persistent issues with the dataset. If we do not hear back in the next 3 business days, I’ll close out this topic.

Thank you!

Hello @WLS-Luis
No, this comment doesn’t really help here. I am trying to find a way to update the dataset schema to “prepare” it for the new fields without breaking its existing availability. It’s probably very tricky, but I think it’s important if you have QS datasets that are refreshed on a regular schedule while being enriched with additional fields from ETL pipelines.

Hi @Fotis_flex,

Not sure if you already have a similar process running with your ETL pipeline, but have you explored the process highlighted in this documentation? I wonder if this could potentially assist with updating the dataset schema.

Hi @Fotis_flex,

Just checking back in to see if the documentation above was helpful to you at all, or if you were able to find a solution in the meantime. If not, please feel free to reach out again in this thread and we can see if there are any other potential workarounds.

Thank you!

Hi @WLS-Luis, I am afraid this documentation is of no help, as it describes a way to build the ETL pipeline, which I have already done in my own way and which works fine. My problem is not that; what I need is a way to make QS “know” automatically that a data schema is about to change and adjust to it.

Hi @Fotis_flex,

Thanks for clarifying on the issue. I’m not sure if you have already tried this, but I would recommend following this Quick Sight API documentation to see if it works for your ETL process. If you are still facing persistent issues, I would reach out to AWS Support by creating a support case, as they may be able to assist with your question further. Hope this helps!
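For what it’s worth, here is a rough sketch of what the API route could look like from the ETL side, using the boto3 QuickSight client calls `describe_data_set` and `update_data_set`. The helper names `with_new_column` and `publish_schema_change`, the account ID, the dataset ID and the column name are all made up for illustration. It assumes the new field only needs to be appended to the physical table’s input columns; I have not verified that this is enough to stop the scheduled refresh from failing in your setup, and logical-table steps such as a column projection may need the new field added too.

```python
import copy


def with_new_column(physical_table_map, column_name, column_type="STRING"):
    """Return a copy of a PhysicalTableMap with one column appended.

    Simplification: the column is appended to every physical table in the map.
    For a CustomSql table the SQL query itself must also select the new column.
    """
    updated = copy.deepcopy(physical_table_map)
    for table in updated.values():
        # Each physical table holds one of RelationalTable, CustomSql or S3Source.
        for kind, source in table.items():
            key = "Columns" if kind == "CustomSql" else "InputColumns"
            columns = source.setdefault(key, [])
            # Skip the append if the column is already declared.
            if all(col["Name"] != column_name for col in columns):
                columns.append({"Name": column_name, "Type": column_type})
    return updated


def publish_schema_change(client, account_id, dataset_id, column_name, column_type="STRING"):
    """Read the current dataset definition, append the column, and push it back."""
    current = client.describe_data_set(
        AwsAccountId=account_id, DataSetId=dataset_id
    )["DataSet"]

    request = {
        "AwsAccountId": account_id,
        "DataSetId": dataset_id,
        "Name": current["Name"],
        "ImportMode": current["ImportMode"],
        "PhysicalTableMap": with_new_column(
            current["PhysicalTableMap"], column_name, column_type
        ),
    }
    # Carry over the logical tables unchanged. Other optional settings
    # (column groups, field folders, row-level permissions, ...) may need
    # to be carried over in the same way.
    if "LogicalTableMap" in current:
        request["LogicalTableMap"] = current["LogicalTableMap"]

    client.update_data_set(**request)
    return request


# Example call from the ETL job (needs boto3 and AWS credentials):
# import boto3
# publish_schema_change(boto3.client("quicksight"), "111122223333",
#                       "my-dataset-id", "new_field", "STRING")
```

The idea would be to run this as the last step of the ETL job, right after the new field lands in the source, so the dataset definition already matches the source before the next scheduled refresh.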

Hi @Fotis_flex,

As a simplified process for this seems like it would be very beneficial, I’ll mark this as a feature request to promote visibility to the support team.