Data Import Error: SQL Schema Mismatch

TRube · April 8, 2024, 4:22pm

I am updating a quicksight dataset to change a column from an integer to a string (requester want Left Padded 0’s on the field).

The error I get is this:
Ingestion Id

223999cc-0d11-4556-b6e5-1a81ffe64cca

Error type:

SQL_SCHEMA_MISMATCH_ERROR Learn more

The data source schema doesn’t match the QuickSight dataset.

Error details:

The row input schema does not expect a string. Incorrect Column Type at Index 12 for Column BRANCH_NUMBER with DataprepLogicalType Int

A couple of details that might or might not help:

The datasource is a straight sql query import from snowflake, there are no calculated fields within the dataset itself. All the transformation is done within a view in snowflake, and the view there works fine (i.e. code is just “select * from view;”)

-The documentation says that I should be able to change something in the Edit Dataset module, but when I go to manually change the data type of that field, changing it to a string is not an option that presents itself.

There is no incremental refresh. It refreshes fully every time.
I created a brand new dataset with the same query and applied RLSS, and that works. I don’t want to do that as a solution, there are a few dozen dashboards that rely on this dataset an and I don’t want to have to change every one (only 2 use this particular field).
The field is not used in our RLSS scheme.
The field is used in filters in some dashboards, but I would expect changing the dataset to break the field in the analysis/dashboard, not break the dataset itself.
The dataset is SPICE

Does anyone know what I need to do to get quicksight to “forget” that the field used to be an integer?

edit: We didn’t specify an explicit cast for the field in SF, but SF shows it as a VARCHAR.

TRube · April 8, 2024, 4:44pm

I believe I found a solution, I’ll post it here for other people that run into this because it’s pretty unintuitive.

Here’s the steps:

Go to the Dataset
Click Edit Dataset
From within the Edit Dataset screen, click the three dots next to the field that has the error, (in this case Branch_Number) and click Exclude Field
Save and Publish the dataset, and let it load.
Then, after that is complete, go back to the Edit Dataset screen, go to Excluded Fields, and click the three dots next to the field you manually excluded and click Include Field.
Save and Publish and it should now work.

I’m assuming that overwriting the dataset with a dataset that excludes the problem field resets the structure of the dataset so that field doesn’t have any history in terms of what format QS is expecting. So when you reinclude it, it will include it with the new data type.