I overwrote an existing CSV file in an S3 bucket, then refreshed SPICE, but it doesn't pick up every record in the CSV. Why?

I had a CSV file in S3 and loaded it into QuickSight using a manifest file. The file contained 3M rows.
Then the ETL uploaded a new file with the same name and the same path, so only the file size grew.
The CSV file now contains 4M rows.

When I refreshed SPICE from the dataset page (full refresh, manual), I noticed that the record count grew a little but came nowhere near 4M. It’s more like 3.2M rows.

No matter how many times I refresh, the number does not change.

I don’t know why a full refresh of SPICE is not working as expected.
Again, the file was overwritten and nothing else was changed. Is there any other step I should take for the new rows to be reflected?

Hi @tbdori - Is the SPICE refresh status Completed? Please check whether SPICE skipped any records, and whether you have sufficient SPICE capacity available.

In parallel, please take the ingestion ID of the SPICE refresh and raise a ticket with the AWS Customer Support team so that they can analyze the logs from the back end. To raise a ticket with AWS, please follow the link - Creating support cases and case management - AWS Support.
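For anyone who wants to script the checks above, here is a minimal sketch assuming boto3 is installed and credentials are configured. The helper names `summarize_ingestion` and `latest_ingestion` are made up for illustration; the account ID and dataset ID are values you would supply yourself. It reads the most recent ingestion record for a dataset and pulls out the ingestion ID, status, and row counts.

```python
def summarize_ingestion(ingestion):
    """Pull the fields relevant to a short row count out of one ingestion record."""
    row_info = ingestion.get("RowInfo", {})
    return {
        "ingestion_id": ingestion.get("IngestionId"),
        "status": ingestion.get("IngestionStatus"),
        "rows_ingested": row_info.get("RowsIngested"),
        "rows_dropped": row_info.get("RowsDropped"),
        "error": ingestion.get("ErrorInfo", {}).get("Message"),
    }


def latest_ingestion(account_id, dataset_id):
    """Fetch ingestion records for a dataset and summarize the newest one."""
    import boto3  # imported lazily so summarize_ingestion works without AWS access

    client = boto3.client("quicksight")
    response = client.list_ingestions(AwsAccountId=account_id, DataSetId=dataset_id)
    ingestions = response.get("Ingestions", [])
    if not ingestions:
        return None
    # Sort explicitly rather than relying on the order of the response.
    newest = max(ingestions, key=lambda item: item["CreatedTime"])
    return summarize_ingestion(newest)
```

A non-zero `rows_dropped` on a refresh whose status is COMPLETED would point at rows being skipped during parsing rather than at a stale file.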

Regards - Sanjeeb

I found the issue. My manifest file is defined as below.

"globalUploadSettings": {
    "format": "CSV",
    "delimiter": "|",
    "textqualifier": "\"",
    "containsHeader": "true"
}

(I know it’s not really CSV, but that is how I use it.)
Several records in the source file contain something like this:

…|"TPSMA30-4HE3_B/I"|"352"|…

In the DB table, this column's value is something like TPSMA30-4HE3_B/I\ (note the trailing backslash). The ETL wraps every column in double quotes and then delimits the columns with a pipe. That trailing backslash ends up right before the closing quote and escapes it, so QuickSight thinks the column never ended.
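To illustrate the effect, here is a small sketch using Python's `csv` module as a stand-in parser. This is not QuickSight's parser, and the three sample records are made up; the point is only to show how treating the backslash as an escape character changes the way the quoted field is read.

```python
import csv
import io

# Three pipe-delimited records with every field wrapped in double quotes.
# The second record's value ends with a backslash (sample data for illustration).
raw = (
    '"1"|"ABC-1"|"100"\n'
    '"2"|"TPSMA30-4HE3_B/I\\"|"352"\n'
    '"3"|"XYZ-9"|"7"\n'
)


def parse(text, escapechar):
    """Parse pipe-delimited, double-quoted text with an optional escape character."""
    reader = csv.reader(
        io.StringIO(text), delimiter="|", quotechar='"', escapechar=escapechar
    )
    return list(reader)


plain = parse(raw, None)     # backslash is an ordinary character
escaped = parse(raw, "\\")   # backslash escapes the next character

print([len(row) for row in plain])    # field counts per record
print([len(row) for row in escaped])  # the second record loses a field
print(escaped[1])
```

With no escape character, all three records have three fields. With the backslash as escape character, the closing quote of the second column is swallowed and the next column is absorbed into it. Python's parser happens to recover at the end of the line; a different parser might skip the record or keep reading into the following rows.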

I think there are two ways to fix this.

  1. Fix it on the source file side.
  2. Change the manifest file so QuickSight can parse the file correctly.
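For option 1, one possible approach is to double every backslash in the ETL before wrapping the fields in quotes. The sketch below uses a hypothetical helper `format_record` and checks the round trip with Python's `csv` module. Whether QuickSight reads a doubled backslash as a single literal backslash is an assumption that should be tested on a small file first.

```python
import csv
import io


def format_record(fields):
    """Double backslashes, wrap each field in double quotes, join with pipes.

    Embedded double quotes are not handled here; this only addresses backslashes.
    """
    quoted = ['"' + field.replace("\\", "\\\\") + '"' for field in fields]
    return "|".join(quoted)


record = ["2", "TPSMA30-4HE3_B/I\\", "352"]
line = format_record(record)

# Read the line back with a parser that treats backslash as the escape character.
reader = csv.reader(
    io.StringIO(line + "\n"), delimiter="|", quotechar='"', escapechar="\\"
)
parsed = next(reader)

print(line)
print(parsed == record)
```

If doubling does not work on the QuickSight side, stripping or replacing the trailing backslash in the ETL would be the fallback, at the cost of changing the stored value.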

My question is about option 2: can the manifest handle a two-character delimiter, like %@, which I presume will never appear in the data? That way, I wouldn't need to use quotes as a wrapper.

@Max @Thomas @thomask @Bhasi_Mehta @apjvinod @Kristin @Tatyana_Yakushev @David_Wong @Biswajit_1993

Hi @tbdori

No, I don’t believe it can handle a two-character delimiter.

I can mark it as a feature request, although I would suggest altering the source data as much as possible so that it only needs a single-character delimiter.

I’ll mark it as a feature request. Thanks!