Display NULL for empty strings

Hey all,

I am using the query below to unload a Redshift table to S3 and then ingest it into QS. Although the table in RS contains NULLs in fields of VARCHAR data type, QS imports those records as "empty" instead of NULL. How can I ingest the S3 .csv into QS so that it has NULLs instead of empty strings?
I know there is a parameter for the COPY command, called EMPTYASNULL, but I don't see an equivalent for the UNLOAD command. Any ideas?

unload_query = f"""
    UNLOAD ('SELECT * FROM public.{REPORT_NAME}')
    TO '{report_dir}'
    IAM_ROLE '{IAM_ROLE}'
    HEADER PARALLEL OFF ALLOWOVERWRITE DELIMITER ',' ADDQUOTES ESCAPE MANIFEST VERBOSE;
"""
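For reference, here is a minimal, self-contained sketch of how the UNLOAD statement is assembled (the helper name and all parameter values below are placeholders for illustration; actually executing the statement would require a live Redshift connection):

```python
# Build the UNLOAD statement for a given report.
# All argument values are hypothetical placeholders.
def build_unload_query(report_name, report_dir, iam_role):
    return f"""
    UNLOAD ('SELECT * FROM public.{report_name}')
    TO '{report_dir}'
    IAM_ROLE '{iam_role}'
    HEADER PARALLEL OFF ALLOWOVERWRITE DELIMITER ',' ADDQUOTES ESCAPE MANIFEST VERBOSE;
"""

query = build_unload_query(
    "my_report",                                        # placeholder table name
    "s3://my-bucket/reports/",                          # placeholder S3 prefix
    "arn:aws:iam::123456789012:role/redshift-unload",   # placeholder role ARN
)
```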

Hi @Fotis_flex - One possible solution is on the QuickSight side: change the empty value to NULL there. Please see the link below - How to replace empty fields

Regards - Sanjeeb


@Sanjeeb2022 Hmmm, OK, I had this in mind, but that requires creating calculated fields. I have a dataset with more than 100 fields, and many of those are strings with "empty" instead of NULL. Having to create so many calculated fields would complicate the dataset considerably. Exposing a field as a duplicate just so it displays "empty" as NULL is not efficient. Isn't there any other way to make QS interpret "empty" as NULL? I hope you understand my concern.

You can use Athena as a data source and write SQL to convert empty strings to NULL, as in the example below, which uses a CASE expression:
select
  case col1
    when '' then NULL
    else col1
  end as "ID"
from tableA
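If custom SQL is not an option, the same empty-to-NULL mapping could also be sketched as a small post-processing step on the unloaded .csv before QS ingests it. This is only an illustration of the CASE logic above in Python; the sample data is hypothetical, and which token (if any) QS will actually treat as NULL is a separate question:

```python
import csv
import io

def empty_to_null(rows, null_token="NULL"):
    """Replace empty string fields with a null token, mirroring
    the CASE ... WHEN '' THEN NULL ... END logic above."""
    return [[null_token if field == "" else field for field in row]
            for row in rows]

# Hypothetical in-memory CSV content standing in for the unloaded file:
raw = "id,name\n1,\n2,Alice\n"
rows = list(csv.reader(io.StringIO(raw)))
cleaned = empty_to_null(rows)
# The empty 'name' field on the first data row becomes the null token.
```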


Hi @Fotis_flex - I agree with you, it's not a good idea to create hundreds of calculated fields. Is it possible to use custom SQL and explicitly convert empty strings to NULL (as a workaround)?

In parallel, we will highlight this to the QuickSight team so that it can be added to QS as an enhancement in the coming days.

Hi @Koushik_Muthanna @royyung - Can you please help with this?

Regards - Sanjeeb

Hi @Sanjeeb2022. All the problems I face with ingesting data into QS stem from the fact that the Redshift tables are unloaded to S3 as .csv with the appropriate data types (e.g. NULLs instead of empty strings), but QS cannot interpret them correctly and converts NULLs to empty strings. I have similar issues with Date data types. For example, I have a Redshift table with a timestamp field whose distinct values are either NULL or "1901-31-12 00:00:00", and QS cannot ingest this field as Date, only as string. I am looking at the documentation and cannot find solutions to all of my problems, but many thanks for the support. @Sanjeeb2022, I also have some additional info on the "DATA TOLERANCE ERROR" issue, but I can't add it in the comments!
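One possible reason QS refuses the timestamp field as Date is the year-day-month ordering in the sample value "1901-31-12 00:00:00". A hedged sketch of normalizing such values to ISO order in a post-processing step; the input format string is an assumption inferred from that single sample value:

```python
from datetime import datetime

def normalize_ts(value):
    """Re-order an assumed 'YYYY-DD-MM HH:MM:SS' timestamp into ISO
    'YYYY-MM-DD HH:MM:SS'; pass NULL/empty markers through unchanged."""
    if value in (None, "", "NULL"):
        return value
    # '%Y-%d-%m' reflects the year-day-month order seen in the sample.
    parsed = datetime.strptime(value, "%Y-%d-%m %H:%M:%S")
    return parsed.strftime("%Y-%m-%d %H:%M:%S")

print(normalize_ts("1901-31-12 00:00:00"))  # 1901-12-31 00:00:00
```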

Hello @Fotis_flex , @Sanjeeb2022 and @royyung !

@Fotis_flex were @Sanjeeb2022's or @royyung's suggestions helpful in finding a solution, or were you able to find a workaround for this problem? If so, could you mark one of their comments as the solution or post your own solution to help the community?

Ultimately this sounds like a feature request.