Need to export 10 million rows * 5 columns from dashboard

Hello QuickSight team,

This is a critical issue for me.

Here is my question:

  • I want to export data from a dashboard visual that has ~10 million rows × 5-10 columns
  • QuickSight is limiting the export to only 1 million rows

I need to export this data at any cost.
Help needed…

Thanks in advance,
Ajinkya

Hi,

How are you?

Here is the SPICE limit; it's not 1 million rows.

Regards,

Naveed Ali

Hi Naveed,

My SPICE is working fine.
I have imported approximately 25 million rows into my dataset using SPICE.

I am filtering the data and applying some conditions in the visual itself.
The resulting visual has 10 million rows × 5 columns.

I am exporting the result to CSV, but it is not working:
[screenshot of the export error]

Thanks,
Ajinkya


Hi,

This is the limitation 🙂

In this regard:

For table charts, Amazon QuickSight supports exporting up to 1 million rows or 500 MB of data, whichever limit is reached first.

Regards
Naveed Ali

Is there any other workaround possible?

If I am not able to download these rows then, unfortunately, I will have to move this dashboard to some other platform.

Hi,

No way, I think 🙂

Regards
Naveed Ali

Hi @ajinkya_ghodake
what data source are you working with?
BR

I am importing a table directly from Snowflake.

@ajinkya_ghodake
How are you planning to consume the data after you export it? Excel can’t handle files with 10 million rows. The limit in Excel is the same as the limit in QuickSight - 1 million.

Hi @ajinkya_ghodake - Thanks for posting the question. This is an interesting use case. If you are exporting the data as a CSV file (which is a tabular format), you do not need QuickSight to act as a data extraction tool. It is better to shift this workload to Snowflake as a data extract job. This way you can export the data quickly. Exporting millions of rows from a reporting or visualization tool is not recommended, and I agree with @David_Wong's point.
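A minimal sketch of what such an unload job could look like, driven through snowflake-connector-python. The stage, table, column, and filter names are placeholders, and it assumes an external stage already points at the target S3 bucket:

```python
# Sketch: unload a filtered extract from Snowflake to S3 server-side.
# "ml_export_stage", the table, and the columns are all placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    user="EXPORT_SVC",            # placeholder credentials
    password="...",
    account="myorg-myaccount",
    warehouse="EXPORT_WH",
    database="ANALYTICS",
    schema="PUBLIC",
)

unload_sql = """
COPY INTO @ml_export_stage/segments/emea/
FROM (
    SELECT col_a, col_b, col_c, col_d, col_e
    FROM events
    WHERE region = 'EMEA'          -- same filter the visual applies
)
FILE_FORMAT = (TYPE = CSV COMPRESSION = NONE)
HEADER = TRUE
MAX_FILE_SIZE = 536870912          -- split output into ~500 MB parts
"""

with conn.cursor() as cur:
    cur.execute(unload_sql)
conn.close()
```

Snowflake unloads in parallel, so a 10M-row extract lands as several CSV part files rather than one giant file.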

Regards - Sanjeeb

The viewers of the dashboard use filters to select the right segment of data and then export it as CSV.

Later this data is consumed in Python; these files will never be opened by anyone directly.

Basically, this dashboard is being used by a large set of ML engineers to get segments of data to train models for different use cases.

I have another approach where I am dumping the data into an S3 bucket and have given consumers access to that data.

But the problem is that people cannot filter the data, export it to CSV, and feed it directly to their code.

They have to download it, read it in Python, create segments, and then pass it to the main code.
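For reference, the extra step the consumers have to do today looks roughly like this (bucket, file, and column names are made up, and reading s3:// paths with pandas needs the s3fs package):

```python
# Sketch of the current workaround: pull the full dump from S3,
# filter it locally, then hand the segment to the training code.
import pandas as pd

# Requires s3fs and AWS credentials; path and columns are placeholders.
df = pd.read_csv("s3://ml-data-dumps/full_export.csv")

# Re-create the segment a dashboard filter would have produced.
segment = df[(df["region"] == "EMEA") & (df["label"].notna())]

segment.to_csv("emea_training_set.csv", index=False)
```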

Hi @ajinkya_ghodake
wouldn't it be a better solution to export the data directly out of Snowflake with a Python script?
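Something along these lines, streaming rows straight to CSV so 10 million rows never have to fit in memory at once (connection details, table, and filter are placeholders):

```python
# Sketch: pull a filtered segment out of Snowflake into a local CSV.
import csv
import snowflake.connector

conn = snowflake.connector.connect(
    user="ML_READER", password="...", account="myorg-myaccount",
    warehouse="ML_WH", database="ANALYTICS", schema="PUBLIC",
)

with conn.cursor() as cur, open("segment.csv", "w", newline="") as f:
    cur.execute(
        "SELECT col_a, col_b, col_c, col_d, col_e "
        "FROM events WHERE region = %s",
        ("EMEA",),
    )
    writer = csv.writer(f)
    writer.writerow([c[0] for c in cur.description])  # header row
    for row in cur:                                   # fetches in batches
        writer.writerow(row)
conn.close()
```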
BR

They don’t have sufficient privileges to access the data directly.

All the data should go through an S3 bucket or QuickSight.

Hi @ajinkya_ghodake - When you say users need to apply the filter and then export to CSV, is the filter fixed? If yes, then you could create a data pipeline using Glue that reads the data from S3, applies the filter for each user, and creates CSV files for them. However, this will require some amount of data engineering work. Another approach would be a purely custom solution: an interface where users provide the filter, and you run custom PySpark to generate the CSV files for them. This requires a bit of coding.
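A rough sketch of that kind of PySpark job (paths, columns, and the filter are illustrative; in Glue this would live inside the job script):

```python
# Sketch of a Glue-style PySpark job: read the raw dump from S3,
# apply a per-user filter, and write the segment back out as CSV.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("segment-export").getOrCreate()

# All paths and column names are placeholders.
raw = spark.read.csv("s3://ml-data-dumps/full_export/", header=True)

# In practice the filter would come from the user's request parameters.
segment = raw.filter(col("region") == "EMEA")

# coalesce(1) yields a single CSV part file, at the cost of one writer.
segment.coalesce(1).write.mode("overwrite").csv(
    "s3://ml-data-dumps/segments/emea/", header=True
)
```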

Regards - Sanjeeb

Created a stored procedure that takes the filters as parameters.

Tested it, and it seems functional.
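In case it helps anyone else, calling it from Python looks roughly like this (the procedure name and parameters are specific to my setup, so treat them as placeholders):

```python
# Sketch: invoke the parameterized export procedure; it unloads the
# filtered segment for the consumers. Names are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    user="ML_READER", password="...", account="myorg-myaccount",
    warehouse="ML_WH", database="ANALYTICS", schema="PUBLIC",
)

with conn.cursor() as cur:
    cur.execute("CALL export_segment(%s, %s)", ("EMEA", "2024-01-01"))
    print(cur.fetchone())  # the procedure's return message
conn.close()
```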
