Import fails using SPICE because dataset is too large. How can I limit dataset size and then incrementally import rows every hour?

How can I limit the dataset size so that it can import…say the first 300k rows, and then import x amount of rows every hour or so?

@hasanm Could you please explain further about your case? You would like to ingest 300k rows. And then, you would like to incrementally ingest X rows per hour. How about the original 300k rows? Do you want to keep them, or delete, update them? In total, how many rows you would like to keep in the SPICE?

Im trying to load the entire dataset, which seems to be over 1,000,000,000 rows (hence the import failure). Not sure what’s the best way to do this. But my idea was to query in the first 300k lines from redshift into QuickSight using SPICE (This way the import wont fail), then append the next 300k lines to the dataset every hour. However I have no clue how to tell SPICE to do the query during the import

@hasanm When you create a new dataset in Amazon QuickSight, SPICE limits the number of rows you can add to a dataset. For Enterprise edition, 500 million (500,000,000) rows or 500 GB for each dataset. Please check the documents:

So, if the total rows are over 1,000,000,000 rows. No matter one time query ingestion or incremental hourly ingestion, SPICE will throw the same error message once you hit the 500 million rows.

Please contact the account manager to discuss with SPICE team. @SiYin FYI.

At the same time, for incremental ingestion, you can create a timestamp column in your redshift tables called “updated_time”, set the datetime to be 1 hour difference per 30k rows. And then:
To incrementally refresh a SQL-based SPICE dataset

  1. On the Datasets page, choose the dataset, and then choose Refresh Now.
  2. For Refresh type, choose Incremental refresh.
  3. If this is your first incremental refresh on the dataset, choose Configure.
  4. On the Configure incremental refresh page, do the following:
  5. For Date column, choose a date column that you want to base the look-back window on, which is the “updated_time” column.
  6. For Window size, enter a number for size, and then choose an amount of time that you want to look back for changes. You can set it to be refresh per hour.
  7. Choose Submit.

Please refer to: Refreshing SPICE data - Amazon QuickSight