Selecting the latest folder from an S3 bucket in a dataset

I am creating a dashboard to publish the results of a job that runs daily. Each day’s results are written to a new folder in an S3 bucket.

The subfolder structure of the bucket is:

Main Bucket
├── run_date=2023-07-27
│   ├── result.json
│   └── _SUCCESS
├── run_date=2023-07-26
│   ├── result.json
│   └── _SUCCESS

  • I need only the latest run’s data to be present in the dataset.
  • I can schedule daily refresh on the dataset.
  • There is no data point inside result.json that can identify which day’s result it is.

I have 2 possible approaches:

  1. Some selection mechanism in manifest.json that reads only the required folder.
  2. Importing all the data in Main Bucket using a URI prefix and setting up a filter in QuickSight to remove old data.

Is either of these feasible?
How do I solve this?

Hi @sadashay,

Welcome to QuickSight Community!

One option that comes to mind is to create a Lambda function that runs each time a file is created in that bucket. The function would copy the new file to a fixed, well-known key, e.g. current_results, and that is the file you map to your dataset in QuickSight. The same function could also trigger the dataset refresh, so the dataset is updated only when you receive a new file.
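A minimal sketch of such a Lambda in Python, to make the idea concrete. The key `current_results/result.json`, the account ID and the dataset ID are placeholders I made up, and the function assumes the `run_date=...` folders sit at the bucket root as in the question. It uses boto3's `copy_object` and QuickSight's `create_ingestion`; treat it as a starting point rather than a finished implementation.

```python
import urllib.parse

# Hypothetical values: replace with your own.
CURRENT_KEY = "current_results/result.json"  # fixed key the dataset's manifest points at
ACCOUNT_ID = "123456789012"
DATASET_ID = "daily-results-dataset"


def is_result_file(key):
    """True for keys such as 'run_date=2023-07-27/result.json'."""
    folder, _, name = key.rpartition("/")
    return folder.startswith("run_date=") and name == "result.json"


def handle_event(event, s3, quicksight, ingestion_id):
    """Copy each new result.json to CURRENT_KEY, then start one dataset refresh."""
    copied = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys in S3 event notifications are URL-encoded ('=' arrives as '%3D').
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        if not is_result_file(key):
            continue  # ignores _SUCCESS and the copy written to CURRENT_KEY
        s3.copy_object(
            Bucket=bucket,
            Key=CURRENT_KEY,
            CopySource={"Bucket": bucket, "Key": key},
        )
        copied.append(key)
    if copied:
        quicksight.create_ingestion(
            AwsAccountId=ACCOUNT_ID,
            DataSetId=DATASET_ID,
            IngestionId=ingestion_id,  # must be unique per refresh
        )
    return copied


def lambda_handler(event, context):
    import boto3  # imported here so the helpers above can be tested without AWS

    return handle_event(
        event,
        boto3.client("s3"),
        boto3.client("quicksight"),
        ingestion_id=context.aws_request_id,
    )
```

Because the function writes back into the same bucket, the `is_result_file` check matters: without it the copy would trigger the function again. The Lambda role would also need permission to read and write the bucket and to call `quicksight:CreateIngestion`.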

You can get some ideas on how to set up this solution from this tutorial. It creates thumbnails of images, but you could modify it to copy the file you received instead.

Of course, there might be many other ways to do this using different services, but I think this is the simplest one for your case.
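As an illustration of one such alternative, closer to the first approach in the question, a scheduled script could rewrite manifest.json so that it points only at the newest completed folder. The sketch below is Python; the helper names and the bucket name are hypothetical. It assumes the dataset was created from a manifest stored in S3 and that a refresh re-reads that manifest, which is worth verifying before relying on it. Listing the keys (for example with boto3's `list_objects_v2`) and uploading the manifest are left out.

```python
import json


def latest_complete_run(keys):
    """Return the newest 'run_date=YYYY-MM-DD' folder that has a _SUCCESS marker."""
    complete = {
        key.split("/", 1)[0]
        for key in keys
        if key.startswith("run_date=") and key.endswith("/_SUCCESS")
    }
    # ISO dates sort chronologically as plain strings.
    return max(complete) if complete else None


def build_manifest(bucket, folder):
    """QuickSight S3 manifest restricted to a single run's result.json."""
    return {
        "fileLocations": [{"URIs": [f"s3://{bucket}/{folder}/result.json"]}],
        "globalUploadSettings": {"format": "JSON"},
    }


keys = [
    "run_date=2023-07-27/result.json",
    "run_date=2023-07-27/_SUCCESS",
    "run_date=2023-07-26/result.json",
    "run_date=2023-07-26/_SUCCESS",
]
folder = latest_complete_run(keys)  # "run_date=2023-07-27"
manifest_text = json.dumps(build_manifest("main-bucket", folder), indent=2)
```

Checking for `_SUCCESS` means a run that is still writing, or that failed partway, is never selected.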

Hope this helps!
