Keeping track of dataset refreshes

Is there a way to keep track of dataset refreshes? Could I see at any given moment which datasets are currently refreshing, recently refreshed, etc.?

I believe it comes back in CloudTrail logs.

Specifically, it comes back with the event name QueryDatabase.

2 Likes

You can also use the API to get the ingestion status (e.g. running, completed, scheduled, queued, etc).

1 Like

This lists only one dataset though, is there a way for me to view the status of all my datasets at once?

1 Like

No, I think you’ll have to write code to get a list of all your dataset IDs and iterate through them all to get the ingestion status for each.

1 Like

Hi @ineedqshelp - Thanks for posting this question. David’s suggestion is a good one: you can write custom code using the boto3 QuickSight API with the steps below.

  1. List all datasets.
  2. Check the refresh status of each.
  3. Create a custom report containing the dataset name and other refresh details.

If you need any help writing the code, let us know and we’ll give it a try.

Regards - San

2 Likes

Thanks @Max. This is a lengthy process and a lot of integration needs to be implemented. It will be a good read for all of us. Thanks for sharing it.

Regards -San

1 Like

@David_Wong @Sanjeeb2022 Where would I write and implement the code? As a lambda function in the AWS console or are you referring to somewhere else?

2 Likes

Hi @ineedqshelp - There are 3 ways you can do it.

Option 1: Lambda function - You can write a Lambda function using boto3 to extract the details, create a report in S3, and send it via SES. However, Lambda has a timeout of 15 minutes; if your code needs more time than that, it will fail.

Option 2: You can develop your script in Python, use CloudShell to run it and generate a report in S3, and then use an event trigger to send the report via SES.

Option 3: You can develop your code on an EC2 instance (ensure the right roles are attached to the instance), configure Python on it, and execute the code there. This option is a bit expensive, but if you already have EC2 in your ecosystem, you can create a virtual Python environment there and achieve this.

If you ask me, I prefer Option 1: it is easy, serverless, and requires no management.

Regards - San

2 Likes

@Sanjeeb2022 @David_Wong @Max Do you have any resources I can use to get started? How would I write API calls and such in a lambda function? I haven’t done this before.

1 Like

Hi @ineedqshelp - For an intro to Lambda, please have a look at some materials first to get comfortable with how you can call boto3 APIs from Lambda (Python scripts). Some samples can be found at the links below.

  1. GitHub - grantcooksey/aws-lambda-python-examples: Lessons and examples to get started using aws lambda function in python
  2. AWS Lambda With Python: A Simple Introduction With Examples

If you are still finding it difficult, please let me know and I will create a blog post on how to call the QuickSight APIs from Lambda functions.

Regards - San

3 Likes

Hi @ineedqshelp - If this information was useful, could you mark the post as the solution for now? I know it will take some time for you to understand Lambda and develop Python scripts for Lambda with the boto3 API.

Regards - San

2 Likes

Thank you @ineedqshelp. If you run into any issues with AWS (particularly data-related services like S3, RDS, EMR, Glue, or Lambda), connect with me and I’ll be happy to help. For QuickSight, this community is really good.

Keep learning, keep growing…

Regards - San

2 Likes