Hey, I don’t directly work with QuickSight, but I’ve been asked to help with some automations around it. I’ve been reading the forums and docs to prepare, but I thought I’d ask the community as well to help correct or confirm my current understanding.
Initial Request (Use existing analysis, themes, etc. on ephemeral daily snapshots)
The request was for analyses to run against an RDS instance restored daily from a snapshot. This RDS instance would be terminated after the analysis was run.
Based on what I’ve read so far, you can’t simply swap out data sources, so you need to use the API to recreate all of the analyses against the new data source. Is this correct?
Better solutions?
I was thinking it might be simpler to keep a database that is turned off for most of the day, and then, before the jobs run, do something like an AWS DMS migration into it. That way we can keep the QuickSight data source intact and still have the data updated daily.
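If we went that route, kicking off the refresh could be a single boto3 call before the jobs run. A minimal sketch, assuming a DMS replication task already exists (the ARN below is a placeholder):

```python
import boto3

# Assumes a DMS replication task already exists that copies the source
# SQL Server into the (mostly stopped) target RDS instance.
dms = boto3.client("dms")

# Placeholder ARN -- replace with the real task's ARN.
TASK_ARN = "arn:aws:dms:us-east-1:123456789012:task:EXAMPLETASK"

# 'reload-target' re-runs a full load into the target, so the data is
# fresh before the QuickSight jobs kick off.
dms.start_replication_task(
    ReplicationTaskArn=TASK_ARN,
    StartReplicationTaskType="reload-target",
)
```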
What do you all in the QuickSight community typically do in similar scenarios? Are there any obvious solutions I’m missing?
Hi @Austin
I never did it on a daily basis, but you should be able to replace a dataset within an analysis (best case the structure is the same; worst case it isn’t and you have to map the fields).
But how much data are we talking about? Any chance to extract files into S3 and use the files as a dataset?
May I ask why the RDS instance is terminated on a daily basis? Do you get new connection information every day?
Thanks @ErikG. The instance is terminated daily because they want to copy an existing SQL Server instance to run the daily analysis on, and they don’t want a direct connection from QuickSight for security reasons. I would assume that is also why they don’t want continuous replication with something like DMS.
As for the S3 suggestion, I am not sure about that option. Does it make sense in the context of them wanting to run analysis on a SQL Server database? I would be interested to hear more if there is some way to do this, but as it stands I am not aware of one.
What they really want is to update the database associated with an existing data source so it keeps the original datasets. I would assume the analysis objects built on those datasets would then come along for free?
Maybe we can use UpdateDataSource? That operation has a DataSourceParameters option that lets you specify both a database and an instance ID, and a Credentials object that lets you supply new credentials, alternate data source parameters, or copy the credentials from an existing data source.
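Something like this sketch is what I’m picturing with boto3 (untested on my side; all IDs and ARNs below are placeholders):

```python
import boto3

qs = boto3.client("quicksight")

# Re-point the existing data source at the freshly restored instance.
qs.update_data_source(
    AwsAccountId="123456789012",
    DataSourceId="daily-rds-source",
    Name="daily-rds-source",
    DataSourceParameters={
        "RdsParameters": {
            "InstanceId": "restored-snapshot-instance",
            "Database": "analytics",
        }
    },
    Credentials={
        # Reuse credentials from an existing data source instead of
        # passing a fresh username/password pair.
        "CopySourceArn": "arn:aws:quicksight:us-east-1:123456789012:datasource/original-source"
    },
)
```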
I think I will test that next week, but if it’s not possible, or if there are better solutions, please share.
Hi @Austin - Thanks for posting the question. One solution would be to dump the data to S3, create an Athena table on top of it, and connect the Athena table to QuickSight. You have to introduce an additional pipeline, but nothing needs to change on the QuickSight side.
The flow will look like:
RDS → S3 → (Athena table) → QuickSight
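A rough sketch of the Athena step with boto3, assuming the pipeline has already exported the table to S3 as Parquet (the bucket, database, table, and columns below are made up for illustration):

```python
import boto3

athena = boto3.client("athena")

# Placeholder schema and locations for illustration only.
DDL = """
CREATE EXTERNAL TABLE IF NOT EXISTS analytics.daily_orders (
    order_id   bigint,
    customer   string,
    amount     double,
    order_date date
)
STORED AS PARQUET
LOCATION 's3://my-export-bucket/daily_orders/'
"""

# Register the exported files as an Athena table; QuickSight then
# queries this table through its Athena connector.
athena.start_query_execution(
    QueryString=DDL,
    QueryExecutionContext={"Database": "analytics"},
    ResultConfiguration={"OutputLocation": "s3://my-export-bucket/athena-results/"},
)
```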
@Austin you can create a template from the analysis, and then use that template to launch the dashboard with different data sources.
1. Create a template from the analysis
2. Create JSON for the existing dataset (custom query, etc.)
Once you have those, use them as a “template” for launching a new dashboard.
Some hints are in the article below. (different problem but similar solution)
I have used this to launch the same dashboard for different clients on demand.
Make sure the security group for the new RDS instance allows access from QuickSight.
I used boto3 for that; a condensed sketch follows below:
- API call to create a data source
- API call to create all datasets from that data source
- API call to create a dashboard from the analysis template (make sure you add all the required datasets created above)
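Roughly like this (all IDs and ARNs are placeholders; the data source and dataset creation calls are omitted):

```python
import boto3

qs = boto3.client("quicksight")
ACCOUNT = "123456789012"  # placeholder account ID

# 1. Create a template from the existing analysis.
qs.create_template(
    AwsAccountId=ACCOUNT,
    TemplateId="daily-analysis-template",
    SourceEntity={
        "SourceAnalysis": {
            "Arn": f"arn:aws:quicksight:us-east-1:{ACCOUNT}:analysis/my-analysis",
            "DataSetReferences": [
                {
                    "DataSetPlaceholder": "main_data",
                    "DataSetArn": f"arn:aws:quicksight:us-east-1:{ACCOUNT}:dataset/old-dataset",
                }
            ],
        }
    },
)

# 2. Launch a dashboard from the template, binding the placeholder to a
#    dataset that points at the new data source.
qs.create_dashboard(
    AwsAccountId=ACCOUNT,
    DashboardId="daily-dashboard",
    Name="Daily dashboard",
    SourceEntity={
        "SourceTemplate": {
            "Arn": f"arn:aws:quicksight:us-east-1:{ACCOUNT}:template/daily-analysis-template",
            "DataSetReferences": [
                {
                    "DataSetPlaceholder": "main_data",
                    "DataSetArn": f"arn:aws:quicksight:us-east-1:{ACCOUNT}:dataset/new-dataset",
                }
            ],
        }
    },
)
```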
Hi @Austin - I agree with @ErikG; the dataset-update approach is the best option, as I realize bringing the data to S3 and an Athena table may have some operational overhead. You should try that option first.
Hey, I tried the UpdateDataSource operation but it failed. I am pretty sure it was because I ran it from CloudShell, and it complained that it could not access the database. The error message seemed to imply the call was attempting to connect to the RDS instance.
I have two guesses as to what happened:
Since CloudShell isn’t really attached to any VPC and doesn’t have a security group you can whitelist, maybe it could not communicate with the database. (I do find it unlikely that the UpdateDataSource operation makes a direct call to the database, though; I’m just basing this on the error message I saw.)
I feel like I’ve seen a post here about someone trying this same thing and running into problems: after updating the instance ID in the RDS parameters, the data source apparently retained a reference to the original instance’s IP address. It’s possible that only the credentials are editable, and that the instance ID is only editable if the existing database has a name change.
Currently I’m waiting to get access to a Cloud9 environment or a bastion host that can be whitelisted in the new database’s security group. I’ll try again from there and see if I get the same error.
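When I retry, I’ll also inspect what the data source actually points at before and after the call, something like this (placeholder IDs):

```python
import boto3

qs = boto3.client("quicksight")

# Check what the data source currently points at, and whether the last
# update succeeded, before and after calling update_data_source.
resp = qs.describe_data_source(
    AwsAccountId="123456789012",      # placeholder
    DataSourceId="daily-rds-source",  # placeholder
)
ds = resp["DataSource"]
print(ds["Status"])                       # e.g. UPDATE_SUCCESSFUL / UPDATE_FAILED
print(ds.get("DataSourceParameters"))     # current instance ID / database
print(ds.get("ErrorInfo"))                # details if the last update failed
```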