Using Jupyter Notebooks to create datasets (from existing datasets)

Hey everyone!

I would like to integrate some Python features into my datasets and analyses. Before I started working with QuickSight, I used to create datasets with SQL and then manipulate the data in Jupyter Notebooks for specific requests.

I would like to be able to do the same now, and I was wondering if it’s possible to take a dataset that already exists in QuickSight (including its parameters and calculated fields) and use it in a Notebook. After manipulating the data with Python, I would like to publish the result as a new dataset (using S3, I guess?).

I’ve been doing some research and I was not able to find a clear way to perform these operations.

I’ve also created another question regarding custom visuals with Python (matplotlib, seaborn, etc).

Thanks in advance!
Santiago

Hi @Santiago

You need to install the AWS SDK for Python (Boto3):

https://docs.aws.amazon.com/pythonsdk/?icmpid=docs_homepage_sdktoolkits
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html

You can do this from within JupyterLab.

Once you have it installed, you can access all the QuickSight APIs. Many of them deal with dataset manipulation:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/quicksight.html
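As a rough illustration of what this looks like from a notebook, here is a minimal sketch that lists the datasets in an account through the Boto3 QuickSight client. The helper name `list_dataset_names`, the region, and the account ID in the comments are placeholders invented for this example; the sketch also assumes your AWS credentials are already configured.

```python
# In a notebook cell, Boto3 can usually be installed with:  %pip install boto3
#
# Typical usage (needs AWS credentials; region and account ID are placeholders):
#   import boto3
#   client = boto3.client("quicksight", region_name="us-east-1")
#   print(list_dataset_names(client, "123456789012"))


def list_dataset_names(client, account_id):
    """Return {DataSetId: Name} for every dataset the caller can list.

    `client` is expected to be a Boto3 QuickSight client. list_data_sets
    returns results in pages, so NextToken is followed until it runs out.
    """
    names = {}
    kwargs = {"AwsAccountId": account_id}
    while True:
        page = client.list_data_sets(**kwargs)
        for summary in page.get("DataSetSummaries", []):
            names[summary["DataSetId"]] = summary["Name"]
        token = page.get("NextToken")
        if not token:
            return names
        kwargs["NextToken"] = token
```

Passing the client in as an argument keeps the helper easy to try out with a stub before pointing it at a real account.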

I hope it helps,
GL

Hey @gillepa

Thank you for your response.
I’m familiar with boto3, and I’ve been running some tests with the QuickSight API recently. However, what I’m trying to achieve is obtaining the query results directly (the dataframe itself), rather than the dataset details provided by describe_data_set.
I understand that one possible solution could involve using describe_data_set to retrieve the query code and then using any Python library to execute this query and build from there. The drawback of this approach is that the dataset parameters and calculated fields would be lost.
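To make that workaround concrete, here is a minimal sketch of pulling the custom SQL out of a describe_data_set response. It assumes the dataset is built on custom SQL (a `CustomSql` entry in `PhysicalTableMap`); the function name `extract_sql_and_calculated_fields` is made up for this example. Dataset-level calculated fields do appear in the response, under `CreateColumnsOperation` in `LogicalTableMap`, but only as QuickSight expressions that would have to be re-implemented by hand in Python.

```python
def extract_sql_and_calculated_fields(describe_response):
    """Split a describe_data_set response into SQL text and calculated fields.

    Returns (queries, calculated):
      queries    -- {physical table id: SQL string} for CustomSql tables only
      calculated -- {column name: QuickSight expression string}
    """
    dataset = describe_response["DataSet"]

    # Physical tables can be CustomSql, RelationalTable or S3Source;
    # only CustomSql entries carry a query string.
    queries = {}
    for table_id, table in dataset.get("PhysicalTableMap", {}).items():
        custom_sql = table.get("CustomSql")
        if custom_sql:
            queries[table_id] = custom_sql["SqlQuery"]

    # Calculated fields defined on the dataset are stored as
    # CreateColumnsOperation transforms on the logical tables.
    calculated = {}
    for logical_table in dataset.get("LogicalTableMap", {}).values():
        for transform in logical_table.get("DataTransforms", []):
            create = transform.get("CreateColumnsOperation")
            if create:
                for column in create["Columns"]:
                    calculated[column["ColumnName"]] = column["Expression"]

    return queries, calculated


# Typical usage (needs AWS credentials; the IDs below are placeholders):
#   import boto3
#   client = boto3.client("quicksight")
#   resp = client.describe_data_set(AwsAccountId="123456789012", DataSetId="my-dataset-id")
#   queries, calculated = extract_sql_and_calculated_fields(resp)
```

The SQL strings could then be run against the source database with whatever Python driver fits, which is exactly where the calculated fields and parameters stop coming along for free.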

Is there another approach or method that allows me to obtain the query results while retaining the dataset parameters and calculated fields?

Thanks again,
Santiago

Hi @SantiagoZ

I get it now, thank you for explaining. The APIs let you change the dataset metadata, not the data itself. Currently, you cannot download the data to your local machine, manipulate it outside of QuickSight, and then upload it back into QuickSight.

Hope it helps,
GL