Using Jupyter Notebooks to create datasets (from existing datasets)

SantiagoZ · July 10, 2023, 7:19pm

Hey everyone!

I would like to integrate some Python features into my datasets and analyses. Before I started working with QuickSight, I used to create datasets with SQL and then manipulate the data in Jupyter Notebooks for specific requests.

I would like to be able to do the same now, and I was wondering whether it’s possible to take an existing QuickSight dataset (including its parameters and calculated fields) and load it in a notebook. After manipulating the data with Python, I would like to publish the result as a new dataset (via S3, I guess?).

I’ve been doing some research, but I was not able to find a clear way to perform these operations.

I’ve also created another question regarding custom visuals with Python (matplotlib, seaborn, etc).

Thanks in advance!
Santiago

gillepa · July 11, 2023, 1:55pm

Hi @SantiagoZ

You need to install the AWS SDK for Python (Boto3):

https://docs.aws.amazon.com/pythonsdk/?icmpid=docs_homepage_sdktoolkits
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html

You can install it from within JupyterLab.

Once you have it installed, you can access all the QuickSight APIs, many of which deal with dataset manipulation:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/quicksight.html
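
As a rough sketch of what that can look like in a notebook (after running `%pip install boto3` in a cell), the snippet below lists the datasets in an account with `list_data_sets`. The helper names `make_quicksight_client` and `list_dataset_names`, the default region, and the account ID in the usage note are illustrative assumptions, not anything prescribed by Boto3.

```python
def make_quicksight_client(region_name="us-east-1"):
    """Create a QuickSight client (the region is an assumption; use your own)."""
    import boto3  # imported lazily so the helper below can be used without it

    return boto3.client("quicksight", region_name=region_name)


def list_dataset_names(client, account_id):
    """Return (DataSetId, Name) pairs for every dataset in the account.

    Follows NextToken so that results spread over several pages are collected.
    """
    results = []
    kwargs = {"AwsAccountId": account_id}
    while True:
        page = client.list_data_sets(**kwargs)
        for summary in page.get("DataSetSummaries", []):
            results.append((summary["DataSetId"], summary["Name"]))
        token = page.get("NextToken")
        if not token:
            return results
        kwargs["NextToken"] = token
```

With valid AWS credentials configured, something like `list_dataset_names(make_quicksight_client("eu-west-1"), "123456789012")` would return the dataset IDs and names (the account ID here is a placeholder).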

I hope it helps,
GL

SantiagoZ · July 11, 2023, 7:00pm

Hey @gillepa

Thank you for your response.
I’m familiar with boto3, and I’ve been running some tests with the QuickSight API recently. However, what I’m trying to achieve is to obtain the query results directly (the dataframe itself), rather than the dataset details returned by describe_data_set.
I understand that one possible solution would be to use describe_data_set to retrieve the SQL query, execute it with any Python library, and build from there. The drawback of this approach is that the dataset parameters and calculated fields would be lost.
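
To illustrate that workaround, here is a minimal sketch that pulls the custom SQL and the calculated-field expressions out of a `describe_data_set` response. The function name `extract_sql_and_calculated_fields` is hypothetical, and the sketch assumes the dataset is built on a custom SQL query; datasets built on other physical table types would need different handling.

```python
def extract_sql_and_calculated_fields(describe_response):
    """Return (sql_queries, calculated_fields) from a describe_data_set response.

    sql_queries maps each custom SQL table name to its query text.
    calculated_fields maps each calculated column name to its expression.
    """
    data_set = describe_response["DataSet"]

    # Custom SQL lives under PhysicalTableMap -> <table id> -> CustomSql.
    sql_queries = {}
    for table in data_set.get("PhysicalTableMap", {}).values():
        custom_sql = table.get("CustomSql")
        if custom_sql:
            sql_queries[custom_sql["Name"]] = custom_sql["SqlQuery"]

    # Calculated fields appear as CreateColumnsOperation transforms
    # under LogicalTableMap -> <table id> -> DataTransforms.
    calculated_fields = {}
    for table in data_set.get("LogicalTableMap", {}).values():
        for transform in table.get("DataTransforms", []):
            operation = transform.get("CreateColumnsOperation")
            if operation:
                for column in operation["Columns"]:
                    calculated_fields[column["ColumnName"]] = column["Expression"]

    return sql_queries, calculated_fields
```

The response would come from `client.describe_data_set(AwsAccountId=..., DataSetId=...)`. The expressions returned are written in QuickSight's own expression language, so they would still have to be re-implemented by hand (for example in pandas) after running the SQL, which is exactly the loss described above.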

Is there another approach or method that allows me to obtain the query results while retaining the dataset parameters and calculated fields?

Thanks again,
Santiago

gillepa · July 14, 2023, 5:54pm

Hi @SantiagoZ

I get it now, thank you for explaining. The APIs let you change a dataset’s metadata, not the data itself. Currently, you cannot download the data to your local machine, manipulate it outside of QuickSight, and then upload it back into QuickSight.

Hope it helps,
GL
