Turn off UUID in sql statements so that Performance insights can aggregate

Quicksight adds a comment with a unique identifier to sql queries. e.g.
/* QuickSight 01456d7b-af4b-4e6f-8fed-30cd97968bdd */ which means that Performance insights fails to aggregate. Is there a way to turn theses comments off?

Hello @eweb, the unique identifiers are added as comments to the query text by design so it cannot be turned off. Can you please elaborate on what performance insights do you refer to?

AWS RDS performance insights. It groups similar queries and gives averages. But if all queries are distinct then it is harder to see where performance problems lie. e.g. if I am loading 6 datasets I’d like to see 6 aggregates. But what I see is each statement separately, as below

You can see aggregated metrics like IngestionInvocationCount, IngestionLatency and IngestionRowCount per data-set in your CloudWatch. For more information see monitoring data in Amazon QuickSight. Let me know if this what you are looking for.

No it isn’t. We are looking to utilize aws rds performance insights. The unique identifiers in the queries breaks that.

Hi @eweb -

No way to turn this off in QuickSight.

Do you have api / programmatic access? You could use boto3 and python to do this.

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/pi.html

Example:

import boto3
import re
from datetime import datetime, timedelta

# Function to strip UUID from QuickSight SQL comments
def strip_uuid_from_query(sql):
    return re.sub(r'/\* QuickSight [0-9a-fA-F-]+ \*/', '', sql)

# Initialize a session using Amazon RDS
client = boto3.client('pi')

# Define your parameters
db_identifier = 'your-db-identifier'
start_time = datetime.utcnow() - timedelta(hours=1)
end_time = datetime.utcnow()
period_in_seconds = 60

# Define the metric query
metric_query = {
    "Metric": "db.sql_tokenized.avg",
    "GroupBy": {
        "Group": "db.sql_tokenized.query",
        "Limit": 10
    }
}

# Get the performance insights data
response = client.get_resource_metrics(
    ServiceType='RDS',
    Identifier=db_identifier,
    StartTime=start_time,
    EndTime=end_time,
    PeriodInSeconds=period_in_seconds,
    MetricQueries=[metric_query]
)

# Process the response
aggregated_data = {}
for metric in response['MetricList']:
    for datapoint in metric['DataPoints']:
        query = strip_uuid_from_query(metric['Key']['Dimensions']['db.sql_tokenized.query'])
        timestamp = datapoint['Timestamp']
        value = datapoint['Value']

        if query not in aggregated_data:
            aggregated_data[query] = []
        aggregated_data[query].append((timestamp, value))

# Print the aggregated data
for query, data_points in aggregated_data.items():
    print(f"Query: {query}")
    for timestamp, value in data_points:
        print(f"  {timestamp}: {value}")

1 Like

Hi @eweb,
It’s been awhile since we last heard from you. Were you able to try Rob’s last suggestion or find a different work around for your solution? If not, did you have any additional questions?

If we do not hear back within the next 3 business days, I’ll go ahead and close out this topic.

Thank you!

1 Like

We use Performance insights to monitor and troubleshooting, the work around won’t work for us. But thanks for the suggestion. It would be nice if your approach was compatible with other AWS products. Feel free to close, but perhaps something for your team to consider in the future.

1 Like

Thanks for the response and the suggestion! I’ll mark this as a feature request for make visible to our support team!