Conversion Guide: Amazon Quick Topics → Dataset Q&A

Reuse your Topics curation with Dataset Q&A.

Executive Summary

Amazon Quick offers two natural-language query experiences:

  • Topics are the original Q&A capability. Topics define a curated subject area — grouping one or more datasets with field semantics and business context — that scopes what users can ask about. Topics work well for predictable question patterns and continue to play a role in organizing how datasets are grouped together for Q&A.
  • Dataset Q&A is the newer capability, built on a modern text-to-SQL architecture that queries the full dataset directly. It supports runtime calculations, respects row-level and column-level security (RLS/CLS), scales to 2 billion rows, and delivers significantly higher query accuracy. Every answer includes an Explanation showing the data source(s) used, assumptions made, and the generated SQL query for full transparency.

Our recommendation: Ensure your business context lives at the dataset level via Dataset Enrichment. If you've already curated Topics, the conversion script below lets you extract that investment and apply it to your datasets. Continue to maintain your Topics as well — they remain useful for grouping datasets into scoped subject areas.

This guide shows how to convert your existing Topics metadata into Dataset Enrichment format, so your curation investment applies directly to Dataset Q&A.

Conversion Steps

Phase 1: Identify Topics to Convert

Not every topic needs to be converted immediately. Start with the ones where Dataset Q&A would deliver the biggest improvement.

Check your Topics performance: Open any topic and navigate to the Performance & Feedback tab to review usage patterns, unanswered questions, and feedback scores.

Prioritize topics where:

  • Users frequently get "I can't answer that" responses
  • Calculated fields are complex or frequently updated
  • Feedback scores and/or usage are low despite curation effort
  • Users need to ask questions that span beyond the pre-configured field scope

Keep as-is for now: Topics with high satisfaction scores and predictable question patterns — these are working well and don't need immediate attention.

Phase 2: Convert Topics to Dataset Enrichment

Use the following script to automatically extract all curated metadata from your Topics via the DescribeTopic and ListTopics APIs and output it as a file you can upload directly to Dataset Enrichment.

Prerequisites:

  • Python 3.8+
  • AWS credentials configured with quicksight:ListTopics and quicksight:DescribeTopic permissions
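  • boto3 installed (pip install boto3)
  • PyYAML installed for the default YAML output (pip install pyyaml); without it, the script falls back to JSON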
  • Your AWS account ID
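The two permissions above can be granted with a small IAM policy. The sketch below builds one as a Python dictionary and prints it as JSON. The `Resource: "*"` scope and the `build_topic_reader_policy` name are assumptions for illustration; narrow the resource to specific topic ARNs if your security standards require it.

```python
import json


def build_topic_reader_policy():
    """Return a read-only IAM policy document for the conversion script."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The two actions listed in the prerequisites.
                "Action": ["quicksight:ListTopics", "quicksight:DescribeTopic"],
                # Assumption: all topics in the account; scope down if needed.
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_topic_reader_policy(), indent=2))
```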

Usage:

# Export all topics to YAML (default)
python convert_topics_to_enrichment.py --account-id 123456789012

# Export a specific topic to JSON
python convert_topics_to_enrichment.py --account-id 123456789012 --topic-id my-topic-id --format json

# Export to plain text (human-readable)
python convert_topics_to_enrichment.py --account-id 123456789012 --format txt

# Specify AWS region
python convert_topics_to_enrichment.py --account-id 123456789012 --region us-west-2

The script creates one enrichment file per topic in your chosen format, written to the current directory unless you pass --output-dir:

  • --format yaml (default): <topic-name>_enrichment.yaml
  • --format json: <topic-name>_enrichment.json
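  • --format txt: <topic-name>_enrichment.txt (a summary of topic, dataset, and column metadata; calculated fields, named entities, filters, and formatting appear only in the YAML and JSON outputs)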
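File names are derived from the topic name, not the topic ID. The snippet below repeats the `sanitize_filename` helper from the full script to show what to expect; `enrichment_filename` is a hypothetical wrapper added here for illustration.

```python
def sanitize_filename(name):
    """Same logic as the helper in the full script."""
    safe = "".join(c if c.isalnum() or c in ("-", "_", " ") else "_" for c in name)
    return safe.strip().replace(" ", "_").lower()


def enrichment_filename(topic_name, fmt="yaml"):
    """Hypothetical wrapper: build the output file name for a topic."""
    return f"{sanitize_filename(topic_name)}_enrichment.{fmt}"


print(enrichment_filename("Sales Analytics"))           # sales_analytics_enrichment.yaml
print(enrichment_filename("Q4: Revenue/EMEA", "json"))  # q4__revenue_emea_enrichment.json
```

Two topics whose names sanitize to the same string would write to the same file, so it is worth checking for duplicates before a bulk export.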

Example output (YAML, abbreviated; the script also writes topic_id and each dataset_arn)

topic_name: "Sales Analytics"
description: "Quarterly sales performance tracking"
custom_instructions: "Always use fiscal quarter boundaries. Revenue should be reported in USD."
datasets:
  - dataset_name: "sales_transactions"
    dataset_description: "All completed sales transactions"
    fields:
      - column_name: "revenue"
        friendly_name: "Total Revenue"
        description: "Net revenue after discounts and returns"
        synonyms:
          - "sales"
          - "income"
          - "earnings"
        data_role: "MEASURE"
        default_aggregation: "SUM"
        formatting:
          display_format: "CURRENCY"
          currency_symbol: "$"
          fraction_digits: 2
          use_grouping: true
      - column_name: "region"
        friendly_name: "Sales Region"
        description: "Geographic sales territory"
        synonyms:
          - "territory"
          - "area"
          - "geo"
        data_role: "DIMENSION"
        default_aggregation: "COUNT"
        cell_value_synonyms:
          - cell_value: "NA"
            synonyms: ["North America", "US and Canada"]
          - cell_value: "EMEA"
            synonyms: ["Europe", "EU"]
    calculated_fields:
      - field_name: "profit_margin"
        description: "Net profit as percentage of revenue"
        expression: "(revenue - cost) / revenue * 100"
        synonyms:
          - "margin"
          - "profitability"
        default_aggregation: "AVG"
    named_entities:
      - entity_name: "Top Performer"
        description: "Sales rep with highest quarterly revenue"
        synonyms:
          - "best seller"
          - "top rep"
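Before uploading, it can help to check a generated file for fields that carry no curation. The sketch below assumes the JSON output format and the key names shown in the example above; `find_uncurated_fields` is a hypothetical helper, not part of the conversion script, and the `order_id` column is invented for the demonstration.

```python
import json


def find_uncurated_fields(enrichment):
    """Return (dataset, column) pairs that have neither a description nor synonyms."""
    gaps = []
    for ds in enrichment.get("datasets", []):
        ds_name = ds.get("dataset_name") or ds.get("dataset_arn", "")
        for field in ds.get("fields", []):
            if not field.get("description") and not field.get("synonyms"):
                gaps.append((ds_name, field.get("column_name", "")))
    return gaps


# Inline stand-in for the contents of a generated *_enrichment.json file.
sample = json.loads("""
{
  "topic_name": "Sales Analytics",
  "datasets": [{
    "dataset_name": "sales_transactions",
    "fields": [
      {"column_name": "revenue", "description": "Net revenue after discounts and returns"},
      {"column_name": "region", "synonyms": ["territory", "area", "geo"]},
      {"column_name": "order_id"}
    ]
  }]
}
""")

print(find_uncurated_fields(sample))  # [('sales_transactions', 'order_id')]
```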

What gets extracted

The script converts the following Topic metadata into Dataset Enrichment format:

Topic configuration | Dataset Enrichment equivalent
Field synonyms | synonyms list in enrichment file
Friendly field names | description field
Semantic types (currency, date, etc.) | format or notes field
Calculated fields | custom_instructions with formulas or directly create a calculated field
Topic description | dataset_description
Default aggregations | Aggregation hints
Display formatting | Formatting rules
Cell value synonyms | Value mappings
Named entities | Semantic field groupings
Comparative ordering (greater-is-better, lesser-is-better) | Custom instructions (e.g., "For revenue, higher is better. For defect rate, lower is better.")
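The last row of the mapping is the one that requires rewording rather than copying. As a rough sketch, the helper below turns the `comparative_order` block emitted by the script into sentences for custom instructions. It assumes `use_ordering` carries the values `GREATER_IS_BETTER` or `LESSER_IS_BETTER`; the function name and the sentence wording are illustrative.

```python
def ordering_instructions(fields):
    """Build one instruction sentence per field that has a comparative ordering."""
    # Assumed enum values for use_ordering; other values are ignored.
    phrases = {
        "GREATER_IS_BETTER": "higher is better",
        "LESSER_IS_BETTER": "lower is better",
    }
    sentences = []
    for field in fields:
        ordering = field.get("comparative_order", {}).get("use_ordering")
        if ordering in phrases:
            # Prefer the friendly name so the sentence matches how users phrase questions.
            label = field.get("friendly_name") or field.get("column_name", "")
            sentences.append(f"For {label}, {phrases[ordering]}.")
    return " ".join(sentences)


fields = [
    {"column_name": "revenue", "comparative_order": {"use_ordering": "GREATER_IS_BETTER"}},
    {"column_name": "defect_rate", "friendly_name": "defect rate",
     "comparative_order": {"use_ordering": "LESSER_IS_BETTER"}},
    {"column_name": "region"},
]
print(ordering_instructions(fields))
# For revenue, higher is better. For defect rate, lower is better.
```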

Using the output

Upload the generated enrichment file directly to your dataset. Navigate to your dataset in Amazon Quick, open Dataset details, and use the Upload file option under Custom instructions. Supported formats: CSV, TXT, JSON, YAML, YML.

The uploaded file automatically applies your curated context — synonyms, descriptions, formatting rules, and business logic — to every Dataset Q&A query against that dataset. No additional configuration needed.

You can also use the output as reference to manually configure individual fields:

  1. Field descriptions — Copy into your dataset field descriptions via the Quick console or the UpdateDataSet API
  2. Custom instructions — Paste directly into the Custom instructions text box for quick edits
  3. Calculated fields — Recreate as dataset calculated fields if you want them available as columns
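For option 1, column descriptions in the dataset API are carried by `TagColumnOperation` transforms inside the dataset's logical table map. The sketch below only builds those transform entries from the script's output; it does not call `UpdateDataSet`, which requires resubmitting the full dataset definition. `build_description_tags` is a hypothetical helper, and merging the result into your existing `LogicalTableMap` is left to you.

```python
def build_description_tags(fields):
    """Return TagColumnOperation transforms for fields that have a description."""
    transforms = []
    for field in fields:
        description = field.get("description")
        if not description:
            continue  # nothing to tag for this column
        transforms.append({
            "TagColumnOperation": {
                "ColumnName": field["column_name"],
                "Tags": [{"ColumnDescription": {"Text": description}}],
            }
        })
    return transforms


fields = [
    {"column_name": "revenue", "description": "Net revenue after discounts and returns"},
    {"column_name": "region"},  # no description, so it is skipped
]
for transform in build_description_tags(fields):
    print(transform)
```

Each entry would typically be appended to the `DataTransforms` list of the relevant logical table before calling `update_data_set`; test against a copy of the dataset first.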

Full script: convert_topics_to_enrichment.py

#!/usr/bin/env python3
"""
Convert Amazon Quick Topics metadata to Dataset Enrichment files.

Extracts all curated metadata from Topics (synonyms, field descriptions,
friendly names, default aggregations, formatting, custom instructions, etc.)
and outputs structured files (YAML, JSON, or TXT) for use with data prep.

Usage:
    python convert_topics_to_enrichment.py --account-id 123456789012
    python convert_topics_to_enrichment.py --account-id 123456789012 --topic-id my-topic --format json
    python convert_topics_to_enrichment.py --account-id 123456789012 --format txt --region us-west-2
"""

import argparse
import json
import sys
from pathlib import Path

try:
    import boto3
except ImportError:
    print("Error: boto3 is required. Install with: pip install boto3")
    sys.exit(1)

try:
    import yaml
    HAS_YAML = True
except ImportError:
    HAS_YAML = False


def get_quicksight_client(region=None):
    """Create a QuickSight boto3 client."""
    kwargs = {}
    if region:
        kwargs["region_name"] = region
    return boto3.client("quicksight", **kwargs)


def list_topics(client, account_id):
    """List all topics in the account."""
    topics = []
    paginator_token = None

    while True:
        kwargs = {"AwsAccountId": account_id, "MaxResults": 100}
        if paginator_token:
            kwargs["NextToken"] = paginator_token

        response = client.list_topics(**kwargs)
        topics.extend(response.get("TopicsSummaries", []))

        paginator_token = response.get("NextToken")
        if not paginator_token:
            break

    return topics


def describe_topic(client, account_id, topic_id):
    """Describe a single topic and return full metadata."""
    response = client.describe_topic(AwsAccountId=account_id, TopicId=topic_id)
    return response


def extract_formatting(default_formatting):
    """Extract display formatting details from a column or calculated field."""
    if not default_formatting:
        return None

    formatting = {}
    display_format = default_formatting.get("DisplayFormat")
    if display_format:
        formatting["display_format"] = display_format

    options = default_formatting.get("DisplayFormatOptions", {})
    if options:
        if options.get("CurrencySymbol"):
            formatting["currency_symbol"] = options["CurrencySymbol"]
        if options.get("DateFormat"):
            formatting["date_format"] = options["DateFormat"]
        if options.get("DecimalSeparator"):
            formatting["decimal_separator"] = options["DecimalSeparator"]
        if options.get("FractionDigits") is not None:
            formatting["fraction_digits"] = options["FractionDigits"]
        if options.get("GroupingSeparator"):
            formatting["grouping_separator"] = options["GroupingSeparator"]
        if options.get("Prefix"):
            formatting["prefix"] = options["Prefix"]
        if options.get("Suffix"):
            formatting["suffix"] = options["Suffix"]
        if options.get("UnitScaler"):
            formatting["unit_scaler"] = options["UnitScaler"]
        if options.get("UseGrouping") is not None:
            formatting["use_grouping"] = options["UseGrouping"]
        if options.get("UseBlankCellFormat") is not None:
            formatting["use_blank_cell_format"] = options["UseBlankCellFormat"]
        if options.get("BlankCellFormat"):
            formatting["blank_cell_format"] = options["BlankCellFormat"]

        negative = options.get("NegativeFormat", {})
        if negative:
            formatting["negative_format"] = {
                k.lower(): v for k, v in negative.items() if v
            }

    return formatting if formatting else None


def extract_comparative_order(comp_order):
    """Extract comparative ordering details."""
    if not comp_order:
        return None

    result = {}
    if comp_order.get("UseOrdering"):
        result["use_ordering"] = comp_order["UseOrdering"]
    if comp_order.get("SpecifedOrder"):
        result["specified_order"] = comp_order["SpecifedOrder"]
    if comp_order.get("TreatUndefinedSpecifiedValues"):
        result["treat_undefined"] = comp_order["TreatUndefinedSpecifiedValues"]

    return result if result else None


def extract_semantic_type(semantic_type):
    """Extract semantic type information."""
    if not semantic_type:
        return None

    result = {}
    if semantic_type.get("TypeName"):
        result["type_name"] = semantic_type["TypeName"]
    if semantic_type.get("SubTypeName"):
        result["sub_type_name"] = semantic_type["SubTypeName"]
    if semantic_type.get("TruthyCellValue"):
        result["truthy_cell_value"] = semantic_type["TruthyCellValue"]
    if semantic_type.get("TruthyCellValueSynonyms"):
        result["truthy_cell_value_synonyms"] = semantic_type["TruthyCellValueSynonyms"]
    if semantic_type.get("FalseyCellValue"):
        result["falsey_cell_value"] = semantic_type["FalseyCellValue"]
    if semantic_type.get("FalseyCellValueSynonyms"):
        result["falsey_cell_value_synonyms"] = semantic_type["FalseyCellValueSynonyms"]
    if semantic_type.get("TypeParameters"):
        result["type_parameters"] = semantic_type["TypeParameters"]

    return result if result else None


def extract_column(column):
    """Extract metadata from a single topic column."""
    col_data = {
        "column_name": column.get("ColumnName", ""),
    }

    if column.get("ColumnFriendlyName"):
        col_data["friendly_name"] = column["ColumnFriendlyName"]
    if column.get("ColumnDescription"):
        col_data["description"] = column["ColumnDescription"]
    if column.get("ColumnSynonyms"):
        col_data["synonyms"] = column["ColumnSynonyms"]
    if column.get("ColumnDataRole"):
        col_data["data_role"] = column["ColumnDataRole"]
    if column.get("Aggregation"):
        col_data["default_aggregation"] = column["Aggregation"]
    if column.get("AllowedAggregations"):
        col_data["allowed_aggregations"] = column["AllowedAggregations"]
    if column.get("NotAllowedAggregations"):
        col_data["not_allowed_aggregations"] = column["NotAllowedAggregations"]
    if column.get("NeverAggregateInFilter") is not None:
        col_data["never_aggregate_in_filter"] = column["NeverAggregateInFilter"]
    if column.get("NonAdditive") is not None:
        col_data["non_additive"] = column["NonAdditive"]
    if column.get("TimeGranularity"):
        col_data["time_granularity"] = column["TimeGranularity"]
    if column.get("IsIncludedInTopic") is not None:
        col_data["is_included_in_topic"] = column["IsIncludedInTopic"]

    cell_synonyms = column.get("CellValueSynonyms", [])
    if cell_synonyms:
        col_data["cell_value_synonyms"] = [
            {
                "cell_value": cs.get("CellValue", ""),
                "synonyms": cs.get("Synonyms", []),
            }
            for cs in cell_synonyms
        ]

    formatting = extract_formatting(column.get("DefaultFormatting"))
    if formatting:
        col_data["formatting"] = formatting

    comp_order = extract_comparative_order(column.get("ComparativeOrder"))
    if comp_order:
        col_data["comparative_order"] = comp_order

    semantic = extract_semantic_type(column.get("SemanticType"))
    if semantic:
        col_data["semantic_type"] = semantic

    return col_data


def extract_calculated_field(field):
    """Extract metadata from a calculated field."""
    field_data = {
        "field_name": field.get("CalculatedFieldName", ""),
    }

    if field.get("CalculatedFieldDescription"):
        field_data["description"] = field["CalculatedFieldDescription"]
    if field.get("Expression"):
        field_data["expression"] = field["Expression"]
    if field.get("CalculatedFieldSynonyms"):
        field_data["synonyms"] = field["CalculatedFieldSynonyms"]
    if field.get("ColumnDataRole"):
        field_data["data_role"] = field["ColumnDataRole"]
    if field.get("Aggregation"):
        field_data["default_aggregation"] = field["Aggregation"]
    if field.get("AllowedAggregations"):
        field_data["allowed_aggregations"] = field["AllowedAggregations"]
    if field.get("NotAllowedAggregations"):
        field_data["not_allowed_aggregations"] = field["NotAllowedAggregations"]
    if field.get("NeverAggregateInFilter") is not None:
        field_data["never_aggregate_in_filter"] = field["NeverAggregateInFilter"]
    if field.get("NonAdditive") is not None:
        field_data["non_additive"] = field["NonAdditive"]
    if field.get("TimeGranularity"):
        field_data["time_granularity"] = field["TimeGranularity"]
    if field.get("IsIncludedInTopic") is not None:
        field_data["is_included_in_topic"] = field["IsIncludedInTopic"]

    cell_synonyms = field.get("CellValueSynonyms", [])
    if cell_synonyms:
        field_data["cell_value_synonyms"] = [
            {
                "cell_value": cs.get("CellValue", ""),
                "synonyms": cs.get("Synonyms", []),
            }
            for cs in cell_synonyms
        ]

    formatting = extract_formatting(field.get("DefaultFormatting"))
    if formatting:
        field_data["formatting"] = formatting

    comp_order = extract_comparative_order(field.get("ComparativeOrder"))
    if comp_order:
        field_data["comparative_order"] = comp_order

    semantic = extract_semantic_type(field.get("SemanticType"))
    if semantic:
        field_data["semantic_type"] = semantic

    return field_data


def extract_named_entity(entity):
    """Extract metadata from a named entity."""
    entity_data = {
        "entity_name": entity.get("EntityName", ""),
    }

    if entity.get("EntityDescription"):
        entity_data["description"] = entity["EntityDescription"]
    if entity.get("EntitySynonyms"):
        entity_data["synonyms"] = entity["EntitySynonyms"]

    sem_type = entity.get("SemanticEntityType", {})
    if sem_type:
        if sem_type.get("TypeName"):
            entity_data["type_name"] = sem_type["TypeName"]
        if sem_type.get("SubTypeName"):
            entity_data["sub_type_name"] = sem_type["SubTypeName"]
        if sem_type.get("TypeParameters"):
            entity_data["type_parameters"] = sem_type["TypeParameters"]

    definitions = entity.get("Definition", [])
    if definitions:
        entity_data["definitions"] = []
        for defn in definitions:
            d = {}
            if defn.get("FieldName"):
                d["field_name"] = defn["FieldName"]
            if defn.get("PropertyName"):
                d["property_name"] = defn["PropertyName"]
            if defn.get("PropertyRole"):
                d["property_role"] = defn["PropertyRole"]
            if defn.get("PropertyUsage"):
                d["property_usage"] = defn["PropertyUsage"]
            metric = defn.get("Metric", {})
            if metric:
                d["metric"] = {
                    "aggregation": metric.get("Aggregation", ""),
                }
                if metric.get("AggregationFunctionParameters"):
                    d["metric"]["parameters"] = metric["AggregationFunctionParameters"]
            if d:
                entity_data["definitions"].append(d)

    return entity_data


def extract_filter(topic_filter):
    """Extract metadata from a topic filter."""
    filter_data = {
        "filter_name": topic_filter.get("FilterName", ""),
    }

    if topic_filter.get("FilterDescription"):
        filter_data["description"] = topic_filter["FilterDescription"]
    if topic_filter.get("FilterSynonyms"):
        filter_data["synonyms"] = topic_filter["FilterSynonyms"]
    if topic_filter.get("FilterType"):
        filter_data["filter_type"] = topic_filter["FilterType"]
    if topic_filter.get("FilterClass"):
        filter_data["filter_class"] = topic_filter["FilterClass"]
    if topic_filter.get("OperandFieldName"):
        filter_data["operand_field_name"] = topic_filter["OperandFieldName"]

    return filter_data


def convert_topic(topic_response):
    """Convert a full DescribeTopic response into enrichment format."""
    topic = topic_response.get("Topic", {})
    topic_id = topic_response.get("TopicId", "")

    enrichment = {
        "topic_id": topic_id,
        "topic_name": topic.get("Name", ""),
    }

    if topic.get("Description"):
        enrichment["description"] = topic["Description"]

    custom_instructions = topic_response.get("CustomInstructions", {})
    if custom_instructions and custom_instructions.get("CustomInstructionsString"):
        enrichment["custom_instructions"] = custom_instructions["CustomInstructionsString"]

    datasets = topic.get("DataSets", [])
    if datasets:
        enrichment["datasets"] = []
        for ds in datasets:
            ds_data = {
                "dataset_arn": ds.get("DatasetArn", ""),
            }
            if ds.get("DatasetName"):
                ds_data["dataset_name"] = ds["DatasetName"]
            if ds.get("DatasetDescription"):
                ds_data["dataset_description"] = ds["DatasetDescription"]

            data_agg = ds.get("DataAggregation", {})
            if data_agg:
                ds_data["data_aggregation"] = {
                    "date_granularity": data_agg.get("DatasetRowDateGranularity", ""),
                    "default_date_column": data_agg.get("DefaultDateColumnName", ""),
                }

            columns = ds.get("Columns", [])
            if columns:
                ds_data["fields"] = [extract_column(col) for col in columns]

            calc_fields = ds.get("CalculatedFields", [])
            if calc_fields:
                ds_data["calculated_fields"] = [
                    extract_calculated_field(f) for f in calc_fields
                ]

            named_entities = ds.get("NamedEntities", [])
            if named_entities:
                ds_data["named_entities"] = [
                    extract_named_entity(e) for e in named_entities
                ]

            filters = ds.get("Filters", [])
            if filters:
                ds_data["filters"] = [extract_filter(f) for f in filters]

            enrichment["datasets"].append(ds_data)

    return enrichment


def write_yaml(enrichment, output_path):
    """Write enrichment data as YAML."""
    if not HAS_YAML:
        print("Warning: PyYAML is not installed, so YAML output is unavailable. Install with: pip install pyyaml")
        print("Falling back to JSON format.")
        output_path = output_path.with_suffix(".json")
        write_json(enrichment, output_path)
        return

    # Explicit UTF-8 so non-ASCII synonyms write correctly on every platform.
    with open(output_path, "w", encoding="utf-8") as f:
        yaml.dump(enrichment, f, default_flow_style=False, allow_unicode=True, sort_keys=False, width=120)
    print(f"  Written: {output_path}")


def write_json(enrichment, output_path):
    """Write enrichment data as JSON."""
    # Explicit UTF-8 because ensure_ascii=False emits non-ASCII characters as-is.
    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(enrichment, f, indent=2, ensure_ascii=False)
    print(f"  Written: {output_path}")


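def write_txt(enrichment, output_path):
    """Write a human-readable summary of topic, dataset, and column metadata.

    Calculated fields, named entities, filters, and formatting are omitted;
    use the YAML or JSON format for the full export.
    """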
    lines = []
    lines.append(f"Topic: {enrichment.get('topic_name', 'Unknown')}")
    lines.append(f"Topic ID: {enrichment.get('topic_id', '')}")
    if enrichment.get("description"):
        lines.append(f"Description: {enrichment['description']}")
    if enrichment.get("custom_instructions"):
        lines.append(f"\nCustom Instructions:\n  {enrichment['custom_instructions']}")

    for ds in enrichment.get("datasets", []):
        lines.append(f"\n{'=' * 80}")
        lines.append(f"Dataset: {ds.get('dataset_name', ds.get('dataset_arn', ''))}")
        if ds.get("dataset_description"):
            lines.append(f"  Description: {ds['dataset_description']}")
        lines.append("\n  Fields:\n  " + "-" * 60)
        for field in ds.get("fields", []):
            lines.append(f"    {field['column_name']}")
            if field.get("friendly_name"):
                lines.append(f"      Friendly Name: {field['friendly_name']}")
            if field.get("description"):
                lines.append(f"      Description: {field['description']}")
            if field.get("synonyms"):
                lines.append(f"      Synonyms: {', '.join(field['synonyms'])}")
            if field.get("data_role"):
                lines.append(f"      Role: {field['data_role']}")
            if field.get("default_aggregation"):
                lines.append(f"      Default Aggregation: {field['default_aggregation']}")

    with open(output_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines))
    print(f"  Written: {output_path}")


def sanitize_filename(name):
    """Create a safe filename from a topic name."""
    safe = "".join(c if c.isalnum() or c in ("-", "_", " ") else "_" for c in name)
    return safe.strip().replace(" ", "_").lower()


def main():
    parser = argparse.ArgumentParser(
        description="Convert Amazon Quick Topics metadata to Dataset Enrichment files."
    )
    parser.add_argument("--account-id", required=True, help="AWS account ID (12 digits)")
    parser.add_argument("--topic-id", help="Specific topic ID to convert. If omitted, converts all topics.")
    parser.add_argument("--format", choices=["yaml", "json", "txt"], default="yaml", help="Output format (default: yaml)")
    parser.add_argument("--region", help="AWS region (uses default if not specified)")
    parser.add_argument("--output-dir", default=".", help="Output directory (default: current directory)")

    args = parser.parse_args()
    client = get_quicksight_client(args.region)
    output_dir = Path(args.output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    if args.topic_id:
        topic_ids = [args.topic_id]
        print(f"Converting topic: {args.topic_id}")
    else:
        print(f"Listing all topics in account {args.account_id}...")
        topics = list_topics(client, args.account_id)
        if not topics:
            print("No topics found in this account.")
            sys.exit(0)
        topic_ids = [t["TopicId"] for t in topics]
        print(f"Found {len(topic_ids)} topic(s)")

    for topic_id in topic_ids:
        print(f"\nProcessing topic: {topic_id}")
        try:
            response = describe_topic(client, args.account_id, topic_id)
        except Exception as e:
            print(f"  Error describing topic {topic_id}: {e}")
            continue

        enrichment = convert_topic(response)
        # "topic_name" is always present but may be empty, so fall back with `or`.
        topic_name = enrichment.get("topic_name") or topic_id
        safe_name = sanitize_filename(topic_name)

        if args.format == "yaml":
            write_yaml(enrichment, output_dir / f"{safe_name}_enrichment.yaml")
        elif args.format == "json":
            write_json(enrichment, output_dir / f"{safe_name}_enrichment.json")
        elif args.format == "txt":
            write_txt(enrichment, output_dir / f"{safe_name}_enrichment.txt")

    print("\nDone.")


if __name__ == "__main__":
    main()

Phase 3: Enable Dataset Q&A

  1. Navigate to your dataset in Amazon Quick
  2. Upload the enrichment file generated in Phase 2 to your dataset (see Using the output above)
  3. The system applies this context automatically to every query
  4. Verify RLS/CLS — Dataset Q&A respects existing security (no additional setup)
  5. Ensure the dataset is shared with the relevant users or user groups. Users can only query datasets they have been granted access to — the same sharing model that applies to dashboards and analyses.
  6. Users can start asking questions immediately. The chat orchestrator automatically selects the most relevant dataset(s) to answer each question. Authors can also build custom chat agents with a curated set of datasets and documents for a specific use case. Optionally, users can manually select a dataset via: Open Chat → Knowledge Picker → Add → Datasets → Select dataset.

Note: Dataset Q&A works with both SPICE and direct query datasets (Amazon Redshift, Amazon Athena, Aurora PostgreSQL, S3 Tables). Direct query datasets have no row limits.
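The sharing check in step 5 can be spot-checked from the API. A minimal sketch, assuming the response shape of the QuickSight `DescribeDataSetPermissions` operation (a `Permissions` list of `Principal` and `Actions` entries); the helper name, dataset ID, and sample ARNs are made up for illustration.

```python
def principals_without_access(permissions_response, required_principals):
    """Return the required principal ARNs that have no grant on the dataset."""
    granted = {
        entry["Principal"]
        for entry in permissions_response.get("Permissions", [])
        if entry.get("Actions")
    }
    return [arn for arn in required_principals if arn not in granted]


# In practice the response would come from:
#   boto3.client("quicksight").describe_data_set_permissions(
#       AwsAccountId="123456789012", DataSetId="my-dataset-id")
sample_response = {
    "Permissions": [
        {
            "Principal": "arn:aws:quicksight:us-west-2:123456789012:group/default/sales",
            "Actions": ["quicksight:DescribeDataSet", "quicksight:PassDataSet"],
        }
    ]
}
required = [
    "arn:aws:quicksight:us-west-2:123456789012:group/default/sales",
    "arn:aws:quicksight:us-west-2:123456789012:group/default/finance",
]
print(principals_without_access(sample_response, required))
```

This inspects only the grants listed on the dataset itself, so it may not reflect access that users receive indirectly, for example through shared folders.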

Phase 4: Validate and Transition Users

Run your most common topic queries through Dataset Q&A and compare:

Validation Check | What to Look For
Answer accuracy | Do results match what Topics returned?
Disambiguation | Does the system handle ambiguous terms correctly?
SQL quality | Review generated SQL via Explainability feature
Performance | Response time for complex queries
Security | Verify RLS/CLS enforced (test with different user roles)
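One lightweight way to track the accuracy check is to record the figure each experience returns for the same question and compare the two with a tolerance. The sketch below is a hypothetical harness: the questions and numbers are placeholders that you would replace with values observed in your own account.

```python
import math


def compare_answers(recorded, rel_tol=1e-6):
    """Split questions into matches and mismatches based on relative tolerance."""
    matches, mismatches = [], []
    for question, (topic_value, dataset_qa_value) in recorded.items():
        if math.isclose(topic_value, dataset_qa_value, rel_tol=rel_tol):
            matches.append(question)
        else:
            mismatches.append(question)
    return matches, mismatches


# Placeholder values, not real results.
recorded = {
    "Total revenue last quarter": (1250000.0, 1250000.0),
    "Average profit margin by region": (23.4, 21.9),
}
matches, mismatches = compare_answers(recorded)
print("Match:", matches)
print("Investigate:", mismatches)
```

A mismatch is not necessarily a Dataset Q&A error; the generated SQL shown with each answer indicates which filters and aggregations were applied.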

Advanced Scenarios

Migrating Calculated Fields

With Dataset Q&A, pre-built calculated fields become natural language instructions:

Topic Calculated Field | Dataset Q&A Equivalent
profit_margin = (revenue - cost) / revenue * 100 | Add to enrichment: "When asked about profit margin, calculate as (revenue - cost) / revenue * 100"
ytd_revenue = sumIf(revenue, year = current) | System handles natively; users ask "What's the YTD revenue?"
moving_avg_7d = window function | Dataset Q&A generates window functions automatically from natural language
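The first row of the table can be automated. As a sketch, the helper below turns the `calculated_fields` entries from the enrichment file into instruction sentences in the style shown above; the function name is hypothetical and the wording is only one possible phrasing.

```python
def calculated_field_instructions(calculated_fields):
    """Build 'When asked about X, calculate as Y' sentences from calculated fields."""
    instructions = []
    for field in calculated_fields:
        expression = field.get("expression")
        if not expression:
            continue  # nothing to describe without a formula
        # Turn identifiers such as profit_margin into readable phrases.
        label = field.get("field_name", "").replace("_", " ").strip()
        instructions.append(f"When asked about {label}, calculate as {expression}.")
    return instructions


calculated_fields = [
    {"field_name": "profit_margin", "expression": "(revenue - cost) / revenue * 100"},
]
for line in calculated_field_instructions(calculated_fields):
    print(line)
# When asked about profit margin, calculate as (revenue - cost) / revenue * 100.
```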

Topic-on-Sheet Experiences

  • Dataset Q&A currently operates through the chat interface, not the on-sheet search bar
  • Topics linked to sheets can remain in place while users adopt chat-based Dataset Q&A

Multi-Dataset Scenarios

  • Dataset Q&A supports selecting multiple datasets in the knowledge picker
  • The system auto-routes questions to the most relevant dataset
  • For cross-dataset analysis, organize related datasets into a Quick Space

Resources

  • Introducing Dataset Q&A — Launch blog covering the Dataset Q&A capability, how it works, and getting started.
  • Dataset Enrichment — User guide for uploading business context (field descriptions, synonyms, custom instructions) to improve query accuracy.
  • Chat Explanations — User guide for the Explanation feature, including how to view data sources, assumptions, and generated SQL for each answer.
  • Direct Query with S3 Tables — Launch blog on querying S3 Tables directly in Amazon Quick without importing into SPICE.

Frequently Asked Questions

  • Will Topics be deprecated? No. Topics remain fully supported and continue to evolve. For new Q&A deployments, Dataset Q&A with Enrichment is the recommended starting point.
  • Can I use both simultaneously? Yes. Users can query Topics on sheets and use Dataset Q&A via chat. The system auto-routes when "All data and apps" is selected.
  • Is Dataset Q&A available for SPICE? Yes. SPICE datasets work, though subject to SPICE capacity. Direct query datasets have no row limits.
  • Do I lose my Topic synonyms? No — export them into a Dataset Enrichment file. Your curation investment carries forward.
  • How does security work? Dataset Q&A automatically enforces existing RLS and CLS. No additional setup required.
  • Can users see the generated SQL? Yes. Chat Explainability shows generated SQL, reasoning, assumptions, and filters.

Quick Start: Getting Started in 5 Minutes

No configuration required — start querying immediately:

  1. Open Amazon Quick Chat (top-right navigation icon)
  2. Select your dataset via the knowledge picker (Add → Datasets → Choose dataset)
  3. Ask a question in natural language (e.g., "Describe the structure of this dataset")
  4. Review the answer and explore the Explainability panel
  5. (Optional) Upload a Dataset Enrichment file to improve disambiguation and accuracy

About the Authors

Amy Marvin is a Sr. Technical Product Manager for Amazon Quick, focused on AI-powered chat analytics capabilities. She is passionate about removing barriers between people and their data, making it possible for anyone to ask a question and get a trusted answer in seconds. Outside work, she enjoys exploring the DC restaurant scene and biking on nature trails.

Morgan Dutton is a Sr. Technical Program Manager at AWS focused on Amazon Quick, helping enterprise customers transform their productivity with AI-powered work assistants. Based in Seattle, Morgan works at the intersection of generative AI and business applications, enabling organizations to get more value from their data through natural language experiences. When not helping customers accelerate their AI journeys, Morgan can be found exploring the Pacific Northwest.
