I’m currently using Amazon Quick to upload CSV / JSON files (via Spaces or chat file attachments). These files contain website sales and transaction data, and I want to query the uploaded documents to retrieve specific information, such as the list of products purchased by a particular member email.
However, I’m running into recurring accuracy issues:
In some cases, Quick responds that it cannot find any relevant data, even though the data clearly exists in the uploaded file.
In other cases, it does find related records, but some details are incorrect, or the response appears to include AI hallucinations (for example, wrong product details or mismatched purchase records).
Sometimes both cases occur, seemingly at random, in the same environment with the same settings, and I have not yet found the cause.
Because this involves sales and customer data, accuracy is extremely important. These inconsistencies make it difficult for us to confidently adopt this system in a stable production environment.
I’d like to ask a few questions:
Are there any best practices for improving the accuracy and consistency of document-based queries in Amazon Quick?
Is there a way to restrict answers to only the uploaded CSV/JSON content, instead of allowing the model to infer or generate information?
I noticed that Amazon Q Business provides configuration options to reduce hallucinations or enforce more conservative response modes.
→ Does Amazon Quick offer any similar settings or controls?
Does the file size, number of rows, or character/token length of CSV/JSON files affect answer accuracy?
If so, are there recommended approaches (such as file splitting, schema normalization, indexing, or pre-filtering) to reduce incorrect results?
If some level of inaccuracy is expected by design, I’d like to better understand the practical limits of accuracy and whether others have successfully deployed similar use cases in production.
Any insights, documentation, or real-world experience would be greatly appreciated.
Thank you very much!
The file upload + chat experience is intended for exploration and summarization, not deterministic, row-level querying. When asking questions like “which products were purchased by this exact email,” it’s common to see missed records, partial matches, or inferred details, especially as file size or complexity increases. This is an inherent limitation of using document-based RAG queries on structured data, not a data quality issue on your side.
While Amazon Q Business includes hallucination-mitigation controls (see “Hallucination mitigation in Amazon Q Business” in the Amazon Q Business documentation), AWS explicitly documents that hallucination mitigation “isn’t supported for responses generated from tabular data”. The feature only applies to textual data, so even those controls would not address CSV/JSON accuracy issues.
File size and structure do matter. Chat uploads have hard limits (CSV/Excel files must be less than 5MB), and even below those limits answer quality can degrade with large row counts, wide schemas, or long text fields. Splitting files and normalizing schemas can help, but only to a point.
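As a rough illustration of the file-splitting idea, here is a minimal Python sketch that cuts CSV text into parts that each repeat the header row and stay under a byte limit. The function name split_csv_text and the sample columns are hypothetical, and the 5MB figure is simply the chat-upload limit mentioned above; this is not an official Quick tool, and smaller files only reduce the problem rather than remove it.

```python
import csv
import io

# Chat-upload limit quoted above; in practice you may want extra headroom.
MAX_BYTES = 5 * 1024 * 1024


def split_csv_text(text, max_bytes=MAX_BYTES):
    """Split CSV text into chunks that each repeat the header row and
    stay below max_bytes when encoded as UTF-8 (unless a single row
    is itself larger than the limit)."""
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader)

    def encode(row):
        # Re-serialize one row so quoting and escaping stay valid CSV.
        buf = io.StringIO(newline="")
        csv.writer(buf).writerow(row)
        return buf.getvalue()

    header_line = encode(header)
    header_size = len(header_line.encode("utf-8"))

    chunks = []
    current, size = [header_line], header_size
    for row in reader:
        line = encode(row)
        line_size = len(line.encode("utf-8"))
        # Start a new chunk before this row would reach the limit.
        if size + line_size >= max_bytes and len(current) > 1:
            chunks.append("".join(current))
            current, size = [header_line], header_size
        current.append(line)
        size += line_size
    if len(current) > 1:
        chunks.append("".join(current))
    return chunks


if __name__ == "__main__":
    sample = "member_email,product\n" + "a@example.com,Widget\n" * 5
    for i, part in enumerate(split_csv_text(sample, max_bytes=60), start=1):
        print(f"part {i}: {len(part.encode('utf-8'))} bytes")
```

Splitting by a meaningful key (for example one file per month or per member segment) is probably more useful than splitting purely by size, but that depends on how the data is queried.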
Yes, this is pretty normal for Amazon Quick. It’s not a strict query engine; it does semantic retrieval and inference, so accuracy can vary, especially with CSV and JSON.
A few important points:
Big or wide files give less accurate answers, so split the data, clean up schemas, and normalize columns.
There is no real “only use uploaded data” mode, so a little hallucination is normal.
CSV/JSON without a consistent structure makes it hard for retrieval to find the right records.
Quick is not the same as Amazon Q Business; it has no comparable strict response controls.
Most people preprocess data or pair it with a deterministic layer (SQL/Athena) for production use and only use Quick for exploration.
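To make the "deterministic layer" point concrete, here is a minimal sketch of an exact lookup by member email. It uses Python's built-in sqlite3 as a local stand-in for a real SQL engine such as Athena, and the table name, column names (member_email, product, order_id), and sample rows are all made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; real exports will have different columns.
SAMPLE_CSV = """member_email,product,order_id
alice@example.com,Blue Mug,1001
bob@example.com,Notebook,1002
alice@example.com,Desk Lamp,1003
"""


def load_sales(conn, csv_text):
    """Load CSV text into a 'sales' table (assumed three-column schema)."""
    conn.execute(
        "CREATE TABLE sales (member_email TEXT, product TEXT, order_id TEXT)"
    )
    rows = csv.DictReader(io.StringIO(csv_text))
    conn.executemany(
        "INSERT INTO sales VALUES (:member_email, :product, :order_id)", rows
    )


def products_for(conn, email):
    """Return every product for an exact (case-insensitive) email match."""
    cur = conn.execute(
        "SELECT product FROM sales "
        "WHERE lower(member_email) = lower(?) ORDER BY order_id",
        (email.strip(),),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_sales(conn, SAMPLE_CSV)
    print(products_for(conn, "alice@example.com"))
```

A query like this either returns every matching row or an empty list, so "no data found" is a fact about the data rather than a retrieval miss. The exact result set can then be handed to Quick for summarization if a natural-language answer is still wanted.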
In short, this is normal behavior, not a bug; it’s just how Amazon Quick works right now.
Just checking back in since we haven’t heard from you in a bit. I wanted to see if the guidance shared earlier helped resolve your question, or if you found a solution in the meantime.
If you still have any additional questions related to your initial post, feel free to share them. Otherwise, any update you’re able to provide within the next 3 business days would be helpful for the community.