Percentage distribution of values in Q topic

I have Quicksight topic with the following dataset

select Name, Major, Age, Graduated
from (values
(‘Matt’, ‘English’, 20, ‘False’),
(‘Sam’, ‘English’, 25, ‘True’),
(‘Abby’, ‘Biology’, 20, ‘False’),
(‘Nathan’, ‘Biology’, 25, ‘True’)
) as t(Name, Major, Age, Graduated)

Question 1: how can get it to answer the question ‘distribution of majors’ or ‘percentage distribution of majors’? It interprets it as ‘Percent of total number of Major by Name and Major.’ What I want is English 50%, Biology 50%.

Questions like ‘majors by graduated’ or ‘majors by name’ (silly questions for demonstration only) work fine.

Question 2. How can ask the question ‘show me the percentage who already graduated’. The graduated field is already identified as a Dimension/boolean. What I want is something like 50% graduated / didn’t graduate.

Thanks.

1 Like

Good afternoon,

  1. Distribution of Majors or Percentage Distribution of Majors:

To get the distribution or percentage distribution of majors, you can use the CALCULATE_PERCENT function in QuickSight. Here’s how you can do it:

a. Create a new calculated field with the following formula:

CALCULATE_PERCENT(COUNT(Name), CONTEXT_MEASURES([Major]))

This will calculate the percentage of each major out of the total number of records.

b. In the visual, you can use this calculated field as the value, and Major as the dimension. This will give you the percentage distribution of majors.

Alternatively, you can use the COUNT_DISTINCT function to get the distinct count of majors, and then calculate the percentage:

(COUNT_DISTINCT(Major) / SUM(COUNT_DISTINCT(Major))) * 100
  1. Percentage of Graduated vs. Non-Graduated:

To get the percentage of graduated and non-graduated students, you can use a similar approach:

a. Create a new calculated field with the following formula:

CALCULATE_PERCENT(COUNT(Name), CONTEXT_MEASURES([Graduated]))

This will calculate the percentage of graduated and non-graduated students.

b. In the visual, you can use this calculated field as the value, and Graduated as the dimension. This will give you the percentage of graduated and non-graduated students.

Alternatively, you can use the SUM function to get the total count of graduated and non-graduated students, and then calculate the percentage:

(SUM(CASE WHEN Graduated = 'True' THEN 1 ELSE 0 END) / SUM(1)) * 100 AS Graduated_Percent,
(SUM(CASE WHEN Graduated = 'False' THEN 1 ELSE 0 END) / SUM(1)) * 100 AS NonGraduated_Percent

By using these approaches, you can get the desired distribution and percentage information in your QuickSight visualizations.

2 Likes

Thanks @murili.

I am trying to figure out if it is possible to do that using a natural language in Q topics (with minimal calculated fields and enhancing the metadata of the fields or restructuring the data) rather than using classic QuicSight functions.

Note that the key to making these natural language questions work effectively is to ensure that the dataset metadata (field types, dimensions, etc.) is properly configured in QuickSight. If the fields are not recognized correctly, the natural language queries may not produce the desired results.

Question 1: “distribution of majors” or “percentage distribution of majors”

Ensure that the "Major" field is properly identified as a dimension in the dataset.
In the Q Topics, you can ask the following natural language question: "What is the percentage distribution of Major?" This should give you the percentage breakdown of the "Major" field, with the results showing 50% for "English" and 50% for "Biology".

To make this more explicit, you can also ask:
“What is the percentage of each Major?”

This should provide the same result, showing the percentage breakdown of each major.

Question 2: “show me the percentage who already graduated”

Ensure that the "Graduated" field is properly identified as a dimension in the dataset.
In the Q Topics, you can ask the following natural language question: "What is the percentage of Graduated?" This should give you the percentage of records that have "True" for the "Graduated" field, and the percentage of records that have "False" for the "Graduated" field.

To make this more explicit, you can also ask:
“What is the percentage of Graduated and Not Graduated?”

This should provide the same result, showing the percentage of graduated and non-graduated students.

Also, keep in mind that the natural language capabilities in Q Topics are continuously evolving, so the specific phrasing of the questions may need to be adjusted over time as the feature improves.

Hi @abewub,
It’s been awhile since we last heard from you. Did you have any additional questions regarding your initial topic or did the solution provided above assist with your answer?

If we do not hear back within the next 3 business days, I’ll go ahead and mark the solution.

Thank you!