Running Standard Deviation

Hi, I have a dataset of score and score percentile. I would like to get score top 10 percent, top 20 percent, top 30 percent etc.'s standard deviation (top 10 percent of scores means 100% - 90% score percentile, top 20 percent of scores means 100% - 80% score percentile etc.).

Calculated function StDev can only give me the standard deviation of score percentile 100% - 90%, score percentile 90% - 80%, score percentile 80% - 70% etc. but not the cumulative percentile standard deviation that I need.

Is there a way to get the running standard deviation?

Thank you!

Hi Tia_Song,

Please see if this percentileDisc function works for you.

If this function does not work, please provide some sample data and expected result in a spreadsheet? You can send directly to my email.

Thank you.

Alex

Hi Alex, my question is to calculate running standard deviation calculation but not percentile. I have score percentile column in my dataset. Will send over my sample dataset. Thank you.

image

To replicate the above use case with running standard deviation, try this approach:

  1. AvgTotal = avgOver({Score Measure}, [ ],PRE_FILTER)
  2. Deviation of Each Data Point = {Score Measure}-{1. AvgTotal}
  3. Sq Deviation = {2. Deviation of Each Data Point}^2
  4. sum up the sqed deviation = sumOver({3. Sq Deviation},[ ],PRE_FILTER)
  5. Sqrt sum of sqed deviation = sqrt({4. sum up the sqed deviation}/(countOver({Score Dimension},[ ],PRE_FILTER)-1))

Recommendations:

  1. SQL (Union All and Built-In Functionality)
  2. Amazon SageMaker Studio Data Wrangler / Numpy and Pandas - Workshop
  3. AWS Glue Studio (ETL) - Details
  4. For more advanced functionality RStudio on Amazon SageMaker - Details

image

To replicate the above use case with running standard deviation, try this approach:

  1. AvgTotal = avgOver({Score Measure}, [ ],PRE_FILTER)
  2. Deviation of Each Data Point = {Score Measure}-{1. AvgTotal}
  3. Sq Deviation = {2. Deviation of Each Data Point}^2
  4. sum up the sqed deviation = sumOver({3. Sq Deviation},[ ],PRE_FILTER)
  5. Sqrt sum of sqed deviation = sqrt({4. sum up the sqed deviation}/(countOver({Score Dimension},[ ],PRE_FILTER)-1))

Recommendations:

  1. SQL (Union All and Built-In Functionality)
  2. Amazon SageMaker Studio Data Wrangler / Numpy and Pandas - Workshop
  3. AWS Glue Studio (ETL) - Details
  4. For more advanced functionality RStudio on Amazon SageMaker - Details