2.8, 2.9 - Measuring grouped data
Measures of Median and Mean on Grouped Data
2.7 - Connecting measures together focused on calculating descriptive statistics on raw data, where we see every individual observation. Now, we'll focus on calculating the same descriptive statistics on grouped data.
Mean and median of data in a Frequency Table
| Student Score | Frequency |
|---|---|
| 3 | 1 |
| 4 | 1 |
| 5 | 3 |
| 6 | 5 |
| 7 | 5 |
| 8 | 7 |
| 9 | 5 |
| 10 | 3 |
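Because every score is listed with its frequency, we could recreate the original data set by repeating each score as many times as its frequency indicates. No information is lost, so the mean and median computed from the table are exact.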
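Sample mean

$$\bar{x} = \frac{\sum f_i x_i}{n}, \qquad n = \sum f_i$$

Population mean

$$\mu = \frac{\sum f_i x_i}{N}, \qquad N = \sum f_i$$

Here $x_i$ is each distinct value and $f_i$ is its frequency.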
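As a small illustration of the frequency-table calculation, the sketch below rebuilds the raw scores from the table above and checks that the mean computed directly from the table agrees with the mean of the rebuilt data. The variable names (`score_freq`, `raw_scores`, `table_mean`) are chosen here for illustration only.

```python
from statistics import mean, median

# Frequency table from the section above: score -> frequency
score_freq = {3: 1, 4: 1, 5: 3, 6: 5, 7: 5, 8: 7, 9: 5, 10: 3}

# Recreate the original data set by repeating each score `freq` times
raw_scores = [score for score, freq in score_freq.items() for _ in range(freq)]

# Total number of observations
n = sum(score_freq.values())

# Mean straight from the table: sum of (value * frequency) divided by n
table_mean = sum(score * freq for score, freq in score_freq.items()) / n

# Median from the rebuilt data (30 values, so the 15th and 16th are averaged)
table_median = median(raw_scores)

print(n)                     # 30
print(round(table_mean, 4))  # 7.2667
print(table_median)          # 7.5
```

Rebuilding the data is only practical for small tables; the weighted sum gives the same mean without expanding anything.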
Mean and median of data in a relative frequency table
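Sample mean

$$\bar{x} = \sum x_i r_i$$

Population mean

$$\mu = \sum x_i r_i$$

Here $r_i$ is the relative frequency of the value $x_i$, that is, its frequency divided by the total number of observations, so the $r_i$ sum to 1.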
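The same score table can be worked through with relative frequencies. The sketch below is one possible way to do it: it uses `Fraction` so the cumulative sums are exact, and the helper `median_from_rel_freq` is a name introduced here, not something from the notes. It assumes every listed value has a positive relative frequency.

```python
from fractions import Fraction

# Frequency table from the section above: score -> frequency
score_freq = {3: 1, 4: 1, 5: 3, 6: 5, 7: 5, 8: 7, 9: 5, 10: 3}
n = sum(score_freq.values())

# Relative frequency of each score; Fraction keeps the arithmetic exact
rel_freq = {score: Fraction(freq, n) for score, freq in score_freq.items()}

# Mean: sum of value * relative frequency (no further division needed)
rel_mean = sum(score * r for score, r in rel_freq.items())


def median_from_rel_freq(rel_freq):
    """Median from cumulative relative frequencies (hypothetical helper)."""
    values = sorted(rel_freq)
    cumulative = Fraction(0)
    for i, value in enumerate(values):
        cumulative += rel_freq[value]
        if cumulative > Fraction(1, 2):
            # More than half of the data lies at or below this value
            return Fraction(value)
        if cumulative == Fraction(1, 2):
            # Exactly half lies at or below this value, so the median is
            # midway between this value and the next one
            return Fraction(value + values[i + 1], 2)
    raise ValueError("relative frequencies must sum to 1")


rel_median = median_from_rel_freq(rel_freq)

print(round(float(rel_mean), 4))  # 7.2667
print(float(rel_median))          # 7.5
```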
Weighted measures
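Weighted measures are calculated exactly as with relative frequencies, with each weight taking the place of a relative frequency. If the weights do not sum to 1, divide the weighted sum by the total weight.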
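To make the link between weights and relative frequencies concrete, here is a minimal sketch of a weighted mean. The function name `weighted_mean` is an assumption made for this example. It is applied to the score table from earlier, once with the raw frequencies as weights and once with the relative frequencies; both should give the same result.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of value * weight divided by the total weight."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Scores and frequencies from the frequency table earlier in these notes
scores = [3, 4, 5, 6, 7, 8, 9, 10]
frequencies = [1, 1, 3, 5, 5, 7, 5, 3]

# Frequencies used directly as weights
mean_from_freq = weighted_mean(scores, frequencies)

# Relative frequencies used as weights (they sum to 1 up to rounding)
n = sum(frequencies)
relative = [f / n for f in frequencies]
mean_from_rel = weighted_mean(scores, relative)

print(round(mean_from_freq, 4))  # 7.2667
print(round(mean_from_rel, 4))   # 7.2667
```

Dividing by the total weight means the function works whether or not the weights have already been normalised.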
Central measures on grouped data with loss of information
Previously, we only worked with data where the exact values were known to us. Now, we'll focus on situations where we are only given intervals of data.
| Age Intervals | Frequency | Relative Frequency |
|---|---|---|
| [20, 30) | 1441 | |
| [30, 40) | 2477 | |
| [40, 50) | 4971 | |
| [50, 60) | 7438 | |
| [60, 70) | 6367 | |
| [70, 80) | 2314 | |
| 80+ | 458 | |
| TOTAL | 25466 | |
We cannot compute the exact mean or median, because we do not know the exact values inside each age interval. We can only estimate them.
Median
We find the interval in which the median falls, using the cumulative frequencies.
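One way to locate the median interval is to accumulate the frequencies until at least half of the observations are covered. The sketch below does this for the age table above. Each class is identified by its lower bound only, and the names `lower_bounds` and `median_index` are introduced here for illustration.

```python
from itertools import accumulate

# Lower bound of each age class and its frequency, from the table above
lower_bounds = [20, 30, 40, 50, 60, 70, 80]
frequencies = [1441, 2477, 4971, 7438, 6367, 2314, 458]

n = sum(frequencies)

# Running total of observations up to and including each class
cumulative = list(accumulate(frequencies))

# The median class is the first one whose cumulative frequency reaches n / 2
median_index = next(i for i, c in enumerate(cumulative) if c >= n / 2)

print(n)                           # 25466
print(cumulative[median_index])    # 16327
print(lower_bounds[median_index])  # 50
```

This only identifies the class that contains the median (the one starting at age 50); the exact median inside that class cannot be recovered from the table.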
Mean
To calculate the mean, we can use the midpoint of each interval as a stand-in for every value in that interval.
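Sample mean

$$\bar{x} \approx \frac{\sum f_i m_i}{n}, \qquad n = \sum f_i$$

Population mean

$$\mu \approx \frac{\sum f_i m_i}{N}, \qquad N = \sum f_i$$

Here $m_i$ is the midpoint of interval $i$ and $f_i$ is its frequency.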
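The midpoint estimate can be sketched for the age table as follows. Two assumptions are made here that the notes do not state: each closed class spans ten years (so the first class covers ages 20 up to but not including 30, with midpoint 25), and the open-ended 80+ class is treated as if it ran from 80 to 90, giving a midpoint of 85. A different choice for the last class would shift the estimate slightly.

```python
# Assumed midpoints: ten-year classes starting at 20, 30, ..., 70,
# plus an ASSUMED midpoint of 85 for the open-ended 80+ class
midpoints = [25, 35, 45, 55, 65, 75, 85]
frequencies = [1441, 2477, 4971, 7438, 6367, 2314, 458]

n = sum(frequencies)

# Estimated mean: every observation is treated as sitting at its class midpoint
grouped_mean = sum(m * f for m, f in zip(midpoints, frequencies)) / n

print(round(grouped_mean, 2))  # 54.26
```

The result is an estimate, not the true mean, because the actual ages inside each class are unknown.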
Variance and Standard Deviation on Grouped Data
Measures of spread
When we take a sample from a population, the variance and standard deviation calculated with the population formulas tend to come out lower than the actual population values. For this reason, we introduce a slight variation for sample calculations: we divide by n − 1 instead of n.
For grouped data, we again use the midpoint of each interval to calculate these measures.
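Variance for sample data

$$s^2 \approx \frac{\sum f_i (m_i - \bar{x})^2}{n - 1}$$

Standard deviation for sample data

$$s = \sqrt{s^2}$$

Here $m_i$ is the midpoint of interval $i$, $f_i$ is its frequency, $\bar{x}$ is the grouped mean, and $n = \sum f_i$.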
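The grouped variance and standard deviation can be sketched in the same way as the grouped mean. The example below treats the age table as a sample and, purely for comparison, also computes the population version that divides by n. As before, the midpoints assume ten-year classes and an assumed midpoint of 85 for the open-ended 80+ class; neither assumption comes from the notes.

```python
from math import sqrt

# Assumed midpoints (85 for the open-ended 80+ class is an ASSUMPTION)
midpoints = [25, 35, 45, 55, 65, 75, 85]
frequencies = [1441, 2477, 4971, 7438, 6367, 2314, 458]

n = sum(frequencies)
grouped_mean = sum(m * f for m, f in zip(midpoints, frequencies)) / n

# Sum of squared deviations of the midpoints from the grouped mean,
# each weighted by how many observations fall in that class
squared_deviations = sum(
    f * (m - grouped_mean) ** 2 for m, f in zip(midpoints, frequencies)
)

# Sample version divides by n - 1; population version divides by n
sample_variance = squared_deviations / (n - 1)
population_variance = squared_deviations / n

sample_sd = sqrt(sample_variance)
population_sd = sqrt(population_variance)

print(sample_variance > population_variance)  # True
```

With a table this large, the two versions differ only slightly; the n − 1 correction matters most for small samples.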
Those can also be derived for relative frequency tables, but only for populations, because for samples, the addition of