Key Points
Connecting the Dots...
Statistical Question
A statistical question is one that can be answered by collecting data where variability is expected. For example, "How tall are the students in my class?" expects a range of answers, not a single one.
Arithmetic Mean or Average
The mean is a representative value calculated by summing all data values and dividing by the number of values. The formula is $\text{Mean} = \dfrac{\text{Sum of all data values}}{\text{Number of values}}$.
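As a minimal sketch of this calculation in Python (the function name `mean` and the scores are made up for illustration):

```python
def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("mean needs at least one value")
    return sum(values) / len(values)


# Hypothetical test scores, invented for this example
scores = [6, 8, 10, 12, 14]
print(mean(scores))  # 50 / 5 = 10.0
```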
Median: The Middle Value
The median is the middle value of a dataset that has been sorted in ascending or descending order. It is a measure of central tendency that divides the data into two equal halves.
Calculating Median for an Odd Number of Values
If a sorted dataset has an odd number of values, the median is the single value in the middle position. For example, for the data 3, 5, 7, 9, 11, the median is 7.
Calculating Median for an Even Number of Values
If a sorted dataset has an even number of values, the median is the average of the two middle values. For example, for the data 4, 6, 8, 10, the median is $\dfrac{6 + 8}{2} = 7$.
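The odd and even cases can be sketched together in one Python function. The name `median` and the sample lists are illustrative assumptions, not part of the notes.

```python
def median(values):
    """Return the middle value of the data, sorting it first."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median needs at least one value")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: one value sits exactly in the middle
        return ordered[mid]
    # Even count: average the two values around the middle
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([9, 3, 7, 5, 11]))  # sorted: 3, 5, 7, 9, 11 -> 7
print(median([4, 10, 6, 8]))     # sorted: 4, 6, 8, 10 -> (6 + 8) / 2 = 7.0
```

Note that the data is sorted inside the function, so the input can be given in any order.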
Outliers in a Dataset
An outlier is a data value that is significantly higher or lower than the other values in a dataset. For example, in the set 4, 5, 6, 5, 40, the value 40 is an outlier.
Effect of Outliers on the Mean
The mean is strongly affected by outliers because its calculation includes every value. A very high outlier will increase the mean, while a very low outlier will decrease it.
Effect of Outliers on the Median
The median is not significantly affected by outliers. Since it only depends on the middle position, extreme values at the ends do not change it much, making it a better measure for skewed data.
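To see both effects side by side, here is a small sketch using Python's standard `statistics` module. The data values are invented for illustration.

```python
from statistics import mean, median

# Hypothetical data: five typical values, then the same data plus one high outlier
typical = [10, 12, 11, 13, 14]
with_outlier = typical + [90]

# Without the outlier, the mean and median are both 12
print(mean(typical), median(typical))

# With the outlier, the mean jumps to 25 while the median only moves to 12.5
print(mean(with_outlier), median(with_outlier))
```

One extreme value more than doubles the mean here, while the median shifts by only half a unit.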
Mean, Median, and Data Distribution
If $\text{mean} > \text{median}$, the data is likely skewed by a high outlier. If $\text{mean} < \text{median}$, it is likely skewed by a low outlier. If the mean and median are close, the data is likely symmetric or balanced.
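This comparison can be sketched as a small Python helper. The function name `describe_shape` and the `tolerance` parameter (how close counts as "close") are assumptions made for this example; the notes do not define a cutoff.

```python
from statistics import mean, median


def describe_shape(values, tolerance=0.5):
    """Compare mean and median to suggest which way the data is pulled.

    `tolerance` is a hypothetical cutoff for treating the two as "close".
    """
    m, med = mean(values), median(values)
    if abs(m - med) <= tolerance:
        return "roughly symmetric"
    return "pulled high" if m > med else "pulled low"


print(describe_shape([10, 11, 12, 13, 14]))      # mean 12, median 12
print(describe_shape([10, 12, 11, 13, 14, 90]))  # mean 25, median 12.5
print(describe_shape([1, 50, 52, 51, 53, 54]))   # mean 43.5, median 51.5
```

The result is only a hint about the shape of the data, so it is worth checking against a plot.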
Range as a Measure of Spread
The range of a dataset describes its variability or spread. It is calculated as $\text{Range} = \text{Highest value} - \text{Lowest value}$. A larger range indicates greater spread in the data.
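A minimal Python sketch of the range, using an illustrative function name `data_range` and made-up values:

```python
def data_range(values):
    """Return the spread of the data: highest value minus lowest value."""
    if not values:
        raise ValueError("range needs at least one value")
    return max(values) - min(values)


print(data_range([12, 7, 19, 15]))  # 19 - 7 = 12
```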
Dot Plots for Data Visualization
A dot plot displays data points as dots above a number line. It is useful for quickly seeing the distribution, spread, clusters, gaps, and outliers in a dataset.
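As a rough illustration, a text version of a dot plot can be sketched in Python. This sketch is turned sideways (values run down the left edge instead of along a number line) and assumes whole-number data. The name `dot_plot` is hypothetical.

```python
from collections import Counter


def dot_plot(values):
    """Build a sideways text dot plot: one row per whole number, one 'o' per data point."""
    counts = Counter(values)
    rows = []
    for v in range(min(values), max(values) + 1):
        # Rows with no dots are kept so that gaps in the data stay visible
        rows.append(f"{v:>3} | {'o' * counts[v]}".rstrip())
    return "\n".join(rows)


print(dot_plot([2, 3, 3, 5]))
```

In the printed plot, the row for 3 has two dots (a small cluster) and the row for 4 is empty (a gap).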
Clustered Bar Graphs for Comparison
A clustered or double bar graph displays two or more sets of data side-by-side for the same categories. This makes it easy to compare the values between the different sets for each category.
Zero Value vs. Missing Data
When calculating the mean, a data value of 0 must be included in the sum and counted in the total number of values. Missing data, however, should be excluded from both the sum and the count.
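The difference shows up clearly in a short Python sketch. Here a missing entry is represented by `None`, which is an assumption of this example, as are the function name and the data.

```python
def mean_ignoring_missing(values):
    """Mean of the recorded values; None marks a missing entry and is skipped."""
    recorded = [v for v in values if v is not None]
    if not recorded:
        raise ValueError("no recorded values to average")
    return sum(recorded) / len(recorded)


# A real 0 is counted as a value: (4 + 0 + 8) / 3
print(mean_ignoring_missing([4, 0, 8]))     # 4.0

# A missing entry is left out of both sum and count: (4 + 8) / 2
print(mean_ignoring_missing([4, None, 8]))  # 6.0
```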
Interpreting Graphs in Two Steps
To understand a graph, first identify what is given: axes, scale, labels, and patterns. Second, infer from the data by analyzing and interpreting your observations to draw conclusions.
Quick Revision Tips
- Review these points before exams
- Make flashcards for better retention
- Connect points to real-world examples
- Practice explaining each point in your own words