Correlation
Recall the numerical range for Karl Pearson's coefficient of correlation.
Create a simple dataset with five pairs of values for variables X and Y that would demonstrate a perfect negative correlation (r = -1).
Define the term correlation in the context of statistics.
Name the two primary types of correlation based on the direction of change in variables.
Apply your understanding of correlation to a dataset showing that as the number of hours spent playing video games increases, a student's test scores decrease. What type of correlation is this?
Define negative correlation.
Recall the name of the statistical tool used to visually represent the relationship between two variables.
Examine a scatter diagram where points are clustered closely around a line that slopes downwards from left to right. What type of correlation does this demonstrate?
Analyze the meaning of a Karl Pearson correlation coefficient of -0.9 between study hours and failure rate.
Contrast positive and negative correlation by providing one real-world example for each and explaining how the variables move in relation to each other.
Formulate a research question concerning student satisfaction in a school where Spearman's rank correlation would be the only viable method of analysis.
Justify the statement: 'A correlation coefficient is a more effective measure of association than covariance'.
Explain why the correlation coefficient (r) has no unit of measurement.
Identify what a correlation coefficient of r = 0 indicates about the relationship between two variables.
Describe a situation where Spearman's rank correlation would be a more appropriate measure than Karl Pearson's coefficient.
Design an experiment to test the correlation between hours of sleep and cognitive performance for students. Justify which correlation method (Pearson's or Spearman's) would be more appropriate for your design.
Examine the validity of the statement: 'A Karl Pearson correlation coefficient of zero (r = 0) means there is no relationship whatsoever between the two variables.'
Analyze two key properties of Karl Pearson's correlation coefficient (r): its range and its invariance to change of origin and scale. Explain the significance of each property.
Justify why Spearman's rank correlation coefficient is considered more robust to outliers than Karl Pearson's coefficient.
Justify the importance of creating a scatter diagram before calculating Karl Pearson's coefficient of correlation.
Evaluate the decision to use Spearman's rank correlation to analyze the relationship between a country's GDP per capita and its 'Happiness Index' score, which is an ordered categorical variable.
Evaluate the usefulness of the property that the correlation coefficient 'r' is unaffected by the change of origin and scale.
Explain the concept of positive correlation using a suitable example.
Critique the statement: 'If the correlation coefficient between the height of fathers and sons is +0.6, it means that 60 percent of a son's height is explained by his father's height.'
List three properties of Karl Pearson's coefficient of correlation.
Summarize the primary purpose of constructing a scatter diagram in correlation analysis.
Describe the interpretation of the correlation coefficient (r) when its value is +1, -1, and close to zero.
Calculate Karl Pearson's coefficient of correlation for the following data on price (X) and supply (Y) of a commodity and analyze the result. X: [10, 20, 30], Y: [5, 10, 15].
Analyze the relationship between the number of firefighters at a fire and the amount of damage caused. A study finds a strong positive correlation. Does this mean firefighters cause more damage?
Justify the use of a correction factor in Spearman's formula when tied ranks are present.
Two judges in a competition gave the following ranks to five contestants. Judge 1: [1, 2, 3, 4, 5]. Judge 2: [5, 4, 3, 2, 1]. Calculate Spearman's rank correlation coefficient.
Critique the practice of calculating a correlation coefficient between a student's zip code and their exam scores.
Demonstrate the calculation of Spearman's rank correlation for the following data on marks in Maths (X) and Science (Y) for 4 students. X: [80, 65, 90, 70], Y: [75, 70, 85, 60].
Formulate a policy recommendation for a city council based on a newly discovered strong positive correlation (r = +0.8) between the number of public parks and the average property value in neighborhoods.
Examine why Karl Pearson's correlation coefficient (r) is a pure number and has no unit of measurement.
Analyze the finding that there is a high positive correlation between shoe size and vocabulary level in children. Does having bigger feet cause a child to know more words?
Explain the statement 'Correlation does not imply causation' with an example.
Evaluate why a perfect correlation (r = +1 or r = -1) is rare when dealing with real-world economic or social data.
Solve for the correlation coefficient between X and Y using the step-deviation method, given N=5, ΣU=0, ΣV=0, ΣU²=10, ΣV²=40, and ΣUV=15. Analyze the strength and direction of the relationship.
Summarize the key differences between Karl Pearson's method and Spearman's rank correlation method.
Propose why two variables, such as the number of schools in a city and the number of crimes, might show a strong positive correlation without one causing the other.
Propose a hypothetical business scenario where a finding of zero correlation between advertising expenditure and sales would be misleading, and justify your reasoning.
Compare and contrast Karl Pearson's coefficient of correlation with Spearman's rank correlation coefficient in terms of their application and sensitivity to data characteristics.
Calculate Spearman's rank correlation coefficient for the following data, applying the correction for tied ranks. X: [10, 20, 15, 20, 30], Y: [50, 60, 50, 70, 80].
Describe the five main patterns that can be identified from a scatter diagram.
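
To check answers to the calculation prompts above, a minimal Python sketch along the following lines may help. It assumes the commonly taught textbook formulas: the product-moment formula for Pearson's r, the step-deviation form using sums of deviations U and V, and Spearman's formula with (m^3 - m)/12 added to the sum of squared rank differences for each group of m tied values. The function names are illustrative only and do not come from any library, and the choice to give rank 1 to the largest value is an assumption the prompts do not specify.

```python
from collections import Counter
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation from deviations about the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pearson_r_from_sums(n, sum_u, sum_v, sum_u2, sum_v2, sum_uv):
    """Step-deviation form: r computed from sums of the deviations U and V."""
    numerator = n * sum_uv - sum_u * sum_v
    denominator = sqrt((n * sum_u2 - sum_u ** 2) * (n * sum_v2 - sum_v ** 2))
    return numerator / denominator


def average_ranks(values):
    """Rank with 1 = largest (an assumption); ties share the mean of their positions."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 + (order.count(v) - 1) / 2 for v in values]


def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every group of m equal values (0 when no ties)."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values())


def spearman_rho(x, y):
    """Spearman's formula with the textbook correction factor for tied ranks."""
    n = len(x)
    rank_x = average_ranks(x)
    rank_y = average_ranks(y)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    correction = tie_correction(x) + tie_correction(y)
    return 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))
```

Under these assumptions, `pearson_r([10, 20, 30], [5, 10, 15])` evaluates to 1.0, `pearson_r_from_sums(5, 0, 0, 10, 40, 15)` to 0.75, `spearman_rho([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])` to -1.0, `spearman_rho([80, 65, 90, 70], [75, 70, 85, 60])` to 0.8, and `spearman_rho([10, 20, 15, 20, 30], [50, 60, 50, 70, 80])` to 0.9. When ties are present, computing Pearson's r directly on the average ranks gives a slightly different value from the correction-factor formula, so the two should not be mixed when comparing against a worked answer.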