Key Points

Correlation

12 Sections
  • Definition of Correlation

    Correlation is a statistical measure that describes the direction and intensity of the relationship between two variables. It examines whether a change in one variable is accompanied by a change in the other.

  • Correlation Does Not Imply Causation

    A fundamental principle is that correlation measures covariation, not causation. A strong relationship between two variables does not prove that one variable causes the change in the other, as a third factor could be involved.

  • Types of Correlation: Positive and Negative

    Correlation is positive when variables move in the same direction (e.g., income and consumption). It is negative when they move in opposite directions (e.g., price and demand).

  • Scatter Diagram: Visualizing Relationships

    A scatter diagram is a graph that plots pairs of values for two variables to visually inspect their relationship. The pattern of the points indicates the direction (upward or downward) and strength of the correlation.

  • Karl Pearson's Coefficient of Correlation (r)

    This is a precise numerical measure of the degree of linear relationship between two variables. It is also known as the product moment correlation coefficient and is suitable for quantitative data.
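
    As a minimal sketch of the product moment calculation, the function below divides the sum of products of deviations from the means by the square root of the product of the two sums of squared deviations. The function name pearson_r and the income/consumption figures are hypothetical and used only for illustration.

```python
import math

def pearson_r(x, y):
    """Product moment correlation coefficient for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Deviations of each observation from its own mean
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    if sum_dx2 == 0 or sum_dy2 == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)

# Hypothetical data, invented for illustration only
income = [10, 20, 30, 40, 50]
consumption = [8, 15, 24, 31, 40]
r = pearson_r(income, consumption)
print(round(r, 3))  # close to +1: a strong positive linear relationship
```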

  • Properties of Pearson's Coefficient (r)

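    The value of r lies between -1 and +1, inclusive. It is a pure number without any units. Its magnitude is unaffected by a change of origin (adding or subtracting a constant) or a change of scale (multiplying or dividing by a non-zero constant); only a negative scale factor applied to one variable reverses its sign.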
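
    The independence from origin and scale can be checked numerically. The sketch below is illustrative only: the data and the constants (5, 10, 3, 100) are arbitrary assumptions, and pearson_r is a hypothetical helper. It also shows that multiplying one variable by a negative constant reverses the sign of r while leaving its magnitude unchanged.

```python
import math

def pearson_r(x, y):
    """Pearson's r from deviations about the means (hypothetical helper)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Arbitrary illustrative data
x = [2, 4, 5, 7, 9]
y = [1, 3, 4, 8, 9]

r_original = pearson_r(x, y)

# Change of origin and scale: shift x by 5 and divide by 10,
# multiply y by 3 and shift by 100
u = [(xi - 5) / 10 for xi in x]
v = [3 * yi + 100 for yi in y]
r_transformed = pearson_r(u, v)

# A negative scale factor on one variable flips the sign of r
r_flipped = pearson_r([-xi for xi in x], y)

print(math.isclose(r_original, r_transformed))  # True
print(math.isclose(r_flipped, -r_original))     # True
```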

  • Interpreting the Value of r

    A value of +1 indicates a perfect positive linear relationship, -1 indicates a perfect negative linear relationship, and 0 indicates no linear relationship (the variables may still be related in a non-linear way). Values close to +1 or -1 show a strong linear relationship; values close to 0 show a weak one.

  • Spearman's Rank Correlation

    Developed by C.E. Spearman, this method measures the correlation between the ranks assigned to data, not their actual values. It is used for qualitative data that can be ranked, or when the relationship is non-linear but moves consistently in one direction.

  • When to Use Rank Correlation

    Spearman's correlation is preferred when data cannot be measured precisely (like honesty or beauty) or when the dataset contains extreme values (outliers) that would distort Pearson's coefficient.

  • Calculating Spearman's Correlation (rs)

    The formula for Spearman's rank correlation is rs = 1 - [6 * sum(D^2) / (n^3 - n)], where D is the difference between the ranks for each observation and n is the number of observations.
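
    The formula above can be sketched in a few lines. The function name spearman_rs and the two judges' rankings are hypothetical, and the sketch assumes that ranks have already been assigned and that no ranks are tied.

```python
def spearman_rs(rank_x, rank_y):
    """rs = 1 - 6 * sum(D^2) / (n^3 - n), assuming no tied ranks."""
    n = len(rank_x)
    if n != len(rank_y) or n < 2:
        raise ValueError("both rank lists must have the same length (at least 2)")
    # D is the difference between the two ranks of each observation
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n ** 3 - n)

# Hypothetical ranks given by two judges to five contestants
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

# D = -1, 1, -1, 1, 0, so sum(D^2) = 4 and n^3 - n = 120
rs = spearman_rs(judge_a, judge_b)
print(round(rs, 2))  # 0.8
```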

  • Handling Tied Ranks

    When two or more items have the same value, each is assigned the average of the ranks they would otherwise have occupied. A correction factor must then be applied to the formula: for every group of m tied items, (m^3 - m) / 12 is added to sum(D^2) before rs is computed.
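
    A minimal sketch of tie handling follows. It assumes the largest value receives rank 1 (ranking in the opposite order works equally well if applied to both variables) and uses the commonly taught correction of adding (m^3 - m) / 12 to sum(D^2) for each group of m tied values. All function names and the marks are hypothetical.

```python
from collections import Counter

def average_ranks(values):
    """Rank values with the largest as rank 1; tied values share the average rank."""
    ordered = sorted(values, reverse=True)
    positions = {}
    # Record every 1-based position that each distinct value occupies
    for pos, v in enumerate(ordered, start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every group of m tied values."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

def spearman_rs_with_ties(x, y):
    """Spearman's formula with the tie correction added to sum(D^2)."""
    n = len(x)
    rx, ry = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    adjusted = sum_d2 + tie_correction(x) + tie_correction(y)
    return 1 - (6 * adjusted) / (n ** 3 - n)

# Hypothetical marks: the two 40s would occupy ranks 2 and 3, so each gets 2.5
marks = [50, 40, 40, 30]
print(average_ranks(marks))   # [1.0, 2.5, 2.5, 4.0]
print(tie_correction(marks))  # 0.5
```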

  • Linear vs. Non-Linear Relationships

    Karl Pearson's method is only suitable for measuring linear relationships. Spearman's rank correlation can be used even when the relationship between variables is non-linear, as long as the direction is consistent.
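
    The difference can be illustrated with a relationship that always rises but is not a straight line. In the sketch below, y = x^3 is an arbitrary choice of curve, the helper names are hypothetical, and the ranking helper assumes there are no tied values. Pearson's r falls short of 1 because the points do not lie on a line, while Spearman's rs equals 1 because the ranks agree exactly.

```python
import math

def pearson_r(x, y):
    """Pearson's r from deviations about the means (hypothetical helper)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simple_ranks(values):
    """Rank with the smallest value as rank 1; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rs(x, y):
    """rs = 1 - 6 * sum(D^2) / (n^3 - n), computed from the raw values."""
    n = len(x)
    rx, ry = simple_ranks(x), simple_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - (6 * sum_d2) / (n ** 3 - n)

# A consistently increasing but curved relationship
x = [1, 2, 3, 4, 5, 6]
y = [xi ** 3 for xi in x]

r = pearson_r(x, y)
rs = spearman_rs(x, y)
print(r < 1)   # True: the curve is not a straight line
print(rs)      # 1.0: the ranks of x and y agree perfectly
```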

Quick Revision Tips

  • Review these points before exams
  • Make flashcards for better retention
  • Connect points to real-world examples
  • Practice explaining each point in your own words