A spurious correlation is a mathematical relationship in which two or more variables are associated but not causally related; the association arises either from coincidence or from a third, unseen factor. In other words, it appears as though changes in one variable cause changes in the other, but that is not actually happening. Spurious correlations can occur for several reasons, including confounding variables, small sample sizes, or arbitrary endpoints.
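To illustrate the small-sample cause mentioned above, here is a minimal simulation sketch. It draws pairs of independent random series, which have no relationship by construction, and counts how often their Pearson coefficient still looks "strong". The helper names, the 0.5 threshold, the sample sizes, and the trial count are all illustrative assumptions, not values from any particular study.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def share_of_strong_correlations(sample_size, trials=1000, threshold=0.5, seed=0):
    """Fraction of trials in which two INDEPENDENT series have |r| > threshold.

    All parameters are hypothetical choices made for this illustration.
    """
    rng = random.Random(seed)
    strong = 0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(sample_size)]
        ys = [rng.gauss(0, 1) for _ in range(sample_size)]
        if abs(pearson(xs, ys)) > threshold:
            strong += 1
    return strong / trials


small = share_of_strong_correlations(sample_size=5)
large = share_of_strong_correlations(sample_size=200)
print(f"share with |r| > 0.5, n=5:   {small:.2f}")
print(f"share with |r| > 0.5, n=200: {large:.2f}")
```

With only five observations per series, a sizeable share of unrelated pairs clears the threshold by chance alone; with 200 observations this should almost never happen.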
Some key points about spurious correlations include:
- Appearance of Causality: Spurious correlations can look like causal relationships both in statistical measures and in graphs, but the apparent causal link is not real.
- Detecting Spuriousness: The most obvious way to spot a spurious relationship in research findings is to use common sense. In studies, all variables that might affect the findings should be included in the statistical model to control for their effect on the dependent variable.
- Examples: Spurious correlations can appear as non-zero correlation coefficients and as patterns in a graph. For instance, U.S. crude oil imports from Norway correlate with the number of drivers killed in collisions with railway trains.
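The point about controlling for other variables can be sketched with synthetic data. In the example below, two quantities are both driven by a shared third factor and never influence each other. Their raw correlation is high, but it should largely vanish once the third factor is regressed out of both (a partial correlation). The variable names and coefficients are made up for illustration, and regressing on a single confounder is just one simple way to "include it in the model".

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, zs):
    """Residuals of a simple least-squares regression of ys on zs."""
    n = len(ys)
    mz, my = sum(zs) / n, sum(ys) / n
    slope = (
        sum((z - mz) * (y - my) for z, y in zip(zs, ys))
        / sum((z - mz) ** 2 for z in zs)
    )
    return [(y - my) - slope * (z - mz) for y, z in zip(ys, zs)]


rng = random.Random(0)
n = 5000

# Hypothetical synthetic data: temperature drives both series,
# and neither series affects the other.
temperature = [rng.gauss(25, 5) for _ in range(n)]
ice_cream_sales = [2.0 * t + rng.gauss(0, 5) for t in temperature]
sunburn_cases = [1.5 * t + rng.gauss(0, 5) for t in temperature]

raw_r = pearson(ice_cream_sales, sunburn_cases)

# Remove the part of each series explained by the confounder,
# then correlate what is left over.
partial_r = pearson(
    residuals(ice_cream_sales, temperature),
    residuals(sunburn_cases, temperature),
)

print(f"raw correlation:     {raw_r:.2f}")
print(f"partial correlation: {partial_r:.2f}")
```

The raw coefficient alone would suggest a strong link between the two series; the near-zero partial coefficient shows that the link is carried entirely by the shared factor.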
It is important to be aware of spurious correlations, as they can lead to incorrect conclusions and decisions.