Assumptions of Pearson’s Co-efficient of Correlation
Prof. Pearson’s co-efficient of correlation is based on the following assumptions:
1. Linear relationship
In devising the formula, Prof. Pearson assumed that there is a linear relationship between the variables, which means that if the paired values of the two variables are plotted on a scatter diagram, the points will lie on, or cluster closely around, a straight line.
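The role of the linearity assumption can be illustrated with a minimal sketch. The helper name `pearson_r` and the two small data sets are hypothetical and chosen only for illustration; the function uses the usual product-moment definition of r. A perfectly linear series gives r = 1, whereas a symmetric curved series gives r = 0 even though the two variables are exactly related.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation of two equal-length numeric sequences.

    Hypothetical helper for illustration; not taken from the source text.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Illustrative data (assumed, not from the source).
xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]   # points lie exactly on a straight line
curved = [x * x for x in xs]       # exact but non-linear (parabolic) relation

r_linear = pearson_r(xs, linear)   # 1.0: perfect linear relationship
r_curved = pearson_r(xs, curved)   # 0.0: r does not detect the curved relation
print(r_linear, r_curved)
```

The second result shows why the assumption matters: a value of r near zero rules out a linear relationship only, not every kind of relationship.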
2. Cause and effect relationship
Prof. Pearson assumed that there is a cause-and-effect relationship between the correlated variables, which means that a change in the value of one variable causes a change in the value of the other. According to him, without such a relationship, correlation would carry no meaning at all.
3. Normality of distribution
It is assumed that the populations from which the data are collected are normally distributed.
4. Multiplicity of causes
Prof. Pearson further assumed that each of the variables under study is affected by a multiplicity of causes, so that it forms a normal distribution. Variables such as age, height, weight, price, demand, supply, yield and temperature, which are usually taken to study correlation, are each affected by a multiplicity of causes.
5. Probable error of measurement
Prof. Pearson further assumed that some error may creep into the measurement of the co-efficient of correlation, but that the magnitude of such error must lie within a limit, called the probable error, which is obtained by the following formula:
$$\mathrm{PE}(r) = 0.6745 \times \frac{1 - r^2}{\sqrt{n}}$$
where r = co-efficient of correlation, and n = number of pairs of observations of the two variables.
If the constant 0.6745 is omitted from the above formula for probable error, we get the standard error of the co-efficient of correlation.
Thus, $\mathrm{SE}(r) = \dfrac{1 - r^2}{\sqrt{n}}$
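As a worked illustration, the two formulas can be sketched in a few lines of code. The function names `standard_error` and `probable_error` and the sample values r = 0.8, n = 100 are assumptions made for the example, not part of the original text.

```python
import math


def standard_error(r, n):
    """SE(r) = (1 - r^2) / sqrt(n), as given in the text."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if n < 1:
        raise ValueError("n must be a positive number of pairs")
    return (1 - r * r) / math.sqrt(n)


def probable_error(r, n):
    """PE(r) = 0.6745 * SE(r), as given in the text."""
    return 0.6745 * standard_error(r, n)


# Assumed sample values: r = 0.8 from n = 100 pairs.
se = standard_error(0.8, 100)   # (1 - 0.64) / 10, about 0.036
pe = probable_error(0.8, 100)   # 0.6745 * 0.036, about 0.0243
print(round(se, 4), round(pe, 4))
```

Both quantities shrink as n grows and as r approaches ±1, so a co-efficient obtained from many pairs is subject to a smaller probable error than the same value obtained from few.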
The above formula of probable error helps us in interpreting the significance of the co-efficient of correlation as follows:
- The correlation is taken to be almost absent if r < PE(r).
- The correlation is taken to be significant if r > 6 × PE(r).
- The correlation is taken to be moderate if PE(r) < r < 6 × PE(r).
- The limits of the correlation co-efficient of the population are given by ρ (rho) = r ± PE(r).
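The rules above can be sketched as a small classifier. The function name `interpret` and the sample inputs are hypothetical. Two points are assumptions not stated in the text: the magnitude |r| is compared with PE(r) so that negative co-efficients are handled sensibly, and a value falling exactly on a boundary is treated as moderate.

```python
import math


def probable_error(r, n):
    """PE(r) = 0.6745 * (1 - r^2) / sqrt(n), as given in the text."""
    return 0.6745 * (1 - r * r) / math.sqrt(n)


def interpret(r, n):
    """Return (label, (lower, upper)) following the probable-error rules.

    Assumption: |r| is compared with PE(r); the text states the rules for r.
    Assumption: exact boundary values are labelled "moderate".
    """
    pe = probable_error(r, n)
    size = abs(r)
    if size < pe:
        label = "almost absent"
    elif size > 6 * pe:
        label = "significant"
    else:
        label = "moderate"
    # Limits for the population co-efficient: rho = r +/- PE(r).
    return label, (r - pe, r + pe)


# Assumed sample inputs for illustration.
print(interpret(0.8, 100))  # PE is about 0.0243, so 0.8 exceeds 6 * PE
print(interpret(0.3, 25))   # PE is about 0.1228, so 0.3 lies between PE and 6 * PE
print(interpret(0.1, 25))   # PE is about 0.1336, so 0.1 is below PE
```

With these inputs the three calls fall in the significant, moderate and almost absent bands respectively, and the first returns population limits of roughly 0.776 to 0.824.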