A correlation exists between two variables when one of them is related to the other in some way. The values of the two variables can be grouped in pairs.
•A linear correlation exists if there is a straight-line relationship between two variables. It is also called simple (Pearson) correlation.
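The linear correlation coefficient measures the strength of the linear relationship between the paired values representing quantitative data in a sample.
It is represented by r. For a given sample with two variables, X and Y, the linear correlation coefficient r can be computed by the formula:
$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{n\sum X^2 - \left(\sum X\right)^2}\,\sqrt{n\sum Y^2 - \left(\sum Y\right)^2}}$$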
n: Number of data pairs (sample size)
X: X-value (independent variable, e.g., cost of pizza slice)
Y: Y-value (dependent variable, e.g., subway fare)
r: Linear correlation coefficient for a sample (measures the strength and direction of the linear relationship between X and Y)
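As a minimal sketch, the standard computational formula for r, r = (nΣXY − ΣXΣY) / (√(nΣX² − (ΣX)²) · √(nΣY² − (ΣY)²)), can be written in Python using only the quantities defined above. The function name `pearson_r` and the small data set are illustrative assumptions, not part of the guide.

```python
import math

def pearson_r(xs, ys):
    """Sample linear correlation coefficient r for paired values (hypothetical helper)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must contain the same number of values")
    n = len(xs)                                  # number of data pairs
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    # The denominator is zero if either variable is constant; r is undefined in that case.
    denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

# Illustrative data only: the points lie exactly on the rising line y = 2x + 1,
# so r should equal 1 up to floating-point rounding.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
print(pearson_r(xs, ys))
```

Because the function works from the same sums (ΣX, ΣY, ΣX², ΣY², ΣXY) that a hand calculation tabulates, its intermediate values can be checked against a worked table.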
Linear Correlation Coefficient
•It is a statistic
•The value of r does not change if all values of either variable are converted to a different scale.
Value of r | Interpretation |
---|---|
r > 0 | Positive correlation |
r < 0 | Negative correlation |
r = 1 or r = −1 | Perfect correlation |
r = 0 | No linear correlation |
r close to 1 or −1 | Strong correlation |
r close to 0 | Weak correlation |
Linear correlation between paired variables in a population
If we had all population values for x and y, the result of using the formula would be a population parameter representing the correlation coefficient between the paired variables.
The linear correlation coefficient for a population is represented by ρ (the Greek letter rho).
For this inference, we will use the standard table of Pearson correlation coefficient critical values.
We will NOT find the value of ρ itself.
We will use the sample to infer whether there really is a linear correlation between the paired variables in the population.
The Pearson Correlation Coefficient Standard Table
The standard table provides the minimum value that the absolute value of a sample's linear correlation coefficient must exceed to support a linear correlation between the variables in the population, given two properties of the sample and the test:
Sample size
Significance level
The values in standard table can be viewed as critical values for the inference test.
Calculate the linear correlation coefficient (r) for the paired values in a sample.
If |r| exceeds the critical value in the table, conclude that there is a linear correlation.
Otherwise, there is not sufficient evidence to support the conclusion of a linear correlation.
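The decision rule above can be sketched as a small Python function. The function name is a hypothetical choice, and the critical value is passed in by the caller because it has to be read from the standard table for the sample size and significance level in use. The numbers in the usage lines are the ones from the worked example in this guide.

```python
def has_linear_correlation(r, critical_value):
    """Return True when |r| exceeds the tabulated critical value (hypothetical helper)."""
    return abs(r) > critical_value

# Values from the guide's worked example: r is about 0.9955, and the table
# gives a critical value of 0.878 for n = 5 at alpha = 0.05.
r = 0.9955
critical_value = 0.878

if has_linear_correlation(r, critical_value):
    print("Conclude that there is a linear correlation.")
else:
    print("Not sufficient evidence to support a linear correlation.")
```

The comparison uses the absolute value of r, so a strong negative correlation is detected in the same way as a strong positive one.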
The sample of paired (x, y) data is a random sample of independent quantitative data
Visual examination of the scatter-plot must confirm that the points approximate a straight-line pattern
Any outliers must be removed if they are known to be errors; otherwise they must be considered when calculating r
In a production operation to make steel plates, the price of a kilogram of steel was tracked against the price of a kilowatt-hour of electricity used to run the production equipment.
A relationship between these two parameters was suspected to exist by the production engineer in charge.
The following prices were recorded over a 5-week period. Find the correlation coefficient r for the paired steel price and electricity price values given in the table.
Week | Steel price (£ per kg) | Electricity price (£ per kWh) |
---|---|---|
Week 1 | £2.50 | £0.10 |
Week 2 | £2.55 | £0.25 |
Week 3 | £2.64 | £0.40 |
Week 4 | £2.78 | £0.60 |
Week 5 | £2.92 | £0.90 |
•x = steel price, y = price of electricity and n = 5 weeks
Steel (x) | Electricity (y) | x² | y² | xy |
---|---|---|---|---|
2.50 | 0.10 | 6.2500 | 0.0100 | 0.2500 |
2.55 | 0.25 | 6.5025 | 0.0625 | 0.6375 |
2.64 | 0.40 | 6.9696 | 0.1600 | 1.0560 |
2.78 | 0.60 | 7.7284 | 0.3600 | 1.6680 |
2.92 | 0.90 | 8.5264 | 0.8100 | 2.6280 |
Σx = 13.39 | Σy = 2.25 | Σx² = 35.9769 | Σy² = 1.4025 | Σxy = 6.2395 |
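Substituting the column totals into the standard computational formula for r gives the value used in the rest of the example. This is a worked sketch that assumes the formula r = (nΣxy − ΣxΣy) / (√(nΣx² − (Σx)²) · √(nΣy² − (Σy)²)), with the result rounded to four decimal places.

```latex
\[
r = \frac{5(6.2395) - (13.39)(2.25)}
         {\sqrt{5(35.9769) - (13.39)^2}\,\sqrt{5(1.4025) - (2.25)^2}}
  = \frac{31.1975 - 30.1275}{\sqrt{0.5924}\,\sqrt{1.95}}
  = \frac{1.07}{\sqrt{1.15518}}
  \approx 0.9955
\]
```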
On the basis of the above result for r, is it safe to support the conclusion of a linear correlation at the 95% confidence level?
•For a 95% confidence level, α = 0.05, and n = 5
•Refer to the Pearson correlation coefficient r table: for n = 5 (df = n − 2 = 3) and α = 0.05, the critical value is 0.878
Interpreting r:
If the absolute value of the computed r exceeds the critical value in the table, conclude that there is a linear correlation.
Here the computed value is r ≈ 0.9955
Critical value in the table = 0.878
Since 0.9955 > 0.878, we can conclude that there is a linear correlation at the 95% confidence level.
University of Exeter LibGuide is licensed under CC BY 4.0