Lines of Regression: Definition, Formula, Calculation, Examples
by J Nandhini
Updated Mar 14, 2023
What are lines of regression?
A line of regression is a straight line that represents the relationship between two variables in a scatter plot. It is used to predict the value of one variable (the dependent variable) from the value of the other (the independent variable). The line is chosen so that the sum of the squared vertical distances between the line and each point in the scatter plot is minimized; this is the least-squares criterion.
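The minimization described above can be checked numerically. The following is a minimal Python sketch; the three data points and both candidate lines are made up for illustration and do not come from the examples later in this article.

```python
def sum_squared_errors(points, a, b):
    """Sum of squared vertical distances from each (x, y) point to the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in points)

# Hypothetical data, chosen only to illustrate the criterion.
points = [(1, 2), (2, 3), (3, 5)]

# Two candidate lines: an arbitrary guess and the least-squares line for these points.
sse_guess = sum_squared_errors(points, a=1.0, b=1.0)   # y = 1 + x
sse_best = sum_squared_errors(points, a=1 / 3, b=1.5)  # y = 1/3 + 1.5x

print(f"guess: {sse_guess:.4f}, least squares: {sse_best:.4f}")
# guess: 1.0000, least squares: 0.1667
```

The guessed line passes exactly through two of the three points, yet its total squared error is larger, because the criterion penalizes the one large miss more heavily than several small ones.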
Types of lines of regression
There are two types of linear regression that produce a line (or plane) of regression: simple linear regression and multiple linear regression.
Simple linear regression
Simple linear regression is used when there is only one independent variable and one dependent variable. The line of regression is a straight line that best represents the relationship between the two variables.
Multiple linear regression
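Multiple linear regression is used when there are two or more independent variables and one dependent variable. With two independent variables, the fitted surface is a plane rather than a line; with more than two, it is a hyperplane. In each case it is the surface that best represents the relationship between the variables.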
Simple linear regression formula
The formula for the line of regression in simple linear regression is:
y = a + bx
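where y is the dependent variable, x is the independent variable, a is the y-intercept, and b is the slope of the line.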
Multiple linear regression formula
The formula for the line of regression in multiple linear regression is:
y = a + b1x1 + b2x2 + ... + bnxn
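where y is the dependent variable, x1, x2, ..., xn are the independent variables, a is the intercept, and b1, b2, ..., bn are the regression coefficients, one for each independent variable.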
Simple linear regression calculation
To calculate the line of regression in simple linear regression, we need to calculate the values of a and b.
The formulas for a and b are:
b = (n∑xy - ∑x∑y) / (n∑x^2 - (∑x)^2)
a = (1/n) ∑y - b(1/n)∑x
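where n is the number of data points, ∑x and ∑y are the sums of the x and y values, ∑xy is the sum of the products of each x and y pair, and ∑x^2 is the sum of the squared x values.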
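The formulas for a and b translate directly into a few lines of code. This is a minimal sketch; the function name `fit_line` and the sample points are illustrative assumptions, not part of the original article.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x using the summation formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    b = (n * sum_xy - sum_x * sum_y) / denominator
    a = sum_y / n - b * sum_x / n
    return a, b

# Hypothetical points lying exactly on y = 1 + 2x.
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)  # 1.0 2.0
```

The guard on the denominator covers the one case where the formula breaks down: if every x value is the same, the points form a vertical stack and no slope can be estimated.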
Multiple linear regression calculation
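To calculate the regression coefficients in multiple linear regression, we need to use matrix algebra. The data are arranged in a matrix X with one row per observation and a leading column of 1s for the intercept, and the coefficients are the solution of the normal equations (X^T X) b = X^T y. The calculations are more involved than in simple linear regression, but the principle is the same: choose the coefficients that minimize the sum of squared errors.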
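To show what the matrix algebra involves, here is a minimal pure-Python sketch that builds and solves the normal equations (X^T X) b = X^T y with Gaussian elimination. The helper names and the small dataset are hypothetical; in practice a numerical library would normally be used instead of a hand-written solver.

```python
def solve_linear_system(A, rhs):
    """Solve A @ coeffs = rhs by Gaussian elimination with partial pivoting."""
    size = len(A)
    # Work on an augmented copy so the inputs are not modified.
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular; predictors may be linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, size):
            factor = M[r][col] / M[col][col]
            for c in range(col, size + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    coeffs = [0.0] * size
    for r in range(size - 1, -1, -1):
        known = sum(M[r][c] * coeffs[c] for c in range(r + 1, size))
        coeffs[r] = (M[r][size] - known) / M[r][r]
    return coeffs


def fit_multiple(rows, ys):
    """Return [a, b1, ..., bn] by solving the normal equations (X^T X) b = X^T y."""
    X = [[1.0] + list(row) for row in rows]  # the leading 1 gives the intercept a
    k = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    Xty = [sum(x[i] * y for x, y in zip(X, ys)) for i in range(k)]
    return solve_linear_system(XtX, Xty)


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
ys = [1, 3, 4, 6, 8]
a, b1, b2 = fit_multiple(rows, ys)
print(round(a, 6), round(b1, 6), round(b2, 6))  # 1.0 2.0 3.0
```

Because the sample data lie exactly on a plane, the fit recovers the generating coefficients; with noisy data the same code returns the plane with the smallest sum of squared errors.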
Example of simple linear regression
Suppose we want to examine the relationship between a person's height (in inches) and their weight (in pounds). We collect data on 10 individuals and plot the data on a scatter plot. The scatter plot shows a positive linear relationship between height and weight. We want to find the line of regression that best represents this relationship.
Here are the data we collected:
| Height (inches) | Weight (pounds) |
|---|---|
| 63 | 127 |
| 65 | 140 |
| 68 | 157 |
| 71 | 170 |
| 72 | 170 |
| 73 | 175 |
| 74 | 181 |
| 75 | 189 |
| 76 | 193 |
| 77 | 200 |
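Using the formulas for a and b with n = 10, ∑x = 714, ∑y = 1702, ∑xy = 122499, and ∑x^2 = 51178, we get:
b = (10(122499) - 714(1702)) / (10(51178) - 714^2) = 9762 / 1984 ≈ 4.92
a = 170.2 - b(71.4) ≈ -181.11 (using the unrounded value of b)
The line of regression is therefore:
y = -181.11 + 4.92x
where y is the predicted weight in pounds and x is the height in inches.
So, if a person is 70 inches tall, we can predict their weight to be:
y = -181.11 + 4.92(70) ≈ 163.3 pounds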
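As a check, the fit can be recomputed directly from the ten data points in the table. The sketch below uses the deviation form of the slope, which is algebraically equivalent to the summation formula for b given earlier; the variable names are illustrative.

```python
from statistics import fmean

heights = [63, 65, 68, 71, 72, 73, 74, 75, 76, 77]            # inches
weights = [127, 140, 157, 170, 170, 175, 181, 189, 193, 200]  # pounds

mean_x = fmean(heights)
mean_y = fmean(weights)

# Deviation form: b = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(heights, weights))
s_xx = sum((x - mean_x) ** 2 for x in heights)

b = s_xy / s_xx          # slope: pounds per inch
a = mean_y - b * mean_x  # intercept: the line passes through (mean_x, mean_y)
predicted = a + b * 70   # predicted weight at 70 inches

print(f"b = {b:.2f}, a = {a:.2f}, weight at 70 in = {predicted:.1f} lb")
# b = 4.92, a = -181.11, weight at 70 in = 163.3 lb
```

The negative intercept has no physical meaning here; it is simply where the line would cross the axis at a height of 0 inches, far outside the 63 to 77 inch range of the data.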
Example of multiple linear regression
Suppose we want to examine how a person's income, education, and work experience relate to their job satisfaction. We collect data on 20 individuals. In this sample, job satisfaction rises with income, education, and work experience up to a score of 10 and then declines at the highest values. We want to find the plane of regression that best fits these data.
Here are the data we collected:
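| Income (in thousands) | Education (years) | Work experience (years) | Job satisfaction (scale of 1-10) |
|---|---|---|---|
| 30 | 12 | 2 | 5 |
| 45 | 16 | 5 | 7 |
| 50 | 18 | 8 | 8 |
| 55 | 19 | 9 | 8 |
| 65 | 20 | 12 | 9 |
| 70 | 22 | 15 | 9 |
| 80 | 24 | 18 | 10 |
| 85 | 25 | 20 | 10 |
| 90 | 27 | 23 | 10 |
| 95 | 28 | 25 | 10 |
| 100 | 30 | 27 | 10 |
| 110 | 32 | 30 | 10 |
| 115 | 33 | 32 | 9 |
| 120 | 34 | 35 | 9 |
| 125 | 35 | 37 | 8 |
| 130 | 36 | 40 | 8 |
| 140 | 38 | 42 | 7 |
| 145 | 39 | 45 | 7 |
| 150 | 40 | 47 | 6 |
| 160 | 42 | 50 | 5 |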
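Using matrix algebra, the plane of regression is reported as:
y = -8.08 + 0.37x1 + 0.57x2 + 0.07x3
where y is the predicted job satisfaction (scale of 1-10), x1 is income (in thousands), x2 is education (years), and x3 is work experience (years).
So, if a person has an income of $80,000, an education of 18 years, and 10 years of work experience, these coefficients give:
y = -8.08 + 0.37(80) + 0.57(18) + 0.07(10) = 32.48
This value lies far outside the 1-10 satisfaction scale. The plane also does not pass through the sample means (income 98, education 28.5, experience 26.1, satisfaction 8.25), as a least-squares fit with an intercept must, so these coefficients should be recomputed from the table before being used for prediction.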
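The coefficients can be recomputed from the 20 rows of the table. The following is a minimal sketch that assumes NumPy is installed and uses its `numpy.linalg.lstsq` routine; the variable names are illustrative, and no output is quoted here because the result should be read from an actual run.

```python
import numpy as np

# Columns: income (thousands), education (years), experience (years), satisfaction (1-10)
data = np.array([
    [30, 12, 2, 5], [45, 16, 5, 7], [50, 18, 8, 8], [55, 19, 9, 8],
    [65, 20, 12, 9], [70, 22, 15, 9], [80, 24, 18, 10], [85, 25, 20, 10],
    [90, 27, 23, 10], [95, 28, 25, 10], [100, 30, 27, 10], [110, 32, 30, 10],
    [115, 33, 32, 9], [120, 34, 35, 9], [125, 35, 37, 8], [130, 36, 40, 8],
    [140, 38, 42, 7], [145, 39, 45, 7], [150, 40, 47, 6], [160, 42, 50, 5],
], dtype=float)

# Design matrix: a leading column of 1s gives the intercept a.
X = np.column_stack([np.ones(len(data)), data[:, :3]])
y = data[:, 3]

# Least-squares solution of X @ coeffs ≈ y
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b1, b2, b3 = coeffs

# Prediction for income 80 (thousand), 18 years of education, 10 years of experience.
new_person = np.array([1.0, 80, 18, 10])
prediction = float(new_person @ coeffs)

print("a, b1, b2, b3 =", np.round(coeffs, 3))
print("predicted satisfaction:", round(prediction, 2))
```

Two cautions apply when reading the result. Income, education, and experience rise together in this sample, so the individual coefficients can be unstable even when the overall fit is reasonable. And because satisfaction rises and then falls across the table, a plane can only approximate the pattern.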
Conclusion
Lines of regression are an essential statistical tool that can help us understand and analyze the relationships between variables in a dataset. The formula for a line of regression can be calculated using simple algebraic equations, and the process can be extended to multiple variables to calculate a plane of regression.
Understanding lines of regression can help us make more accurate predictions and draw meaningful conclusions from data. By analyzing the slope and intercept of a line or plane of regression, we can make predictions about the value of the dependent variable based on the value of the independent variable(s).
Overall, lines of regression provide a powerful tool for understanding and analyzing data in many different contexts and can help us gain insights into complex relationships between variables.
Lines of Regression: Definition, Formula, Calculation, Examples - FAQs
What is a line of regression?
A line of regression is a line that represents the linear relationship between two variables in a dataset.
What is the formula for a line of regression?
The formula for a line of regression is y = a + bx, where y is the dependent variable, x is the independent variable, a is the y-intercept, and b is the slope of the line.
What is multiple regression?
Multiple regression is a statistical technique that examines the relationship between a dependent variable and multiple independent variables.
How is a plane of regression calculated?
A plane of regression is calculated using matrix algebra, where the dependent variable is modeled as a linear function of multiple independent variables.
Why are lines of regression useful?
Lines of regression provide a powerful tool for understanding and analyzing data in many different contexts and can help us gain insights into complex relationships between variables. By analyzing the slope and intercept of a line or plane of regression, we can make predictions about the value of the dependent variable based on the value of the independent variable(s).