Lines of Regression: Definition, Formula, Calculation, Examples

Want to know what lines of regression are? Look no further! Our guide explains the definition, formula, and calculation with real-world examples. Master regression analysis today!

by J Nandhini

Updated Mar 14, 2023

What are lines of regression?

A line of regression is a straight line that represents the relationship between two variables in a scatter plot. It is used to predict the value of one variable from the value of the other. The line is chosen so that the sum of the squared vertical distances (the residuals) between the line and each point in the scatter plot is minimized; this is known as the least-squares criterion.
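
To make the "sum of squares" idea concrete, here is a minimal Python sketch that scores two candidate lines of the form y = a + bx against the same points. The data points and the function name sum_squared_residuals are made up for illustration and do not come from the article.

```python
def sum_squared_residuals(a, b, points):
    """Sum of squared vertical distances between the line y = a + b*x and each point."""
    return sum((y - (a + b * x)) ** 2 for x, y in points)

# Hypothetical data, roughly following y = 2x
points = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]

close_fit = sum_squared_residuals(0.0, 2.0, points)  # candidate line y = 2x
poor_fit = sum_squared_residuals(1.0, 1.5, points)   # candidate line y = 1 + 1.5x

# The line of regression is the candidate with the smallest possible score.
print(close_fit, poor_fit)
```

Of these two candidates, y = 2x has the smaller sum and is therefore the better fit; the line of regression is the line that beats every other candidate on this measure.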

Types of lines of regression

There are two types of lines of regression: simple linear regression and multiple linear regression.

Simple linear regression

Simple linear regression is used when there is only one independent variable and one dependent variable. The line of regression is a straight line that best represents the relationship between the two variables.

Multiple linear regression

Multiple linear regression is used when there are two or more independent variables and one dependent variable. Instead of a line, the fitted relationship is a plane (with two independent variables) or a hyperplane (with more than two) that best represents the relationship between the variables.

Simple linear regression formula

The formula for the line of regression in simple linear regression is:

y = a + bx

where:

  • y is the dependent variable
  • x is the independent variable
  • a is the y-intercept
  • b is the slope of the line

Multiple linear regression formula

The formula for the line of regression in multiple linear regression is:

y = a + b1x1 + b2x2 + ... + bnxn

where:

  • y is the dependent variable
  • x1, x2, ..., xn are the independent variables
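  • a is the intercept (the predicted value of y when every x is 0)
  • b1, b2, ..., bn are the regression coefficients (slopes) for each independent variable: bi is the change in y for a one-unit increase in xi with the other variables held fixed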
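
As a small illustration of how this formula is evaluated, the sketch below computes y for one observation. The function name predict and the numbers are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def predict(a, coefficients, values):
    """Evaluate y = a + b1*x1 + b2*x2 + ... + bn*xn for one observation."""
    if len(coefficients) != len(values):
        raise ValueError("need exactly one coefficient per independent variable")
    return a + sum(b * x for b, x in zip(coefficients, values))

# Hypothetical model y = 1 + 2*x1 + 3*x2, evaluated at x1 = 4, x2 = 5
y_hat = predict(1.0, [2.0, 3.0], [4.0, 5.0])
print(y_hat)  # 1 + 8 + 15 = 24.0
```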

Simple linear regression calculation

To calculate the line of regression in simple linear regression, we need to calculate the values of a and b.

The formulas for a and b are:

b = (n∑xy - ∑x∑y) / (n∑x^2 - (∑x)^2)

a = (1/n) ∑y - b(1/n)∑x

where:

  • n is the number of observations
  • ∑xy is the sum of the products of each pair of x and y values
  • ∑x is the sum of x
  • ∑y is the sum of y
  • ∑x^2 is the sum of the squares of x
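
The formulas for a and b translate almost directly into code. The following is a minimal sketch in plain Python; the function name fit_line and the small test data set are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x using the sum formulas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are identical, so the slope is undefined")
    b = (n * sum_xy - sum_x * sum_y) / denominator
    a = sum_y / n - b * sum_x / n
    return a, b

# Hypothetical data lying exactly on y = 1 + 2x
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)  # 1.0 2.0
```

The check on the denominator matters in practice: if every x value is the same, the formula for b divides by zero and no unique line exists.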

Multiple linear regression calculation

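To calculate the regression equation in multiple linear regression, we need to use matrix algebra. The calculations are more complex than in simple linear regression, but the principle is the same: choose the intercept and coefficients that minimize the sum of squared differences between the observed and predicted values of y.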
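
For readers who want to see the matrix form, the standard least-squares solution can be written as below. This is the textbook result rather than something stated in the article, and the symbols X, m (the number of observations), and the hat notation are introduced here for illustration.

```latex
% m observations, n independent variables; X has a leading column of ones
% so that the first entry of the solution is the intercept a.
X =
\begin{pmatrix}
1 & x_{11} & \cdots & x_{1n} \\
\vdots & \vdots & & \vdots \\
1 & x_{m1} & \cdots & x_{mn}
\end{pmatrix},
\qquad
\begin{pmatrix} \hat{a} \\ \hat{b}_1 \\ \vdots \\ \hat{b}_n \end{pmatrix}
= (X^{\top} X)^{-1} X^{\top} y
```

Here y is the column of observed values of the dependent variable. The formula applies when X^T X is invertible, which requires that no independent variable is an exact linear combination of the others. With a single independent variable it reduces to the formulas for a and b used in simple linear regression.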

Example of simple linear regression

Suppose we want to examine the relationship between a person's height (in inches) and their weight (in pounds). We collect data on 10 individuals and plot the data on a scatter plot. The scatter plot shows a positive linear relationship between height and weight. We want to find the line of regression that best represents this relationship.

Here are the data we collected:

Height (inches)   Weight (pounds)
63                127
65                140
68                157
71                170
72                170
73                175
74                181
75                189
76                193
77                200

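Using the formulas for a and b with n = 10, ∑x = 714, ∑y = 1702, ∑xy = 122499, and ∑x^2 = 51178, we get:

b = (10(122499) - (714)(1702)) / (10(51178) - (714)^2) = 9762 / 1984 ≈ 4.92

a = (1/10)(1702) - b(1/10)(714) = 170.2 - 71.4b ≈ -181.11

so the line of regression is:

y = -181.11 + 4.92x

where:

  • y is the predicted weight in pounds
  • x is the height in inches
  • -181.11 is the y-intercept (an extrapolation to a height of 0, so it has no physical interpretation)
  • 4.92 is the slope of the line: each additional inch of height adds about 4.92 pounds to the predicted weight

So, if a person is 70 inches tall, we can predict their weight to be:

y = -181.11 + 4.92(70) = 163.29, or about 163 pounds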
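
The fit for this example can be checked by recomputing it from the ten data points. The sketch below uses the deviation form of the slope, b = ∑(x - mean x)(y - mean y) / ∑(x - mean x)^2, which is algebraically equivalent to the sum formula for b given earlier; the variable names are chosen here for illustration.

```python
# Height and weight data from the table in this example
heights = [63, 65, 68, 71, 72, 73, 74, 75, 76, 77]
weights = [127, 140, 157, 170, 170, 175, 181, 189, 193, 200]

n = len(heights)
mean_x = sum(heights) / n
mean_y = sum(weights) / n

# Deviation form of the least-squares slope and intercept
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(heights, weights))
s_xx = sum((x - mean_x) ** 2 for x in heights)
b = s_xy / s_xx
a = mean_y - b * mean_x

predicted = a + b * 70  # predicted weight for a height of 70 inches

print(f"b = {b:.2f}, a = {a:.2f}")                   # b = 4.92, a = -181.11
print(f"predicted weight at 70 in: {predicted:.1f}")  # 163.3
```

Recomputing from the raw data in this way is a quick guard against arithmetic slips when the sums are worked out by hand.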

Example of multiple linear regression

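Suppose we want to examine how a person's income, education, and work experience relate to their job satisfaction. We collect data on 20 individuals. With three independent variables the data cannot be shown in a single two-dimensional scatter plot, but plotting job satisfaction against each variable in turn shows that satisfaction rises with income, education, and work experience over the lower part of the range and falls again at the upper end. We want to find the regression equation that best represents this relationship. With three independent variables it describes a hyperplane rather than a line, and because the pattern is not purely linear it can only be a rough summary of the data.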

Here are the data we collected:

Income (in thousands)   Education (years)   Work experience (years)   Job satisfaction (scale of 1-10)
30                      12                  2                         5
45                      16                  5                         7
50                      18                  8                         8
55                      19                  9                         8
65                      20                  12                        9
70                      22                  15                        9
80                      24                  18                        10
85                      25                  20                        10
90                      27                  23                        10
95                      28                  25                        10
100                     30                  27                        10
110                     32                  30                        10
115                     33                  32                        9
120                     34                  35                        9
125                     35                  37                        8
130                     36                  40                        8
140                     38                  42                        7
145                     39                  45                        7
150                     40                  47                        6
160                     42                  50                        5

Using matrix algebra, we can calculate the plane of regression as:

y = -8.08 + 0.37x1 + 0.57x2 + 0.07x3

where:

  • y is the predicted job satisfaction on a scale of 1-10
  • x1 is the income in thousands of dollars
  • x2 is the education in years
  • x3 is the work experience in years
  • -8.08 is the intercept
  • 0.37, 0.57, and 0.07 are the slopes of the plane for income, education, and work experience, respectively.

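So, if a person has an income of $80,000 (x1 = 80, since income is measured in thousands), 18 years of education, and 10 years of work experience, substituting into the equation gives:

y = -8.08 + 0.37(80) + 0.57(18) + 0.07(10) = -8.08 + 29.6 + 10.26 + 0.7 = 32.48

This value lies far outside the 1-10 satisfaction scale, which shows that the coefficients quoted above do not fit the data in the table. They should be re-estimated from the data before the equation is used for prediction.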
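
The coefficients for this example can be re-estimated from the table with a short script. The following is a minimal sketch that builds and solves the least-squares normal equations using only the Python standard library; the helper names solve and sse are hypothetical, and in practice a numerical library would normally be used instead. It prints the fitted coefficients and compares their sum of squared residuals with that of the coefficients quoted in this example.

```python
# Rows from the table: (income in thousands, education in years,
# work experience in years, job satisfaction on a 1-10 scale)
rows = [
    (30, 12, 2, 5), (45, 16, 5, 7), (50, 18, 8, 8), (55, 19, 9, 8),
    (65, 20, 12, 9), (70, 22, 15, 9), (80, 24, 18, 10), (85, 25, 20, 10),
    (90, 27, 23, 10), (95, 28, 25, 10), (100, 30, 27, 10), (110, 32, 30, 10),
    (115, 33, 32, 9), (120, 34, 35, 9), (125, 35, 37, 8), (130, 36, 40, 8),
    (140, 38, 42, 7), (145, 39, 45, 7), (150, 40, 47, 6), (160, 42, 50, 5),
]

def solve(A, b):
    """Solve the linear system A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - known) / M[i][i]
    return z

def sse(coef, X, y):
    """Sum of squared residuals for coefficients [a, b1, b2, b3]."""
    return sum((yi - sum(c * v for c, v in zip(coef, xi))) ** 2
               for xi, yi in zip(X, y))

# Design matrix with a leading 1 in each row for the intercept
X = [[1.0, inc, edu, exper] for inc, edu, exper, _ in rows]
y = [float(sat) for _, _, _, sat in rows]
k = len(X[0])

# Normal equations: (X^T X) coef = X^T y
XtX = [[sum(xi[p] * xi[q] for xi in X) for q in range(k)] for p in range(k)]
Xty = [sum(xi[p] * yi for xi, yi in zip(X, y)) for p in range(k)]
coef = solve(XtX, Xty)  # [a, b1, b2, b3]

residuals = [yi - sum(c * v for c, v in zip(coef, xi)) for xi, yi in zip(X, y)]
quoted = [-8.08, 0.37, 0.57, 0.07]  # coefficients quoted in this example

print("fitted a, b1, b2, b3:", [round(c, 4) for c in coef])
print("SSE of fitted coefficients:", round(sse(coef, X, y), 3))
print("SSE of quoted coefficients:", round(sse(quoted, X, y), 3))
```

Because income, education, and work experience rise together almost in lockstep in this table, the individual coefficients are sensitive to small changes in the data, so the fitted values are best read as a description of this sample rather than as stable effects.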

Conclusion

Lines of regression are an essential statistical tool that can help us understand and analyze the relationships between variables in a dataset. The formula for a line of regression can be calculated using simple algebraic equations, and the process can be extended to multiple variables to calculate a plane of regression.

Understanding lines of regression can help us make more accurate predictions and draw meaningful conclusions from data. By analyzing the slope and intercept of a line or plane of regression, we can make predictions about the value of the dependent variable based on the value of the independent variable(s).

Overall, lines of regression provide a powerful tool for understanding and analyzing data in many different contexts and can help us gain insights into complex relationships between variables.



Disclaimer: The above information is for general informational purposes only. All information on the Site is provided in good faith, however we make no representation or warranty of any kind, express or implied, regarding the accuracy, adequacy, validity, reliability, availability or completeness of any information on the Site.

Lines of Regression: Definition, Formula, Calculation, Examples - FAQs

1. What is a line of regression?

A line of regression is a line that represents the linear relationship between two variables in a dataset.

2. What is the formula for a line of regression?

The formula for a line of regression is y = a + bx, where y is the dependent variable, x is the independent variable, a is the y-intercept, and b is the slope of the line.

3. What is multiple regression?

Multiple regression is a statistical technique that examines the relationship between a dependent variable and multiple independent variables.

4. How is a plane of regression calculated?

A plane of regression is calculated using matrix algebra, where the dependent variable is a linear function of multiple independent variables.

 

5. Why are lines of regression important?

Lines of regression provide a powerful tool for understanding and analyzing data in many different contexts and can help us gain insights into complex relationships between variables. By analyzing the slope and intercept of a line or plane of regression, we can make predictions about the value of the dependent variable based on the value of the independent variable(s).
