Multiple Linear Regression using R
Linear Regression
Linear regression models describe or predict the relationship between a dependent variable and one or more independent variables. With a single independent variable the model is called simple linear regression; when two or more independent variables are used, it is called multiple regression.
Simple linear regression is a technique for predicting the value of one variable from the value of another. With linear regression, the relationship between the two variables is represented as a straight line.
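As a minimal sketch of the straight-line fit described above, base R's lm() can fit a model with one predictor. The vectors x and y are hypothetical values chosen so that the true line is known in advance.

```r
# Hypothetical data: y is exactly 2 + 3 * x, so the fitted line is known
x <- c(1, 2, 3, 4, 5)
y <- 2 + 3 * x

# Fit a simple linear regression of y on x
fit <- lm(y ~ x)

# Intercept and slope of the fitted straight line
coef(fit)
```

The first coefficient is the intercept of the line and the second is its slope.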
In multiple linear regression, a dependent variable has a linear relationship with two or more independent variables. If the relationship is non-linear, the dependent and independent variables will not follow a straight line.
Both linear and non-linear regression can be used when two or more variables are used to track a response. Non-linear regression typically relies on iterative, trial-and-error estimation and is comparatively difficult to implement.
Multiple Linear Regression
Multiple linear regression is a statistical analysis technique that uses two or more factors to predict a variable’s outcome. It is also known as multiple regression and is an extension of linear regression. The dependent variable is the one that has to be predicted, and the factors that are used to forecast the value of the dependent variable are called independent or explanatory variables.
Analysts can use multiple linear regression to determine how much of the variance in the dependent variable the model explains, and the relative contribution of each independent variable. There are two types of multiple regression: linear and non-linear.
Multiple Regression Equation
The equation for multiple regression with three predictor variables (X1, X2 and X3) predicting the variable Y is as follows:
Y = B0 + B1 * X1 + B2 * X2 + B3 * X3
The “B” values are the beta coefficients, also called the regression weights. They measure the association between each predictor and the outcome variable; they are not correlations.
Y is the predicted, or dependent, variable.
B0 is the Y-intercept, the value of Y when every predictor is zero.
B1, B2 and B3 are the regression coefficients. Each represents the change in Y for a one-unit change in X1, X2 and X3, respectively, with the other predictors held constant.
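To make the equation concrete, here is a small sketch that evaluates it in R. The coefficient and predictor values are hypothetical and chosen only so that the arithmetic is easy to follow.

```r
# Hypothetical coefficients B0..B3 and predictor values X1..X3 (illustration only)
b <- c(B0 = 10, B1 = 2, B2 = -1, B3 = 0.5)
x <- c(X1 = 3, X2 = 4, X3 = 6)

# Y = B0 + B1 * X1 + B2 * X2 + B3 * X3
y_hat <- b[["B0"]] + sum(b[c("B1", "B2", "B3")] * x)
y_hat

# Raise X1 by one unit while holding X2 and X3 fixed
x_plus <- x
x_plus[["X1"]] <- x[["X1"]] + 1
y_hat_plus <- b[["B0"]] + sum(b[c("B1", "B2", "B3")] * x_plus)

# The change in Y equals B1
y_hat_plus - y_hat
```

The last line illustrates the interpretation of a coefficient: a one-unit change in one predictor, with the others unchanged, shifts the prediction by that predictor's coefficient.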
Multiple Linear Regression Assumptions
I. The Independent Variables Are Not Highly Correlated
Multicollinearity, which occurs when the independent variables are highly correlated with each other, should not be present in the data. It makes it difficult to identify which specific variable contributes to the variance in the dependent variable.
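One possible way to look for multicollinearity is to compute the correlation between the predictors and a variance inflation factor (VIF). The sketch below uses base R only, and the predictor values are hypothetical, not the data used in this article.

```r
# Hypothetical predictor values (illustration only)
interest_rate     <- c(2.75, 2.50, 2.50, 2.25, 2.25, 2.00, 2.00, 1.75, 1.75, 1.75)
unemployment_rate <- c(5.3, 5.3, 5.4, 5.6, 5.5, 5.7, 5.9, 6.0, 5.9, 6.1)

# Pairwise correlation between the two predictors
r <- cor(interest_rate, unemployment_rate)
r

# VIF: regress one predictor on the other(s) and compute 1 / (1 - R^2).
# With only two predictors this equals 1 / (1 - r^2).
r2  <- summary(lm(interest_rate ~ unemployment_rate))$r.squared
vif <- 1 / (1 - r2)
vif
```

A correlation close to 1 or -1, or a large VIF (a common rule of thumb flags values above 5 or 10), suggests the predictors carry overlapping information.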
II. Relationship Between Dependent and Independent Variables
Each independent variable should have a linear relationship with the dependent variable. To check this, a scatterplot is drawn and inspected for linearity. If the relationship in the scatterplot is non-linear, either the data is transformed using statistical software or a non-linear regression is performed.
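As a rough illustration of transforming data to recover linearity, the sketch below compares a straight-line fit before and after a log transformation. The data is hypothetical and built to follow an exponential curve, and the log transformation is just one example of a transformation.

```r
# Hypothetical curved relationship: y grows roughly exponentially with x
x     <- 1:10
noise <- rep(c(1.05, 0.95), times = 5)  # small deterministic wobble
y     <- exp(0.5 * x) * noise

# R-squared of a straight-line fit on the raw scale
r2_raw <- summary(lm(y ~ x))$r.squared

# R-squared of a straight-line fit after log-transforming y
r2_log <- summary(lm(log(y) ~ x))$r.squared

c(raw = r2_raw, log = r2_log)
```

For this data the straight line fits the transformed values much more closely than the raw values.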
III. Observation Independence
The observations should be independent of each other, and the residual values should be independent as well. The Durbin-Watson statistic is commonly used to check this.
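The Durbin-Watson statistic can be computed directly from a model's residuals. The sketch below does this in base R on hypothetical data with assumed column names; packages such as lmtest also provide a formal test (dwtest()).

```r
# Hypothetical values for illustration only (not the article's data set)
stock_data <- data.frame(
  interest_rate     = c(2.75, 2.50, 2.50, 2.25, 2.25, 2.00, 2.00, 1.75, 1.75, 1.75),
  unemployment_rate = c(5.3, 5.3, 5.4, 5.6, 5.5, 5.7, 5.9, 6.0, 5.9, 6.1),
  stock_index_price = c(1464, 1394, 1357, 1293, 1256, 1195, 1130, 1047, 965, 943)
)

fit <- lm(stock_index_price ~ interest_rate + unemployment_rate, data = stock_data)
e   <- residuals(fit)

# Durbin-Watson statistic: squared successive differences of the residuals
# divided by the sum of squared residuals
dw <- sum(diff(e)^2) / sum(e^2)
dw
```

The statistic lies between 0 and 4. Values near 2 suggest little first-order autocorrelation in the residuals, while values toward 0 or 4 suggest positive or negative autocorrelation.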
Multiple Linear Regression in R
Analyzing the linear relationship between the Stock Index Price and the Interest Rate and Unemployment Rate in the economy.
Multiple linear regression can be done in a variety of ways, although statistical software is the most popular. R, a free, powerful, and easily accessible piece of software, is one of the most widely used. We’ll start by learning how to perform regression with R, then look at an example to make sure we understand everything.
Steps to Apply Multiple Linear Regression in R
Step 1: Data Collection
The data needed for the forecast is gathered. The purpose is to use two independent variables
· Unemployment Rate
· Interest Rate
to forecast the stock index price (the dependent variable) of a fictional economy.
Step 2: Capturing the Data in R
The data is captured in R by importing the Excel file from the folder where it was saved.
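A minimal sketch of the import step is shown below. The file, its column names and its values are hypothetical; the example writes a tiny CSV file to a temporary location so that it runs on its own, then reads it back with base R's read.csv().

```r
# Hypothetical file contents and column names (illustration only)
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "interest_rate,unemployment_rate,stock_index_price",
  "2.75,5.3,1464",
  "2.50,5.3,1394",
  "2.25,5.6,1293"
), csv_path)

# Import the file into a data frame and inspect its structure
stock_data <- read.csv(csv_path)
str(stock_data)

# For an Excel workbook, the readxl package offers read_excel(), for example:
# stock_data <- readxl::read_excel("path/to/your_file.xlsx")
```

Checking the structure with str() right after the import confirms that the numeric columns were read as numbers rather than text.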
Step 3: Checking Data Linearity with R
It is critical to ensure that the dependent variable and the independent variable(s) have a linear relationship, because this is one of the assumptions that must be met before using a linear regression model. Scatter plots drawn in R are a quick way to check for linearity.
Here we’ll check whether a linear relationship exists between:
· Stock Index Price (Dependent Variable) and Interest Rate (Independent Variable)
· Stock Index Price (Dependent Variable) and Unemployment Rate (Independent Variable)
The relationship between the dependent variable, Stock Index Price, and the Interest Rate can be plotted in R.
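A scatter plot of this kind could be produced as in the sketch below. The data frame and its column names are assumptions, and the values are hypothetical rather than the article's data.

```r
# Hypothetical values for illustration only (not the article's data set)
stock_data <- data.frame(
  interest_rate     = c(2.75, 2.50, 2.50, 2.25, 2.25, 2.00, 2.00, 1.75, 1.75, 1.75),
  stock_index_price = c(1464, 1394, 1357, 1293, 1256, 1195, 1130, 1047, 965, 943)
)

# Scatter plot of the dependent variable against the interest rate
plot(x = stock_data$interest_rate, y = stock_data$stock_index_price,
     main = "Stock Index Price vs Interest Rate",
     xlab = "Interest Rate", ylab = "Stock Index Price")

# Correlation as a numeric companion to the visual check
r_interest <- cor(stock_data$interest_rate, stock_data$stock_index_price)
r_interest
```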
From the graph we can see that a linear relationship does exist between the dependent variable, Stock Index Price, and the independent variable, Interest Rate.
Specifically, when interest rates rise, the price of the stock index rises as well.
In the second scenario, we plot the relationship between the Stock Index Price and the Unemployment Rate in the same way.
As you can see, the Stock Index Price and the Unemployment Rate also have a linear relationship: when unemployment rates rise, the stock index price drops. We still have a linear correlation, although with a negative slope.
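The negative slope can be made visible by adding a fitted line to the scatter plot. This is a sketch on hypothetical values with assumed column names, not the article's data.

```r
# Hypothetical values for illustration only (not the article's data set)
stock_data <- data.frame(
  unemployment_rate = c(5.3, 5.3, 5.4, 5.6, 5.5, 5.7, 5.9, 6.0, 5.9, 6.1),
  stock_index_price = c(1464, 1394, 1357, 1293, 1256, 1195, 1130, 1047, 965, 943)
)

# Scatter plot of the dependent variable against the unemployment rate
plot(x = stock_data$unemployment_rate, y = stock_data$stock_index_price,
     main = "Stock Index Price vs Unemployment Rate",
     xlab = "Unemployment Rate", ylab = "Stock Index Price")

# Overlay the least-squares line from a one-predictor fit
line_fit <- lm(stock_index_price ~ unemployment_rate, data = stock_data)
abline(line_fit, lty = 2)

# Slope of the fitted line
slope <- coef(line_fit)[["unemployment_rate"]]
slope
```

A slope below zero corresponds to the downward-sloping pattern described above.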
Step 4: Perform Multiple Linear Regression In R
To generate a set of coefficients, use code to conduct multiple linear regression in R. The template for performing multiple linear regression in R is as follows (note that R is case-sensitive, so the argument is data and the function is summary):
M1 <- lm(Dependent_Variable ~ First_Independent_Variable + Second_Independent_Variable, data = X)
summary(M1)
Using the template, the code is as follows:
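A minimal sketch of such a call is given here, assuming a data frame named stock_data with the column names shown. The values are hypothetical, so the numbers it prints will differ from the summary discussed in this article.

```r
# Hypothetical values for illustration only (not the article's data set)
stock_data <- data.frame(
  interest_rate     = c(2.75, 2.50, 2.50, 2.25, 2.25, 2.00, 2.00, 1.75, 1.75, 1.75),
  unemployment_rate = c(5.3, 5.3, 5.4, 5.6, 5.5, 5.7, 5.9, 6.0, 5.9, 6.1),
  stock_index_price = c(1464, 1394, 1357, 1293, 1256, 1195, 1130, 1047, 965, 943)
)

# Dependent variable on the left of ~, independent variables on the right
M1 <- lm(stock_index_price ~ interest_rate + unemployment_rate, data = stock_data)

# Residuals, coefficients, R-squared values and p-values in one report
summary(M1)
```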
If you execute the code in R, you will get a summary that includes the following:
The model’s residuals (‘Residuals’): if the residuals are roughly centered around zero and have a similar spread on both sides (here the median is -6.248, and the min and max are -158.2 and 118.8), this is consistent with the homoscedasticity assumption.
The model’s regression coefficients (‘Coefficients’): the estimated intercept and the weight of each independent variable.
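The residual check described above can also be done directly on a fitted model. The sketch below uses hypothetical data with assumed column names; a plot of residuals against fitted values is a common visual companion to the summary figures.

```r
# Hypothetical values for illustration only (not the article's data set)
stock_data <- data.frame(
  interest_rate     = c(2.75, 2.50, 2.50, 2.25, 2.25, 2.00, 2.00, 1.75, 1.75, 1.75),
  unemployment_rate = c(5.3, 5.3, 5.4, 5.6, 5.5, 5.7, 5.9, 6.0, 5.9, 6.1),
  stock_index_price = c(1464, 1394, 1357, 1293, 1256, 1195, 1130, 1047, 965, 943)
)
M1 <- lm(stock_index_price ~ interest_rate + unemployment_rate, data = stock_data)

# Minimum, quartiles, median and maximum of the residuals
res <- residuals(M1)
summary(res)

# Residuals against fitted values: look for an even band around zero
plot(fitted(M1), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```

A funnel shape or a curve in this plot would suggest that the spread of the residuals is not constant or that the relationship is not linear.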
To construct the multiple linear regression equation, use the coefficients from the summary as follows:
Stock Index Price = (Intercept) + (Interest Rate coefficient) * X1 + (Unemployment Rate coefficient) * X2
Once you’ve entered the numbers from the summary:
Stock Index Price = (1798.4) + (345.5) * X1 + (-250.1) * X2
Adjusted R-squared: measures the model’s fit, adjusted for the number of predictors, with a higher number indicating a better fit.
Pr(>|t|): the p-value of each coefficient. A p-value of less than 0.05 is conventionally treated as statistically significant.
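Both quantities can be pulled out of the summary object instead of being read off the printed report. The sketch below assumes a model fitted on hypothetical data with assumed column names.

```r
# Hypothetical values for illustration only (not the article's data set)
stock_data <- data.frame(
  interest_rate     = c(2.75, 2.50, 2.50, 2.25, 2.25, 2.00, 2.00, 1.75, 1.75, 1.75),
  unemployment_rate = c(5.3, 5.3, 5.4, 5.6, 5.5, 5.7, 5.9, 6.0, 5.9, 6.1),
  stock_index_price = c(1464, 1394, 1357, 1293, 1256, 1195, 1130, 1047, 965, 943)
)
M1 <- lm(stock_index_price ~ interest_rate + unemployment_rate, data = stock_data)
s  <- summary(M1)

# Adjusted R-squared of the model
adj_r2 <- s$adj.r.squared
adj_r2

# p-value of each coefficient, taken from the coefficient table
p_values <- coef(s)[, "Pr(>|t|)"]
p_values

# Which coefficients fall below the conventional 0.05 threshold
p_values < 0.05
```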
Step 5: Make Predictions
To predict the Stock Index Price, take the following values from the collected data:
X1 = Interest Rate = 1.5
X2 = Unemployment Rate = 5.8
When these values are substituted into the regression equation, we obtain:
Stock Index Price = (1798.4) + (345.5) * (1.5) + (-250.1) * (5.8)
Stock Index Price ≈ 866.07
The final predicted value for the Stock Index Price using multiple linear regression is 866.07.
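The arithmetic can be checked in R with the rounded coefficients quoted in this article. The variable names below are illustrative only.

```r
# Rounded coefficients as quoted in the article's regression equation
intercept      <- 1798.4
b_interest     <- 345.5
b_unemployment <- -250.1

# Input values for the prediction
interest_rate     <- 1.5
unemployment_rate <- 5.8

# Stock Index Price = intercept + b1 * X1 + b2 * X2
predicted <- intercept + b_interest * interest_rate + b_unemployment * unemployment_rate
round(predicted, 2)

# With a fitted model object and matching column names, predict() does the same:
# predict(M1, newdata = data.frame(interest_rate = 1.5, unemployment_rate = 5.8))
```

Using predict() on the fitted model avoids the small rounding differences that come from copying coefficients by hand.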
Conclusion
The stock market and the economy frequently converge with and diverge from one another. Gross domestic product, unemployment, inflation, and a slew of other indicators all reflect the state of the economy. In the long run, the economy and the markets are expected to move in lockstep.
When the unemployment rate is high, monetary policy typically lowers the interest rate, which tends to cause stock market prices to rise.
An increase in unemployment often signals a drop in interest rates, which is good for stocks, as well as a drop in future corporate earnings and dividends, which is bad for stocks. It is therefore notable that the interest rate and the unemployment rate both affect stock market prices in the economy, and that each has a linear relationship with the stock index price.