Before we move into talking about regression, let’s wrap our heads around what a machine learning algorithm is.
A machine learning algorithm is an algorithm that looks for patterns in data and uses them to make predictions, accepting that those predictions will carry some error.
Regression is a procedure that lets us predict a continuous target variable with the help of one or more explanatory variables.
One real-life example: students' marks on a test are the target variable, and the number of hours spent preparing for the test is the explanatory variable.
1. Linear Regression: a machine learning algorithm that falls under supervised learning. It predicts the dependent variable (y) from a given independent variable (x). So, regression finds a linear relationship between x (input) and y (output):
Y = β1 + β2X
where Y: output or target (dependent) variable
X: input/independent variable
β1: intercept
β2: coefficient (slope) of X
2. Multiple Linear Regression: as the name suggests, it explains the relationship between the target variable and two or more explanatory variables. It is used for predictive analysis when more than one explanatory variable is involved.
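To make the multiple-variable case concrete, here is a minimal sketch, assuming NumPy is available. The data is hypothetical: marks are built to equal 40 + 5*hours + 2*sleep exactly, so the fit should recover those three numbers. The variable names (`hours`, `sleep`, `marks`) are illustrative only.

```python
import numpy as np

# Hypothetical data: marks explained by hours studied and hours slept.
hours = np.array([1, 2, 3, 4, 5], dtype=float)
sleep = np.array([6, 8, 5, 7, 6], dtype=float)
marks = np.array([57, 66, 65, 74, 77], dtype=float)  # 40 + 5*hours + 2*sleep

# Design matrix: a column of ones for the intercept,
# then one column per explanatory variable.
A = np.column_stack([np.ones_like(hours), hours, sleep])

# Least-squares solution; only the coefficient vector is needed here.
coef, *_ = np.linalg.lstsq(A, marks, rcond=None)
intercept, b_hours, b_sleep = coef

print(f"intercept={intercept:.2f}, hours={b_hours:.2f}, sleep={b_sleep:.2f}")
```

Each coefficient reads as the change in marks for a one-unit increase in that variable while the other is held fixed.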
Understanding slope and intercept in regression:
Slope: The slope tells you how much your target variable changes when the independent variable increases by one unit.
In the line y=mx+b the slope is m, and for the least-squares line it is calculated as m = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)², where x̄ and ȳ are the means of x and y.
Intercept: The y-intercept is where the regression line y=mx+b crosses the y axis (where x=0), and is denoted by b.
The formula to calculate the intercept is b = ȳ - m·x̄
When the slope and intercept are placed into the formula y=mx+b, you get the equation of the best-fit line.
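As a worked example, the slope and intercept of the least-squares line can be computed by hand from the means of x and y: m = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and b = ȳ - m·x̄. The sketch below uses plain Python; the helper name `fit_line` and the hours/marks numbers are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sxy / sxx
    b = y_mean - m * x_mean
    return m, b

# Hypothetical data: hours of preparation and marks obtained.
hours = [1, 2, 3, 4, 5]
marks = [50, 55, 65, 70, 80]

m, b = fit_line(hours, marks)
print(f"slope m = {m}, intercept b = {b}")  # slope m = 7.5, intercept b = 41.5
```

For this data the best-fit line is y = 7.5x + 41.5, meaning each extra hour of preparation is associated with 7.5 more marks.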
Prerequisites:
• Correlation (r): explains the association among variables within the data
• Variance: the degree of the spread of the data
• Standard deviation: the square root of the variance
• Normal distribution: a continuous probability distribution shaped like a bell curve, in which the right side of the mean mirrors the left side
• Residual (error term): actual value (found in the dataset) minus predicted value (produced by the linear regression)
Assumptions of linear regression:
• The dependent/target variable is continuous
• There is no strong relationship among the explanatory/independent variables (no multicollinearity)
• There should be a linear relationship between target/dependent and explanatory variables
• Residuals should follow a normal distribution
• Residuals should have constant variance
• Residuals should be independently distributed/no autocorrelation
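The quantities listed above (correlation, variance, standard deviation, residuals) and the multicollinearity condition can be sketched in plain Python. Everything here is illustrative: the data is made up, `pearson_r` is a hypothetical helper, and the line y = 7.5x + 41.5 is assumed to be the least-squares fit for this particular hours/marks data.

```python
import math
import statistics

def pearson_r(xs, ys):
    """Pearson correlation coefficient r between two equal-length sequences."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data.
hours = [1, 2, 3, 4, 5]
sleep = [6, 8, 5, 7, 6]
marks = [50, 55, 65, 70, 80]

r = pearson_r(hours, marks)              # association between hours and marks
variance = statistics.pvariance(marks)   # spread of the marks (population variance)
std_dev = statistics.pstdev(marks)       # square root of the variance

# Residual = actual value minus predicted value, for the assumed fitted line.
m, b = 7.5, 41.5
residuals = [y - (m * x + b) for x, y in zip(hours, marks)]

# Multicollinearity check: correlation between the two explanatory variables.
r_explanatory = pearson_r(hours, sleep)

print(f"r = {r:.3f}, variance = {variance}, std dev = {std_dev:.3f}")
print(f"residuals = {residuals}")
print(f"correlation between explanatory variables = {r_explanatory:.3f}")
```

A correlation between explanatory variables close to 0 suggests multicollinearity is not a concern, while a value near +1 or -1 is a warning sign. Where exactly to draw the line is a judgment call, not a fixed rule.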
The cost function measures how well a machine learning model performs.
It is the error between predicted values and actual values, summarized as a single real number.
The difference between the cost function and loss function is as follows:
The cost function is the average error over the n samples in the data (the whole training data), and the loss function is the error for an individual data point (one training example).
The cost function of linear regression is usually the mean squared error (MSE) or the root mean squared error (RMSE). RMSE is simply the square root of MSE, so the same line minimizes both. The errors are squared so that positive and negative errors don't cancel each other out.
Now you may be wondering where the slope and intercept come into the picture. Here it is.
J = (1/n) * sum(square(pred - y))
pred = mx + b
J = (1/n) * sum(square((mx + b) - y))
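To see how the cost depends on the slope and intercept, the sketch below evaluates the mean squared error J for two candidate lines on the same made-up hours/marks data. The function name `cost` is hypothetical, and the first pair (m = 7.5, b = 41.5) is assumed to be the least-squares line for this data.

```python
import math

def cost(m, b, xs, ys):
    """Mean squared error J for the line pred = m*x + b."""
    n = len(xs)
    return sum(((m * x + b) - y) ** 2 for x, y in zip(xs, ys)) / n

# Hypothetical data: hours of preparation and marks obtained.
hours = [1, 2, 3, 4, 5]
marks = [50, 55, 65, 70, 80]

j_best = cost(7.5, 41.5, hours, marks)   # assumed least-squares line
j_other = cost(5.0, 50.0, hours, marks)  # an arbitrary alternative line
rmse_best = math.sqrt(j_best)            # RMSE is the square root of MSE

print(f"J(best) = {j_best}, J(other) = {j_other}, RMSE(best) = {rmse_best:.3f}")
```

The least-squares line gives J = 1.5 while the alternative gives J = 15.0. The best-fit line is the choice of m and b that makes J as small as possible.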
Linear Regression Code:
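A minimal sketch of a simple linear regression, assuming NumPy is available and using made-up hours/marks data. It fits a degree-1 polynomial with `np.polyfit`, which returns the slope first and the intercept second, then predicts the marks for a new student. The value `new_hours = 6` is an illustrative input only.

```python
import numpy as np

# Hypothetical data: hours of preparation (x) and marks obtained (y).
hours = np.array([1, 2, 3, 4, 5], dtype=float)
marks = np.array([50, 55, 65, 70, 80], dtype=float)

# Fit y = m*x + b by least squares (degree-1 polynomial).
m, b = np.polyfit(hours, marks, 1)

# Predictions on the training data and the resulting mean squared error.
pred = m * hours + b
mse = np.mean((pred - marks) ** 2)

# Predict the marks for a student who prepares for 6 hours.
new_hours = 6.0
predicted_marks = m * new_hours + b

print(f"slope = {m:.2f}, intercept = {b:.2f}, MSE = {mse:.2f}")
print(f"predicted marks for {new_hours} hours = {predicted_marks:.2f}")
```

Libraries such as scikit-learn wrap the same idea in a model object, but the fit itself reduces to finding the m and b that minimize the squared error.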
Some examples of linear regression:
Impact of product price on the number of sales
Agricultural scientists use linear regression to measure the effect of fertilizer on crop yield
Impact of drug dosage on blood pressure