Linear Regression
Linear Regression is a supervised machine learning algorithm used for regression tasks.
It’s considered the most basic ML algorithm and forms the foundation for other algorithms like Logistic Regression.
What is Linear Regression?
Linear Regression calculates the linear relationship between a dependent variable (target)
and one or more independent variables (features).
This relationship is expressed through a linear prediction function.
The main goal is to find the best fit line: a straight line that minimizes the vertical distance (error) between itself and the data points on a scatter plot.
Linear Prediction Function Formula:
Y = WX + B
(Equation of simple Linear Regression)
Y: predicted output
W: weight (coefficient)
X: input feature
B: bias (intercept)
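To make this concrete, here is a minimal NumPy sketch of the prediction function (the names predict, w, and b are illustrative, not from any library):

import numpy as np

def predict(X, w, b):
    # Y = W*X + B: the weight scales each input, the bias shifts the line
    return w * X + b

X = np.array([1.0, 2.0, 3.0])
print(predict(X, w=2.0, b=0.5))   # [2.5 4.5 6.5]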
Weight and Bias
Weights and bias are the key parameters in Linear Regression:
Weight (W) determines the slope of the line, showing how much the dependent variable changes for a unit change in the independent variable.
Bias (B) shifts the line up or down to better fit the data.
Accuracy Metrics
To evaluate the performance of a Linear Regression model, we use several metrics:
Mean Squared Error (MSE)
MSE measures the average squared difference between actual and predicted values.
It emphasizes larger errors due to squaring.
MSE Formula:
MSE = (1/n) * sum((y - y_pred)**2)
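As a quick sketch, this formula translates directly into NumPy (the helper name mse is our own, not a library function):

import numpy as np

def mse(y, y_pred):
    # mean of squared residuals; squaring weights large errors more heavily
    return np.mean((y - y_pred) ** 2)

y = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.0, 8.0])
print(mse(y, y_pred))   # (0.25 + 0.0 + 1.0) / 3 ≈ 0.4167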
Mean Absolute Error (MAE)
MAE calculates the average absolute difference between actual and predicted values.
It provides a linear score without amplifying large errors.
MAE Formula:
MAE = (1/n) * sum(|y - y_pred|)
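The same data can be scored with a hand-rolled MAE helper (again, an illustrative sketch rather than a library call):

import numpy as np

def mae(y, y_pred):
    # mean of absolute residuals; every unit of error counts equally
    return np.mean(np.abs(y - y_pred))

y = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.0, 8.0])
print(mae(y, y_pred))   # (0.5 + 0.0 + 1.0) / 3 = 0.5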
R² Score
The R² score indicates how well the independent variables explain the variability of the dependent variable.
A value closer to 1 means a better fit.
R² Score Formula:
R2 Score = 1 - sum((y - y_pred)**2) / sum((y - y_mean)**2)
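A minimal sketch of this formula in NumPy (the helper name r2_score mirrors scikit-learn's metric of the same name, but this version is hand-written):

import numpy as np

def r2_score(y, y_pred):
    # fraction of the target's variance explained by the model
    ss_res = np.sum((y - y_pred) ** 2)        # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)    # total sum of squares
    return 1 - ss_res / ss_tot

y = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.0, 8.0])
print(r2_score(y, y_pred))   # 1 - 1.25/8 = 0.84375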
Cost Function or Loss Function
In Linear Regression, we use the Mean Squared Error (MSE) as the cost function to determine how well our model fits the data.
How to Find the Best Fit Line
The best fit line is obtained by finding the optimal values for weights and bias.
This is achieved by minimizing the cost function using an optimization technique called Gradient Descent.
Gradient Descent for Linear Regression
Gradient Descent is an optimization algorithm that iteratively adjusts the parameters
to minimize a function by following the direction of steepest descent.
Learning Rate: A small scalar value that determines the step size during each iteration.
Gradient: The vector of partial derivatives of a function with respect to its variables (here, the weights and bias).
How Does Gradient Descent Find Optimal Weights and Bias?
Gradient Descent calculates the gradients of the cost function with respect to weights and bias
(i.e., weight gradient and bias gradient).
These gradients, scaled by the learning rate, are subtracted from the current parameter values using an Update Rule to find the optimized values.
Weight and Bias Gradient Calculation
The cost function used here is MSE:
MSE Formula:
MSE = (1/n) * sum((y - (w*x + b))**2)
Partial Derivatives:
Weight Gradient = (-2/n) * sum(x * (y - (w*x + b)))
Bias Gradient = (-2/n) * sum(y - (w*x + b))
These gradients point in the direction of steepest ascent,
so we move in the opposite direction to minimize the cost.
This is done by subtracting the product of the gradient
and the learning rate from the current parameter values:
Update Rule:
Parameter = Parameter - lr * Gradient
lr: learning rate
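Putting the gradients and the update rule together, a minimal gradient-descent training loop might look like the sketch below (the learning rate and iteration count are illustrative choices, not prescribed values):

import numpy as np

def train(X, y, lr=0.01, n_iters=2000):
    w, b = 0.0, 0.0                              # start from arbitrary parameters
    n = len(X)
    for _ in range(n_iters):
        y_pred = w * X + b
        dw = (-2 / n) * np.sum(X * (y - y_pred))   # weight gradient
        db = (-2 / n) * np.sum(y - y_pred)         # bias gradient
        w -= lr * dw                             # update rule: parameter - lr * gradient
        b -= lr * db
    return w, b

X = np.array([1.0, 2.0, 3.0, 4.0])
y = 2 * X + 1                                    # data generated from Y = 2X + 1
w, b = train(X, y)
print(w, b)                                      # should approach w ≈ 2, b ≈ 1

On each iteration the loop recomputes the predictions, evaluates both gradients of the MSE cost, and nudges the parameters one small step in the opposite direction, exactly as described above.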
