Sklearn Multiple Linear Regression | Linear Regression

Linear regression is a supervised machine-learning algorithm used for regression problems, that is, for predicting a continuous target value. Multiple linear regression fits a linear model to two or more independent variables, and it is quick to implement in Python with the Sklearn library. This tutorial walks through the steps using a dataset that is available on Kaggle. You can also use your own dataset.

Step 1: Import the required libraries

import pandas as pd
import numpy as np

Step 2: Read the dataset

After downloading the dataset, read it from the location where you saved it and preview the first five rows.

dataset = pd.read_csv('regression_data.csv')
print(dataset.head())
   age  experience  income
0   25           1   30450
1   30           3   35670
2   47           2   31580
3   32           5   40130
4   43          10   47830

Step 3: Separate the independent and dependent variables

# Features: every column except the last (age, experience)
X = dataset.iloc[:, :-1].values
# Target: the last column (income)
y = dataset.iloc[:, -1].values
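
To make the slicing concrete, here is a minimal sketch that rebuilds only the five preview rows shown in Step 2 and applies the same split into features and target. The names `sample`, `X_sample`, and `y_sample` are illustrative and not part of the original tutorial.

```python
import pandas as pd

# The five preview rows from Step 2, typed in by hand for illustration.
sample = pd.DataFrame({
    "age": [25, 30, 47, 32, 43],
    "experience": [1, 3, 2, 5, 10],
    "income": [30450, 35670, 31580, 40130, 47830],
})

X_sample = sample.iloc[:, :-1].values  # all columns except the last
y_sample = sample.iloc[:, -1].values   # the last column (income)

print(X_sample.shape)  # (5, 2): five rows, two features
print(y_sample.shape)  # (5,): one target value per row
print(X_sample[0], y_sample[0])
```

X is a 2-D array with one column per feature, and y is a 1-D array with one value per row, which is the layout Sklearn estimators expect.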

Step 4: Split the dataset into the Training set and Test set

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
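
As a rough illustration of what `test_size` and `random_state` do, the sketch below splits a small made-up array. The data and the names `X_demo`, `y_demo`, `X_tr`, and `X_te` are hypothetical and chosen only so the split sizes are easy to check.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 8 rows, 2 features, used only to show the split sizes.
X_demo = np.arange(16).reshape(8, 2)
y_demo = np.arange(8)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0
)

print(X_tr.shape, X_te.shape)  # (6, 2) (2, 2)

# The same random_state reproduces the same split.
_, X_te_again, _, _ = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0
)
print(np.array_equal(X_te, X_te_again))  # True
```

With test_size = 0.25, a quarter of the rows are held out for testing. Fixing random_state makes the shuffle repeatable, so the scores in the later steps can be reproduced.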

Step 5: Fit multiple linear regression to the training set using Sklearn

from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(X_train, y_train)

Step 6: Prediction on the Test set

y_pred = model.predict(X_test)
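
The same `predict` call also works for a single new observation. The sketch below is self-contained, so it fits a separate model on only the five preview rows from Step 2. Its coefficients will therefore differ from those of the model trained on the full dataset. The names `demo_model` and `new_person` and the input values are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# The five preview rows from Step 2: columns are age and experience.
X_small = np.array([[25, 1], [30, 3], [47, 2], [32, 5], [43, 10]])
y_small = np.array([30450, 35670, 31580, 40130, 47830])

demo_model = LinearRegression()
demo_model.fit(X_small, y_small)

# predict expects a 2-D array: one row per observation, one column per feature.
new_person = np.array([[35, 4]])  # hypothetical: age 35, 4 years of experience
predicted_income = demo_model.predict(new_person)

print(predicted_income.shape)  # (1,)
print(predicted_income[0])
```

Passing a flat list such as [35, 4] raises an error, because Sklearn expects one row per observation even when there is only one.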

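Step 7: R² score on the training set and test set

For regression, r2_score reports the coefficient of determination (R²), not classification accuracy. A value closer to 1 means the model explains more of the variance in income.

from sklearn.metrics import r2_score
print("R2 score on training set:", r2_score(y_train, model.predict(X_train)))
print("R2 score on test set:", r2_score(y_test, y_pred))
R2 score on training set: 0.9874264004414047
R2 score on test set: 0.9307237996010831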
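
The value returned by `r2_score` is the coefficient of determination, defined as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below computes it by hand on made-up numbers and compares the result with Sklearn. The arrays `y_true` and `y_hat` are hypothetical and not taken from the tutorial's dataset.

```python
import numpy as np
from sklearn.metrics import r2_score

# Hypothetical actual and predicted incomes, chosen only for illustration.
y_true = np.array([30000.0, 35000.0, 40000.0, 45000.0])
y_hat = np.array([31000.0, 34000.0, 41000.0, 44000.0])

ss_res = np.sum((y_true - y_hat) ** 2)          # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
r2_manual = 1 - ss_res / ss_tot                 # about 0.968 for these numbers

print(r2_manual)
print(np.isclose(r2_manual, r2_score(y_true, y_hat)))  # True
```

A training score that is noticeably higher than the test score, as in the output above, is a common sign of mild overfitting, although a small gap is normal.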

Step 8: Coefficients and Intercept

print("Coefficients: ", model.coef_)
print("Intercept: ", model.intercept_)
Coefficients:  [-107.40804718 2168.87369682]
Intercept:  31734.098811233776
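
These numbers define the fitted equation: income = intercept + coefficient for age × age + coefficient for experience × experience. As a minimal sketch, the function below applies that equation by hand with the values printed above. The function name `predict_income` and the example input are hypothetical.

```python
# Values copied from the printed coefficients and intercept above.
intercept = 31734.098811233776
coef_age = -107.40804718
coef_experience = 2168.87369682


def predict_income(age, experience):
    """Apply the fitted linear equation by hand."""
    return intercept + coef_age * age + coef_experience * experience


# Hypothetical input: 30 years old with 3 years of experience.
print(predict_income(30, 3))  # roughly 35018.48
```

Read this way, each additional year of experience raises the predicted income by about 2169 with age held fixed, while each additional year of age lowers it by about 107 with experience held fixed.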

Step 9: Other Evaluation Metrics: MAE, MSE, RMSE

from sklearn.metrics import mean_absolute_error, mean_squared_error
print("MAE: ", mean_absolute_error(y_test, y_pred))
print("MSE: ", mean_squared_error(y_test, y_pred))
print("RMSE: ", np.sqrt(mean_squared_error(y_test, y_pred)))
MAE:  1666.4074288570423
MSE:  3570198.8664224693
RMSE:  1889.49698767224
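
To show what these three metrics measure, the sketch below computes them by hand with NumPy on made-up numbers. The arrays `y_true` and `y_hat` are hypothetical. The last line checks that the RMSE printed above is the square root of the MSE printed above.

```python
import numpy as np

# Hypothetical actual and predicted incomes, chosen only for illustration.
y_true = np.array([30000.0, 35000.0, 40000.0, 45000.0])
y_hat = np.array([31000.0, 33000.0, 40500.0, 45000.0])

errors = y_true - y_hat        # -1000, 2000, -500, 0

mae = np.mean(np.abs(errors))  # mean absolute error: 875.0
mse = np.mean(errors ** 2)     # mean squared error: 1312500.0
rmse = np.sqrt(mse)            # root mean squared error: about 1145.64

print(mae, mse, rmse)

# Consistency check on the tutorial's own output: RMSE is sqrt(MSE).
print(np.isclose(np.sqrt(3570198.8664224693), 1889.49698767224))  # True
```

MAE and RMSE are in the same units as income, which makes them easier to interpret than MSE. RMSE penalizes large errors more heavily than MAE does.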
