Linear regression is a supervised machine-learning algorithm used for regression problems, that is, for predicting a continuous value. Multiple linear regression extends it to more than one independent variable, and it can be implemented quickly in Python with the scikit-learn (sklearn) library by following the steps below. In this tutorial, I will use this dataset, which is available on Kaggle. You can also use your own dataset.
Step 1: Import the required libraries
import pandas as pd
import numpy as np
Step 2: Read the dataset
After downloading the dataset, read it from the location where you saved it.
dataset = pd.read_csv('regression_data.csv')
print(dataset.head())

   age  experience  income
0   25           1   30450
1   30           3   35670
2   47           2   31580
3   32           5   40130
4   43          10   47830
Step 3: Separate the independent and dependent variables
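# Independent variables: every column except the last (age, experience)
X = dataset.iloc[:, :-1].values
# Dependent variable: the column at index 2 (income)
y = dataset.iloc[:, 2].values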
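To make the slicing concrete, here is a small sketch that rebuilds only the five preview rows printed above in an in-memory DataFrame and checks what `iloc` selects. The name `preview` is made up for this illustration, and the real CSV may contain more rows than these five.

```python
import pandas as pd

# Only the five rows printed by dataset.head() above; the real file may have more.
preview = pd.DataFrame({
    "age": [25, 30, 47, 32, 43],
    "experience": [1, 3, 2, 5, 10],
    "income": [30450, 35670, 31580, 40130, 47830],
})

# All columns except the last are the features (age, experience).
X = preview.iloc[:, :-1].values
# The column at index 2 is the target (income).
y = preview.iloc[:, 2].values

print(X.shape)  # (5, 2): five samples, two features
print(y.shape)  # (5,): one target value per sample
```

Note that `X` is two-dimensional even though it holds only two features, which is the shape scikit-learn estimators expect.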
Step 4: Split the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split

# Hold out 25% of the rows for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
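As a rough illustration of what `test_size=0.25` does, the sketch below splits a hypothetical toy array of 8 samples (not the tutorial's dataset) and prints the resulting shapes. The names `X_demo`, `y_demo`, `X_tr`, and so on are placeholders chosen for this example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 8 samples with 2 features each.
X_demo = np.arange(16).reshape(8, 2)
y_demo = np.arange(8)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0
)

print(X_tr.shape, X_te.shape)  # (6, 2) (2, 2)
```

Rows of the feature array and the target array are shuffled together, so each sample keeps its own target value after the split.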
Step 5: Fit Multiple Linear Regression to the Training set using Sklearn
from sklearn.linear_model import LinearRegression

# Create the model and learn the coefficients from the training data
model = LinearRegression()
model.fit(X_train, y_train)
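`LinearRegression` fits an ordinary least-squares model. The sketch below, run on synthetic data rather than the tutorial's dataset, solves the same least-squares problem directly with NumPy and compares the result with scikit-learn. All names ending in `_demo` and the generating coefficients are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration, not the tutorial's dataset).
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 10, size=(50, 2))
y_demo = 3.0 + 2.0 * X_demo[:, 0] - 1.5 * X_demo[:, 1] + rng.normal(0, 0.1, size=50)

# Fit with scikit-learn.
model_demo = LinearRegression().fit(X_demo, y_demo)

# Solve the same least-squares problem directly: prepend a column of ones
# so the first solved parameter plays the role of the intercept.
A = np.column_stack([np.ones(len(X_demo)), X_demo])
params, *_ = np.linalg.lstsq(A, y_demo, rcond=None)

print(np.allclose(params[0], model_demo.intercept_))
print(np.allclose(params[1:], model_demo.coef_))
```

Both checks should agree up to floating-point tolerance, which shows that the fitted model is just the intercept and coefficients that minimize the squared error on the training data.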
Step 6: Prediction on the Test set
y_pred = model.predict(X_test)
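The same `predict` call works for a single new sample, as long as the input is a 2D array with one row per sample. The sketch below uses hypothetical noise-free data (the formula and the `_demo` names are assumptions, not taken from the tutorial's dataset) so that the prediction can be checked by hand.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical noise-free data: income = 30000 - 100*age + 2000*experience.
X_demo = np.array([[25, 1], [30, 3], [47, 2], [32, 5], [43, 10], [38, 7]])
y_demo = 30000 - 100 * X_demo[:, 0] + 2000 * X_demo[:, 1]

model_demo = LinearRegression().fit(X_demo, y_demo)

# predict() expects a 2D array: one row per sample, one column per feature.
new_person = np.array([[35, 4]])  # age 35, 4 years of experience
prediction = model_demo.predict(new_person)

print(prediction.shape)  # (1,): one prediction per input row
print(prediction[0])
```

Because the toy data follows the formula exactly, the prediction should come out very close to 30000 - 3500 + 8000 = 34500.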
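Step 7: R² score on the training set and test set
For a regression model, the "accuracy" reported here is the coefficient of determination (R²), not classification accuracy. A value close to 1 means the model explains most of the variance in the target.
from sklearn.metrics import r2_score

# r2_score takes the true values first, then the predictions
print("Accuracy on training set: ", r2_score(y_train, model.predict(X_train)))
print("Accuracy on test set:", r2_score(y_test, y_pred))

Accuracy on training set: 0.9874264004414047
Accuracy on test set: 0.9307237996010831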
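To show what the R² score measures, here is a minimal sketch that computes it by hand as 1 minus the ratio of the residual sum of squares to the total sum of squares, then compares it with `r2_score`. The arrays `y_true` and `y_hat` are hypothetical values chosen only for illustration.

```python
import numpy as np
from sklearn.metrics import r2_score

# Hypothetical true and predicted values, chosen only for illustration.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.5, 5.5, 7.0, 8.0])

ss_res = np.sum((y_true - y_hat) ** 2)          # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
r2_manual = 1 - ss_res / ss_tot

print(r2_manual)
print(r2_score(y_true, y_hat))
```

Here the residual sum of squares is 1.5 and the total sum of squares is 20, so both lines should print a value of about 0.925.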
Step 8: Coefficients and Intercept
# One coefficient per feature, in column order (age, experience)
print("Coefficients: ", model.coef_)
print("Intercept: ", model.intercept_)

Coefficients: [-107.40804718 2168.87369682]
Intercept: 31734.098811233776
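The coefficients and intercept define the fitted equation: income ≈ intercept + coefficient for age × age + coefficient for experience × experience. As a sketch, the snippet below plugs the values printed above into that equation for one hypothetical input (age 30, 3 years of experience). The variable names are placeholders for this example.

```python
import numpy as np

# Values copied from the output above.
coef = np.array([-107.40804718, 2168.87369682])  # order: [age, experience]
intercept = 31734.098811233776

# Hypothetical input: age 30, 3 years of experience.
x_new = np.array([30, 3])

# income ≈ intercept + coef[0] * age + coef[1] * experience
predicted_income = intercept + coef @ x_new
print(round(predicted_income, 2))
```

The result is roughly 35,018, which can be compared with the income of 35670 shown for the row with the same age and experience in the dataset preview.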
Step 9: Other Evaluation Metrics: MAE, MSE, RMSE
from sklearn.metrics import mean_absolute_error, mean_squared_error

print("MAE: ", mean_absolute_error(y_test, y_pred))
print("MSE: ", mean_squared_error(y_test, y_pred))
# RMSE is the square root of MSE, expressed in the same units as the target
print("RMSE: ", np.sqrt(mean_squared_error(y_test, y_pred)))

MAE: 1666.4074288570423
MSE: 3570198.8664224693
RMSE: 1889.49698767224
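For reference, the three metrics can also be computed directly from the prediction errors. The sketch below does this with NumPy on hypothetical values (`y_true` and `y_hat` are made up for illustration) and checks the results against scikit-learn.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Hypothetical true and predicted values, chosen only for illustration.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.5, 5.5, 7.0, 8.0])

errors = y_true - y_hat
mae = np.mean(np.abs(errors))  # mean absolute error
mse = np.mean(errors ** 2)     # mean squared error
rmse = np.sqrt(mse)            # root mean squared error, same units as the target

print(np.isclose(mae, mean_absolute_error(y_true, y_hat)))
print(np.isclose(mse, mean_squared_error(y_true, y_hat)))
print(mae, mse, rmse)
```

MAE and RMSE are in the units of the target (income in this tutorial), while MSE is in squared units, which is why RMSE is usually easier to interpret.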