How to import datasets from sklearn

The scikit-learn library ships with a few small datasets. They can be accessed directly, with no need to download files from an external website. In this article, I will import six datasets from the scikit-learn library.

How to import Iris plants dataset from sklearn

The Iris plants dataset has four features: sepal length, sepal width, petal length, and petal width. All values are in centimeters (cm). The target variable has three classes: Setosa (0), Versicolor (1), and Virginica (2). There are 150 instances (50 per class).

from sklearn import datasets
iris_dataset = datasets.load_iris()
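The object returned by load_iris() behaves like a dictionary with attribute access. As a minimal sketch, the snippet below inspects the attributes that correspond to the description above; the printed summary is only for illustration.

```python
from sklearn import datasets

iris_dataset = datasets.load_iris()

# Feature matrix: one row per flower, one column per measurement
print(iris_dataset.data.shape)  # (150, 4)

# Names of the four features and the three classes
print(iris_dataset.feature_names)
print(list(iris_dataset.target_names))

# Integer class labels (0, 1, 2), one per instance
print(iris_dataset.target[:5])
```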

How to import Boston house prices dataset from sklearn

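The Boston house prices dataset has 506 instances and 13 features. The target variable is MEDV, the median value of owner-occupied homes in thousands of dollars. Note that load_boston was deprecated in scikit-learn 1.0 and removed in version 1.2, so the call below only works on older versions.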

from sklearn import datasets
# Works only on scikit-learn < 1.2 (load_boston was removed in 1.2)
boston_dataset = datasets.load_boston()
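Because load_boston is no longer available in scikit-learn 1.2 and later, a script that has to run on different versions may want to guard the call. The following is a rough sketch of one way to do that; the helper name load_boston_if_available is hypothetical and not part of scikit-learn.

```python
from sklearn import datasets


def load_boston_if_available():
    """Hypothetical helper: return the Boston dataset, or None if this
    scikit-learn version no longer provides load_boston."""
    try:
        # Newer versions raise ImportError when this attribute is accessed
        loader = datasets.load_boston
    except (ImportError, AttributeError):
        return None
    return loader()


boston_dataset = load_boston_if_available()
if boston_dataset is None:
    print("load_boston is not available in this scikit-learn version")
else:
    print(boston_dataset.data.shape)
```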

How to import Breast cancer dataset from sklearn

The Breast cancer dataset has 569 instances and 30 features. The target variable has two classes: malignant (0) and benign (1).

from sklearn import datasets
cancer_dataset = datasets.load_breast_cancer()
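If only the arrays are needed, the loader can return them directly through the return_X_y parameter. As a small illustration, the sketch below also counts how many instances fall into each class; the variable names X, y, and class_counts are chosen here for the example.

```python
import numpy as np
from sklearn import datasets

# return_X_y=True returns (features, labels) instead of the full dataset object
X, y = datasets.load_breast_cancer(return_X_y=True)

print(X.shape)  # (569, 30)

# Number of instances per class: index 0 is malignant, index 1 is benign
class_counts = np.bincount(y)
print(class_counts)  # [212 357]
```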

How to import Diabetes dataset from sklearn

The Diabetes dataset has 442 instances and 10 features. The target variable is a quantitative measure of disease progression, which makes this a regression dataset.

from sklearn import datasets
diabetes_dataset = datasets.load_diabetes()
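Unlike the classification datasets, the target here is a continuous value rather than a class label. The short sketch below, given only as an illustration, lists the feature names and looks at the range of the target.

```python
from sklearn import datasets

diabetes_dataset = datasets.load_diabetes()

# Ten baseline features per patient
print(diabetes_dataset.feature_names)
print(diabetes_dataset.data.shape)  # (442, 10)

# The target is a real-valued measure, so inspect its range instead of classes
target = diabetes_dataset.target
print(target.min(), target.max())
```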

How to import Optical recognition of handwritten digits dataset from sklearn

The handwritten digits dataset has 1797 instances. Each instance is an 8×8 image flattened into 64 features, with integer pixel values in the range 0 to 16.

from sklearn import datasets
digits_dataset = datasets.load_digits()
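The dataset exposes the same pixels in two layouts: data holds the flattened 64-value rows, and images holds the original 8×8 grids. A minimal sketch to confirm the relationship between the two (the variable name flattened is chosen here for the example):

```python
import numpy as np
from sklearn import datasets

digits_dataset = datasets.load_digits()

print(digits_dataset.data.shape)    # (1797, 64)
print(digits_dataset.images.shape)  # (1797, 8, 8)

# Flattening each 8x8 image row by row gives the 64-feature representation
flattened = digits_dataset.images.reshape(len(digits_dataset.images), -1)
print(np.array_equal(flattened, digits_dataset.data))  # True

# Pixel intensities stay within the 0 to 16 range
print(digits_dataset.data.min(), digits_dataset.data.max())
```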

How to import Linnerud dataset from sklearn

The Linnerud dataset has 20 instances. There are 3 exercise variables (features): chins, situps, and jumps, and 3 physiological variables (targets): weight, waist, and pulse.

from sklearn import datasets
linnerud_dataset = datasets.load_linnerud()
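Since this dataset has three target variables instead of one, the target attribute is a two-dimensional array. The sketch below is a simple check of the shapes and names described above.

```python
from sklearn import datasets

linnerud_dataset = datasets.load_linnerud()

# 20 instances, 3 exercise features
print(linnerud_dataset.data.shape)  # (20, 3)
print(linnerud_dataset.feature_names)

# 20 instances, 3 physiological targets (multi-output)
print(linnerud_dataset.target.shape)  # (20, 3)
print(linnerud_dataset.target_names)
```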
