
Feature Selection and Dimensionality Reduction Using a Covariance Matrix Heatmap

This Python tutorial explains how to handle one of the most common tasks in data science and data analysis: feature selection and dimensionality reduction.
There are many ways to do it, but in this video I demonstrate one of the quickest, which is suitable for beginners as well as data scientists and machine learning experts.

The approach combines data inspection with feature selection based on a pairplot (made with Seaborn) and a heatmap (Seaborn and Matplotlib).

To implement this solution, you must have the following Python modules installed:
- NumPy
- Pandas
- Matplotlib
- Seaborn

The content of the video:
0:09 - Introduction and some theory.
1:59 - Coding part begins. Preparing Python modules.
2:14 - Reading the dataset with Pandas.

Step #1.
2:41 - Inspecting the imported DataFrame (features).

Step #1.1
2:49 - Selecting numerical and dummy (if any) variables from the dataset.

Step #1.2
3:21 - Generating a pairplot with Seaborn.

Step #2 and Step #2.1
3:42 - Variable selection from the covariance matrix. Scaling features from the raw dataset.

Step #2.2
4:05 - Generating the covariance matrix with Matplotlib and Seaborn.
5:08 - Selecting a cmap (colormap) value for the heatmap from the official Seaborn documentation.
6:04 - Result: covariance matrix showing the correlation coefficients between the selected features.
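
A rough sketch of steps 2.1 and 2.2 is shown here. The toy data, the "coolwarm" colormap, and the file name "heatmap.png" are assumptions for illustration, and the scaling is done by hand with Pandas rather than with any particular scaler class. After standardization, the covariance between two features equals their Pearson correlation coefficient, which is why a covariance matrix of scaled features can be read as correlation coefficients.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical numerical features (not the dataset from the video).
numeric_df = pd.DataFrame({
    "age": [25, 32, 47, 51, 38, 29],
    "income": [30000, 42000, 61000, 68000, 50000, 36000],
    "height_cm": [170, 165, 180, 175, 168, 172],
})

# Step 2.1: standardize every column to zero mean and unit variance
# (sample standard deviation, ddof=1, which matches np.cov's default).
scaled = (numeric_df - numeric_df.mean()) / numeric_df.std()

# Step 2.2: np.cov expects one variable per row, hence the transpose.
cov_matrix = np.cov(scaled.to_numpy().T)

# Draw the matrix as an annotated heatmap.
labels = list(numeric_df.columns)
fig, ax = plt.subplots(figsize=(6, 5))
sns.heatmap(cov_matrix, annot=True, fmt=".2f", cmap="coolwarm",
            vmin=-1, vmax=1, xticklabels=labels, yticklabels=labels, ax=ax)
ax.set_title("Covariance matrix of standardized features")
fig.tight_layout()
fig.savefig("heatmap.png")  # arbitrary file name for this sketch
```

Fixing vmin and vmax at -1 and 1 keeps the color scale symmetric, so strongly positive and strongly negative relationships stand out equally.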

Step #3.
6:16 - Constructing a Pandas DataFrame from the most important selected features.
6:45 - The result: a Pandas DataFrame constructed from the most important features.
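
One possible way to automate step 3 is sketched here. The selection rule (drop one feature from every pair whose absolute correlation exceeds a threshold) and the threshold value of 0.9 are assumptions made for this example, not necessarily the criterion used in the video, where the features are picked by reading the heatmap. The toy data is hypothetical as well.

```python
import numpy as np
import pandas as pd

# Hypothetical numerical features (not the dataset from the video).
df = pd.DataFrame({
    "age": [25, 32, 47, 51, 38, 29],
    "income": [30000, 42000, 61000, 68000, 50000, 36000],
    "height_cm": [170, 165, 180, 175, 168, 172],
})

# Absolute correlations: the sign does not matter for redundancy.
corr = df.corr().abs()

# Keep only the upper triangle (k=1 excludes the diagonal),
# so every pair of features is examined exactly once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Assumed rule: drop a feature if it is highly correlated
# with any feature that appears before it in the DataFrame.
threshold = 0.9
to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]

# Step 3: build the reduced DataFrame from the remaining features.
selected_df = df.drop(columns=to_drop)
print("Dropped:", to_drop)
print("Kept:", selected_df.columns.tolist())
```

In this toy data, "age" and "income" are almost perfectly correlated, so one of them carries little extra information and is removed. Which of the two to keep is a judgment call that depends on the problem.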

--------
Selecting Seaborn and Matplotlib colormap:

This video was created to demonstrate how to implement feature engineering for feature selection and dimensionality reduction with a very simple dataset.
In the real world, please pay close attention to data pre-processing and data cleaning!

I hope this is useful for data scientists, data analysts, and everyone who works with data.

Best wishes! - Vytautas.

feature selection,dimensionality reduction,seaborn heatmap,covariance matrix,correlation matrix,heatmap seaborn,heatmap matplotlib,pairplot seaborn,python read data,python heatmap,python feature selection,numpy cov,numpy covariance,numerical variables,numerical features,generate pairplot seaborn,heatmap with seaborn,heatmap with matplotlib,scaling features python,scaling variables,standard scaler,sklearn,colormap matplotlib,colormap seaborn,
