Exploratory Data Analysis using pandas visual analysis library

Amit Jain

Actively looking for new job | 6.10+ YoE as a Data Scientist

Published Nov 12, 2021

Pandas Visual Analysis is an open-source Python library for analyzing data visually, and it does so in a single line of code.

It creates a user interface in which we can build different plots and graphs from the attributes of our dataset.

It supports a large variety of graphs and plots. All of them are created with Plotly, so they are highly interactive, visually appealing, and easily downloadable.

Pandas Visual Analysis is a Python package for interactive visual analysis in Jupyter notebooks.

It generates an interactive visual analysis widget for analyzing a pandas DataFrame.

This allows data exploration and cognition to be simple, even with complex multivariate datasets.

There is no need to create and style plots by hand; the library automates the whole data exploration part.

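# import the library
from pandas_visual_analysis import VisualAnalysis

# `dataset` must be a pandas DataFrame that is already loaded;
# run this in a Jupyter notebook cell to render the interactive widget
VisualAnalysis(dataset)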
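The call above needs a DataFrame to work on. A minimal sketch of the full setup follows, assuming the package has been installed (for example with `pip install pandas-visual-analysis`). The toy data, the column names, and the `show_widget` helper are hypothetical. The `categorical_columns` argument is taken from the project's README as best I recall, so treat it as an assumption and check it against your installed version.

```python
import pandas as pd

# Hypothetical toy data; any pandas DataFrame works the same way.
dataset = pd.DataFrame({
    "species": ["a", "a", "b", "b", "c", "c"],
    "length": [5.1, 4.9, 6.3, 5.8, 7.1, 6.5],
    "width": [3.5, 3.0, 3.3, 2.7, 3.0, 3.2],
})


def show_widget(df):
    """Build the interactive widget for df (call this from a Jupyter cell)."""
    # Imported lazily so the rest of this sketch runs without the package.
    from pandas_visual_analysis import VisualAnalysis

    # categorical_columns is assumed from the project's README;
    # drop the argument if your installed version rejects it.
    return VisualAnalysis(df, categorical_columns=["species"])
```

In a notebook, putting `show_widget(dataset)` on the last line of a cell should display the widget, in the same way as the bare `VisualAnalysis(dataset)` call.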

The widget gives us three selection types.

Standard: it describes our dataset. Everything shown under the Standard selection type is what we would get by calling dataset.describe().
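For comparison, the same summary can be produced outside the widget. This is a minimal sketch, assuming a small hypothetical DataFrame; the column names and values are illustrative and are not the article's dataset.

```python
import pandas as pd

# Hypothetical toy data standing in for `dataset`.
dataset = pd.DataFrame({
    "height": [150, 160, 170, 180],
    "weight": [50, 60, 70, 80],
})

# describe() returns count, mean, std, min, quartiles, and max
# for every numeric column.
summary = dataset.describe()
print(summary)
```

Each row of `summary` is one statistic, so `summary.loc["mean", "height"]` reads a single value out of it.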

Subtractive: we can choose particular features and create a scatter plot of them. With the subtractive selection type, we can select some data points in the scatter plot and remove them, which helps us analyze the impact of those particular points on our dataset. The points are not removed permanently; they are excluded only for the purpose of exploration.

In the snapshot below, we first remove the data points in the red highlighted area. In the next snapshot, the removed points are shown in grey (grey means the points have been removed), and comparing the left-hand side of the two images shows the change caused by removing this small set of data points.

From the above snapshot, we can easily see the impact that removing the red highlighted points has on our dataset.

There are two more graphs. The first shows how much data is gone after removing the red highlighted area from the dataset.

Here, the greyish part is the data that was removed along with the red highlighted area in the plot above.

This helps us understand the impact of removing specific data points, for example points that lie far from the mean of the dataset.
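The same before-and-after comparison can be approximated in plain pandas. This is only a sketch of the idea, not how the widget works internally. The toy data and the threshold of 10 are hypothetical, and a boolean mask stands in for brushing points in the scatter plot.

```python
import pandas as pd

# Hypothetical toy data with one point far away from the rest.
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 50.0],
    "y": [2.0, 4.0, 6.0, 8.0, 100.0],
})

# "Brush" the points to drop: here, everything with x above 10.
brushed = df["x"] > 10

# Keep the remaining rows in a new frame; df itself is left untouched,
# which mirrors the non-permanent removal described above.
kept = df[~brushed]

mean_before = df["y"].mean()   # 24.0
mean_after = kept["y"].mean()  # 5.0
print(mean_before, mean_after)
```

Dropping a single distant point moves the mean of `y` from 24.0 to 5.0, which is the kind of impact the subtractive selection makes visible.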

Additive: once we have removed some data points, the additive selection type lets us add them back.

Now add all the removed data back and look at the changes in our plots.
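One way to picture the remove-then-add-back cycle is as operations on a boolean selection mask. This is a conceptual sketch with hypothetical toy data; it does not reproduce the library's internal code.

```python
import pandas as pd

# Hypothetical toy data.
df = pd.DataFrame({"x": [1, 2, 3, 4, 5, 6]})

# Start with every row selected.
selected = pd.Series(True, index=df.index)

# Subtractive step: deselect the brushed rows.
brush = df["x"] >= 5
selected = selected & ~brush
n_after_remove = int(selected.sum())  # 4

# Additive step: brush the same rows again to bring them back.
selected = selected | brush
n_after_add = int(selected.sum())  # 6
print(n_after_remove, n_after_add)
```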

The widget also provides an option to normalize the features.
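The article does not say which normalization the widget applies. As an illustration only, the sketch below assumes min-max scaling, which maps each column to the range 0 to 1. The toy data is hypothetical, and the widget may use a different method.

```python
import pandas as pd

# Hypothetical toy data with features on very different scales.
df = pd.DataFrame({
    "age": [20, 30, 40],
    "income": [1000, 3000, 5000],
})

# Min-max scaling (an assumption, see above): subtract each column's
# minimum and divide by its range.
normalized = (df - df.min()) / (df.max() - df.min())
print(normalized)
```

After scaling, both columns run from 0.0 to 1.0, so they can share one axis without the larger feature dominating the plot.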

You can check my GitHub profile for code.
