Hands-On Tutorial On Holoviews – Automated Visualization Based On Short Data Annotations – Analytics India Magazine

  • MatterAdmin
  • September 16, 2020

Data visualization is the practice of representing data graphically in order to find anomalies, patterns, or trends in a dataset. A variety of plots and graphs can be used to examine different properties of a dataset's attributes. Visualization is one of the easiest ways to understand data, because a well-chosen chart shows at a glance what the data is trying to say.

Common chart types include bar charts, histograms, and scatter plots, each suited to different kinds of data. Python has a large number of libraries for data visualization and for creating informative, attractive graphs and plots. HoloViews is one such library; it lets us create informative and insightful visualizations in a few lines of code.

HoloViews is an open-source Python library that makes data visualization easier. It focuses on conveying what the data is trying to tell rather than on the mechanics of drawing a plot: we annotate the data briefly and HoloViews renders it. HoloViews is built on NumPy and Param, and for rendering it supports ‘Bokeh’ and ‘Matplotlib’.



In this article, we will see how we can create different types of visualizations using HoloViews and how we can customize them according to our requirements.

Implementation:

Like any other Python library, HoloViews and its dependencies can be installed using pip install holoviews. Pandas and Seaborn, which this article also uses, can be installed the same way.

  1. Importing Required Libraries

In this article, we will use two different datasets for different visualizations. To load the datasets we will import Pandas and Seaborn, and for visualization we will import HoloViews.

import pandas as pd

import holoviews as hv  

from holoviews import opts

import seaborn as sns

hv.extension('bokeh', 'matplotlib') #extensions used for visualization

  2. Loading the Dataset

For visualization using HoloViews, we will use two datasets. The first is an Advertising dataset of an MNC, which contains attributes such as ‘Sales’, ‘Newspaper’, ‘Radio’ and ‘TV’. The second is the ‘Tips’ sample dataset that ships with Seaborn, which contains attributes of restaurant billing history such as ‘total_bill’ and ‘tip’.

df = pd.read_csv('Advertising.csv')

df.head()

df1 = sns.load_dataset('tips')

df1.head()
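If Advertising.csv is not at hand, or the Seaborn sample data cannot be downloaded, the rest of the article can still be tried on stand-in data. The sketch below builds two small DataFrames with the same column names used in this article; all values are randomly generated and purely illustrative, not the real datasets.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)
n = 50

# Synthetic stand-in for Advertising.csv: same column names, made-up values.
df = pd.DataFrame({
    'TV': rng.uniform(0, 300, n).round(1),
    'Radio': rng.uniform(0, 50, n).round(1),
    'Newspaper': rng.uniform(0, 100, n).round(1),
})
df['Sales'] = (0.05 * df['TV'] + 0.1 * df['Radio'] + rng.normal(0, 1, n)).round(1)

# Synthetic stand-in for the 'tips' dataset (only the columns used here).
df1 = pd.DataFrame({
    'total_bill': rng.uniform(5, 50, n).round(2),
    'sex': rng.choice(['Male', 'Female'], n),
    'smoker': rng.choice(['Yes', 'No'], n),
    'day': rng.choice(['Thur', 'Fri', 'Sat', 'Sun'], n),
})
tip = 0.15 * df1['total_bill'] + rng.normal(0, 0.5, n)
df1['tip'] = tip.clip(lower=0).round(2)

print(df.head())
print(df1.head())
```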

  3. Creating Visualizations

We will start by creating visualizations for the advertising dataset, and after that we will create some more advanced visualizations using the tips dataset.

     1. Scatter Plot

We will draw scatter plots of our target variable, ‘Sales’, against each of the feature variables. We start by listing the feature variables and then create a HoloViews Dataset from the DataFrame, with ‘Sales’ as the key dimension and the features as value dimensions.

vdims = ['Newspaper', 'Radio', 'TV']

ds = hv.Dataset(df, ['Sales'], vdims)

ds

Now we will use this dataset to create the visualization.

layout1= (ds.to(hv.Scatter, 'Sales', 'Newspaper') + ds.to(hv.Scatter, 'Sales', 'TV') +  ds.to(hv.Scatter, 'Sales', 'Radio')).cols(2)

layout1.opts(opts.Scatter(width=400, height=250))

For this visualization we used the “cols” method, which sets the maximum number of columns in the layout; with three plots and cols(2), the graphs are arranged in two rows. The plots are rendered with Bokeh, as the Bokeh logo at the top right corner shows, so they are interactive and visually appealing.

     2. Bar Plot

We will create a Bar Plot of ‘Sales’ and ‘Radio’. 

layout2 = (ds.to(hv.Bars, 'Sales', 'Radio'))

layout2.opts(opts.Bars(width=900, height=400))

     3. Distribution Plots

We will create a distribution plot for each attribute so that we can see how its values are distributed.

distribution = (hv.Distribution(ds, ['Sales']) + hv.Distribution(ds, ['TV']) + hv.Distribution(ds, ['Newspaper'])+hv.Distribution(ds, ['Radio'])).cols(2)

distribution.opts(opts.Distribution(width=400, height=250))

     4. Box Plot

Now we will use the second dataset, i.e. the tips dataset, and create some more advanced statistical charts such as a box plot.

title = 'Total Bill according to Gender'

box = hv.BoxWhisker(df1, ['sex'], 'total_bill', label=title)

box.opts(width=600, cmap='Set1')

The opts method, as seen in the code, sets the height and width of the plots and can also be used to change other properties, such as the colormap.

     5. Violin Plots

We will create violin plots of ‘tip’ grouped by ‘day’ and by ‘smoker’ in the tips dataset and visualize them.

violin= (hv.Violin(df1, ['day'], 'tip', label='Tip according to Day') + hv.Violin(df1, ['smoker'], 'tip', label='Tip according to Smokers')).cols(2)

violin.opts(opts.Violin(width=600))
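Box and violin plots summarize a numeric column per group, so it can help to check the same summary numerically. The sketch below uses Pandas on a few made-up rows with the same column names as the tips dataset; the values are hypothetical and not taken from the real data.

```python
import pandas as pd

# Made-up rows with the same column names as the 'tips' dataset.
tips = pd.DataFrame({
    'sex': ['Male', 'Female', 'Male', 'Female', 'Male', 'Female'],
    'day': ['Sat', 'Sat', 'Sun', 'Sun', 'Sat', 'Sun'],
    'total_bill': [20.0, 10.0, 30.0, 14.0, 25.0, 18.0],
    'tip': [3.0, 1.5, 5.0, 2.0, 4.0, 3.0],
})

# Quartiles per group: the box edges and median line of a box plot.
bill_quartiles = (
    tips.groupby('sex')['total_bill']
        .quantile([0.25, 0.5, 0.75])
        .unstack()
)
print(bill_quartiles)

# Count, mean and range of tips per day, the spread a violin plot shows.
tip_summary = tips.groupby('day')['tip'].describe()
print(tip_summary[['count', 'mean', 'min', 'max']])
```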

Conclusion:

In this article, we saw how easily we can create visualizations using HoloViews. We started with some basic visualizations and then created some more advanced ones. We also saw how to change the size and color of the graphs using the opts method. In the same way, we can create other types of visualizations with HoloViews and explore different datasets through interactive and visually appealing graphs and plots.
