Seaborn Sample Project

# Imports
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

# Load built-in iris dataset
iris = sns.load_dataset("iris")
iris.head()

Describe()

describe() is a very useful pandas method: it generates descriptive statistics that summarize the central tendency, dispersion, and shape of a dataset's distribution, excluding NaN values.

iris.describe()
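
To illustrate the NaN handling mentioned above, here is a minimal sketch on a hypothetical toy frame (the name toy and its values are invented for this example and are not part of the datasets used in this notebook). The count row should reflect only the non-missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not from the notebook's datasets); one age is missing
toy = pd.DataFrame({
    "age": [20.0, 30.0, np.nan, 50.0],
    "armed": ["gun", "knife", "gun", "unarmed"],
})

# By default only numeric columns are summarized, and NaN values are skipped
numeric_summary = toy.describe()
print(numeric_summary)

# include="all" also summarizes non-numeric columns (count, unique, top, freq)
full_summary = toy.describe(include="all")
print(full_summary)
```

Passing include="all" is one way to get a quick look at categorical columns such as species, which the default call leaves out.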

Swarm Plot

sns.set()
%matplotlib inline

sns.swarmplot(x="species", y="petal_length", data=iris)

Load fatal police shootings data

df = pd.read_csv("fatal-police-shootings-data.csv", encoding="windows-1252")
df.head(10)

df.describe()

Strip Plot

This is a strip plot: a scatter plot in which one of the variables is categorical, which makes it well suited to categorical data.

sns.stripplot(x="armed", y="age", data=df)
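
A column such as armed may hold many distinct values, which can crowd the x axis. One possible approach, sketched here on a hypothetical stand-in frame (toy and its values are made up, and keeping only the two most frequent categories is an arbitrary choice for illustration), is to filter to the most common categories before plotting.

```python
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the shootings data (values invented for illustration)
toy = pd.DataFrame({
    "armed": ["gun", "gun", "gun", "knife", "knife", "unarmed", "toy weapon"],
    "age": [53, 47, 22, 39, 18, 23, 32],
})

# Keep only the two most frequent categories of the categorical column
top = toy["armed"].value_counts().nlargest(2).index
subset = toy[toy["armed"].isin(top)]

# Same call as in the notebook, applied to the filtered frame
ax = sns.stripplot(x="armed", y="age", data=subset)
```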

tips = sns.load_dataset("tips")
tips.head(10)

tips.describe()

Bar Plot

sns.barplot(x="day", y="total_bill", data=tips)
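
By default, sns.barplot draws the mean of the y variable for each category, with error bars showing a confidence interval around it. The sketch below reproduces the bar heights with plain pandas on a hypothetical miniature frame (toy_tips and its values are invented for illustration; they are not the real tips data).

```python
import pandas as pd

# Hypothetical miniature stand-in for the tips data (values invented)
toy_tips = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Sat", "Sat", "Sat"],
    "total_bill": [10.0, 20.0, 15.0, 30.0, 20.0, 10.0],
})

# Mean total_bill per day: the quantity a default bar plot would show as bar height
mean_bill = toy_tips.groupby("day")["total_bill"].mean()
print(mean_bill)
```

Running the same groupby on the real tips frame is a quick way to read off the exact values behind each bar.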

Styling with Seaborn

Seaborn splits Matplotlib parameters into two independent groups: the first sets the aesthetic style of the plot, and the second scales various elements of the figure so that it can be easily incorporated into different contexts. Seaborn doesn’t take anything away from Matplotlib; rather, it adds pleasant default aesthetics and built-in plots that complement, and sometimes replace, the complicated Matplotlib code that practitioners would otherwise need to write. Facet plots and regression plots are examples of this.
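
The two parameter groups can be inspected directly. As a minimal sketch, sns.axes_style returns the style parameters and sns.plotting_context returns the scaling parameters, each as a dictionary of Matplotlib rc settings; the variable names below are only illustrative.

```python
import seaborn as sns

# Style group: colors, grid visibility, spines, tick direction
style_params = sns.axes_style("whitegrid")

# Context group: font sizes, line widths, marker sizes
paper_params = sns.plotting_context("paper")
poster_params = sns.plotting_context("poster")

print(style_params["axes.grid"])
print(paper_params["font.size"], poster_params["font.size"])
```

Printing the full dictionaries shows which rc keys belong to which group, which helps when overriding individual values later.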

sns.set_style("whitegrid")
sns.boxplot(x="day", y="total_bill", data=tips)

sns.set_style("ticks")
sns.boxplot(x="day", y="total_bill", data=tips)

sns.set_style("white")
sns.boxplot(x="day", y="total_bill", data=tips)

sns.set_style("dark")
sns.boxplot(x="day", y="total_bill", data=tips)

sns.set_style("ticks")
sns.boxplot(x="day", y="total_bill", data=tips)
sns.despine()

sns.set_style("ticks")
sns.boxplot(x="day", y="total_bill", data=tips)
sns.despine(left=True)
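
To see what the two despine calls above change, the following sketch checks which spines remain visible afterwards. It uses a simple line plot on two hypothetical axes (ax_default and ax_left are names chosen for this example) rather than the box plot.

```python
import matplotlib.pyplot as plt
import seaborn as sns

fig, (ax_default, ax_left) = plt.subplots(1, 2)
for ax in (ax_default, ax_left):
    ax.plot([0, 1, 2], [3, 1, 2])

# Default: removes the top and right spines
sns.despine(ax=ax_default)
# left=True additionally removes the left spine
sns.despine(ax=ax_left, left=True)

def visible_spines(ax):
    """Return the sorted names of the spines that are still drawn."""
    return sorted(name for name, spine in ax.spines.items() if spine.get_visible())

print(visible_spines(ax_default))
print(visible_spines(ax_left))
```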

Visualize two types of background in a single plot

# This function will help us plot some offset sine waves
def sinplot(flip=1):
    x = np.linspace(0, 14, 100)
    for i in range(1, 7):
        plt.plot(x, np.sin(x + i * 0.5) * (7 - i) * flip)
        
with sns.axes_style("darkgrid"):
    plt.subplot(211)
    sinplot()
plt.subplot(212)
sinplot(-1)
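
The reason the two subplots end up with different backgrounds is that sns.axes_style, used as a context manager, changes the style only temporarily: axes created inside the with block pick up darkgrid, and the previous settings return afterwards. A minimal sketch of that behavior, assuming the "white" style is active beforehand (an arbitrary choice for this example):

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Assumed starting style for this illustration
sns.set_style("white")
before = plt.rcParams["axes.grid"]

with sns.axes_style("darkgrid"):
    # Inside the block the darkgrid settings are active
    inside = plt.rcParams["axes.grid"]

# After the block the earlier style is restored
after = plt.rcParams["axes.grid"]

print(before, inside, after)
```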

Scaling of plot elements

sns.set()
sns.set_context("paper")
sns.set_style("whitegrid")
sns.boxplot(x="day", y="total_bill", data=tips)

You may be thinking that this plot has not been scaled at all, since it looks similar to our previous outputs. To clarify: Jupyter Notebook scales down large images in the cell output, so past a certain size the figure is rescaled automatically. For exploratory analysis, we prefer to iterate quickly over a number of different analyses, and it’s more useful to have facets of similar size than overall figures of the same size in a particular context. When we need a figure that is exactly a certain size overall, ideally we:

Know precisely what we want, and
Can afford to take some time to work through the calculations

That said, if we plot the same figure in an editor such as Anaconda Spyder or JetBrains’ PyCharm or IntelliJ, we can view it at its original size. The takeaway from this scaling segment is that adding a single line of code can set the image size to suit our requirements, and we can experiment accordingly. In practice, we can also pass a dictionary of parameters via rc for finer control over the aesthetics. Here is an example using the same sinplot function we defined earlier:
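
The text does not say which line of code controls the image size. One common option, shown here as an assumption rather than as the author's method, is Matplotlib's figsize argument, which sets the figure's width and height in inches.

```python
import matplotlib.pyplot as plt

# figsize is (width, height) in inches; the values here are arbitrary
fig, ax = plt.subplots(figsize=(8, 4))
ax.plot([0, 1, 2], [0, 1, 4])

# Confirm the size the figure was created with
width, height = fig.get_size_inches()
print(width, height)
```

Seaborn's axes-level functions such as sns.boxplot accept an ax argument, so a figure sized this way can be passed to them directly.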

sns.set(style="whitegrid", rc={"grid.linewidth": 1.5})
sns.set_context("poster", font_scale=2.5, rc={"lines.linewidth": 5.0})
sinplot()

Although the notebook did not display an enlarged (scaled) plot, the figure was created in memory according to our instructions. The lines are thicker because linewidth was set to 5, and the text on the axes is larger because of font_scale. We generally don't need more than this during data analysis, although exceptional scenarios may call for a few more parameters, which we will cover in the next article of this series.
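
One way to confirm that the settings took effect, even when the notebook output looks unchanged, is to read them back from Matplotlib's rcParams. The sketch below repeats the two calls from the cell above; the variable names are only illustrative.

```python
import matplotlib.pyplot as plt
import seaborn as sns

sns.set(style="whitegrid", rc={"grid.linewidth": 1.5})
baseline_font_size = plt.rcParams["font.size"]

sns.set_context("poster", font_scale=2.5, rc={"lines.linewidth": 5.0})
scaled_font_size = plt.rcParams["font.size"]
line_width = plt.rcParams["lines.linewidth"]

print(baseline_font_size, scaled_font_size, line_width)

# The context carries its own line-width settings, so it is worth printing
# grid.linewidth to check whether the 1.5 passed to sns.set() is still in effect
print(plt.rcParams["grid.linewidth"])
```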

Output of iris.head():
	sepal_length	sepal_width	petal_length	petal_width	species
0	5.1	3.5	1.4	0.2	setosa
1	4.9	3.0	1.4	0.2	setosa
2	4.7	3.2	1.3	0.2	setosa
3	4.6	3.1	1.5	0.2	setosa
4	5.0	3.6	1.4	0.2	setosa

Output of iris.describe():
	sepal_length	sepal_width	petal_length	petal_width
count	150.000000	150.000000	150.000000	150.000000
mean	5.843333	3.057333	3.758000	1.199333
std	0.828066	0.435866	1.765298	0.762238
min	4.300000	2.000000	1.000000	0.100000
25%	5.100000	2.800000	1.600000	0.300000
50%	5.800000	3.000000	4.350000	1.300000
75%	6.400000	3.300000	5.100000	1.800000
max	7.900000	4.400000	6.900000	2.500000

Output of df.head(10):
	id	name	date	manner_of_death	armed	age	gender	race	city	state	signs_of_mental_illness	threat_level	flee	body_camera
0	3	Tim Elliot	2015-01-02	shot	gun	53.0	M	A	Shelton	WA	True	attack	Not fleeing	False
1	4	Lewis Lee Lembke	2015-01-02	shot	gun	47.0	M	W	Aloha	OR	False	attack	Not fleeing	False
2	5	John Paul Quintero	2015-01-03	shot and Tasered	unarmed	23.0	M	H	Wichita	KS	False	other	Not fleeing	False
3	8	Matthew Hoffman	2015-01-04	shot	toy weapon	32.0	M	W	San Francisco	CA	True	attack	Not fleeing	False
4	9	Michael Rodriguez	2015-01-04	shot	nail gun	39.0	M	H	Evans	CO	False	attack	Not fleeing	False
5	11	Kenneth Joe Brown	2015-01-04	shot	gun	18.0	M	W	Guthrie	OK	False	attack	Not fleeing	False
6	13	Kenneth Arnold Buck	2015-01-05	shot	gun	22.0	M	H	Chandler	AZ	False	attack	Car	False
7	15	Brock Nichols	2015-01-06	shot	gun	35.0	M	W	Assaria	KS	False	attack	Not fleeing	False
8	16	Autumn Steele	2015-01-06	shot	unarmed	34.0	F	W	Burlington	IA	False	other	Not fleeing	True
9	17	Leslie Sapp III	2015-01-06	shot	toy weapon	47.0	M	B	Knoxville	PA	False	attack	Not fleeing	False

Output of df.describe():
	id	age
count	4022.000000	3870.000000
mean	2252.446295	36.884496
std	1259.731835	13.126454
min	3.000000	6.000000
25%	1162.250000	27.000000
50%	2241.500000	35.000000
75%	3346.750000	45.000000
max	4432.000000	91.000000

Output of tips.head(10):
	total_bill	tip	sex	smoker	day	time	size
0	16.99	1.01	Female	No	Sun	Dinner	2
1	10.34	1.66	Male	No	Sun	Dinner	3
2	21.01	3.50	Male	No	Sun	Dinner	3
3	23.68	3.31	Male	No	Sun	Dinner	2
4	24.59	3.61	Female	No	Sun	Dinner	4
5	25.29	4.71	Male	No	Sun	Dinner	4
6	8.77	2.00	Male	No	Sun	Dinner	2
7	26.88	3.12	Male	No	Sun	Dinner	4
8	15.04	1.96	Male	No	Sun	Dinner	2
9	14.78	3.23	Male	No	Sun	Dinner	2

Output of tips.describe():
	total_bill	tip	size
count	244.000000	244.000000	244.000000
mean	19.785943	2.998279	2.569672
std	8.902412	1.383638	0.951100
min	3.070000	1.000000	1.000000
25%	13.347500	2.000000	2.000000
50%	17.795000	2.900000	2.000000
75%	24.127500	3.562500	3.000000
max	50.810000	10.000000	6.000000