SEABORN
The examples below use the two datasets previewed in the snippets below.
In [3]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
census_data = pd.read_csv('census_data.csv')
census_data.head(4)
# census_data.columns
Out[3]:
   CensusId    State   County  TotalPop    Men  Women  Hispanic  White  Black  Native  ...  Walk  OtherTransp  WorkAtHome  MeanCommute  Employed  PrivateWork  PublicWork  SelfEmployed  FamilyWork  Unemployment
0      1001  Alabama  Autauga     55221  26745  28476       2.6   75.8   18.5     0.4  ...   0.5          1.3         1.8         26.5     23986         73.6        20.9           5.5         0.0           7.6
1      1003  Alabama  Baldwin    195121  95314  99807       4.5   83.1    9.5     0.6  ...   1.0          1.4         3.9         26.4     85953         81.5        12.3           5.8         0.4           7.5
2      1005  Alabama  Barbour     26932  14497  12435       4.6   46.2   46.7     0.2  ...   1.8          1.5         1.6         24.1      8597         71.8        20.8           7.3         0.1          17.6
3      1007  Alabama     Bibb     22604  12073  10531       2.2   74.5   21.4     0.4  ...   0.6          1.5         0.7         28.8      8294         76.8        16.1           6.7         0.4           8.3

4 rows × 37 columns



In [1]:
data.head()
Out[1]:
   room_id  host_id        room_type   neighborhood  reviews  overall_satisfaction  accommodates  bedrooms  price  minstay
0     5453     8021     Private room  Jamaica Plain       53                   5.0             2       1.0  171.0        1
1     5506     8229     Private room        Roxbury       30                   4.5             2       1.0  165.0        3
2     6695     8229  Entire home/apt        Roxbury       39                   5.0             4       1.0  222.0        3
3     6976    16701     Private room     Roslindale       26                   5.0             2       1.0   74.0        1
4     8789    26988  Entire home/apt       Downtown        1                   5.0             2       1.0  165.0        5


Categorical Plots


Bar Plot

Bar plots are a simple way to see how different categories stack up against each other based on a column value.

# Bar Plot
# average of every numeric column for each neighborhood
by_neighborhood = data.groupby('neighborhood').mean(numeric_only=True)
plt.figure(figsize=(8,4))
# recent seaborn releases require x and y to be passed by keyword
axis = sns.barplot(x=by_neighborhood.index, y=by_neighborhood['price'])
axis.set_xticklabels(axis.get_xticklabels(), rotation=40, ha="right")
plt.tight_layout()
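
As a variation on the block above, barplot can also take the raw rows and do the averaging itself, since its default estimator is the mean. The sketch below assumes a small made-up `listings` DataFrame (hypothetical values, not taken from the dataset above) and a non-interactive matplotlib backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy listings, not taken from the dataset above
listings = pd.DataFrame({
    "neighborhood": ["Roxbury", "Roxbury", "Fenway", "Fenway", "Downtown"],
    "price": [100.0, 200.0, 300.0, 400.0, 250.0],
})

# The per-neighborhood averages that the bar heights should match
mean_price = listings.groupby("neighborhood")["price"].mean()
print(mean_price)

plt.figure(figsize=(8, 4))
# barplot aggregates the raw rows itself; the default estimator is the mean
axis = sns.barplot(x="neighborhood", y="price", data=listings)
plt.setp(axis.get_xticklabels(), rotation=40, ha="right")
plt.tight_layout()
```

Passing the raw rows also lets seaborn draw error bars around each mean, which a pre-aggregated table cannot provide.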

Point Plot

A point plot marks the average price of each neighborhood with a point and connects the points with lines, which makes differences between categories easy to compare.

# point plot
# average price per neighborhood, taken from the by_neighborhood table
plt.figure(figsize=(7,4))
# recent seaborn releases require x and y to be passed by keyword
axis = sns.pointplot(x=by_neighborhood.index, y=by_neighborhood['price'])
axis.set_xticklabels(axis.get_xticklabels(), rotation=40, ha="right")
plt.tight_layout()

Box Plot

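A box plot shows the distribution of prices within each neighborhood. The box spans the first to the third quartile, the line inside it marks the median (not the average), and points beyond the whiskers are drawn as outliers, so it gives an idea of where most prices fall and which listings are unusual.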

# box plot
plt.figure(figsize=(8,6))
axis = sns.boxplot(x='price', y='neighborhood', data=data) # hue='room_type'
axis.set_yticklabels(axis.get_yticklabels(), rotation=30, ha="right")
plt.tight_layout()
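
To make the parts of a box concrete, here is a minimal sketch of the numbers behind a single box, assuming the common 1.5 × IQR whisker rule (seaborn's default `whis=1.5`). The `prices` values are hypothetical and not taken from the dataset above.

```python
import pandas as pd

# Hypothetical prices for one neighborhood, not taken from the dataset above
prices = pd.Series([50, 60, 70, 80, 90, 100, 110, 400], dtype=float)

q1 = prices.quantile(0.25)      # lower edge of the box
median = prices.quantile(0.50)  # line inside the box
q3 = prices.quantile(0.75)      # upper edge of the box
iqr = q3 - q1                   # interquartile range (height of the box)

# Whiskers reach the most extreme points still inside these fences
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Anything outside the fences is drawn as an individual outlier point
outliers = prices[(prices < lower_fence) | (prices > upper_fence)]
print(q1, median, q3, outliers.tolist())
```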

Distribution Plots


Dist Plot

As its name implies, a dist plot shows the overall distribution of a single column. Here is the distribution of the price column.

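# dist plot
plt.figure(figsize=(8,4))
# distplot is deprecated in recent seaborn releases; histplot with kde=True draws a
# histogram with a kernel density estimate overlaid (kde=False shows the histogram only)
axis = sns.histplot(data['price'], kde=True)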
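
For intuition about what the bars mean when a histogram is normalized to a density (for example with `stat="density"` in `sns.histplot`, which is an assumption about the settings you choose), the sketch below computes such a histogram with NumPy and checks that the bar areas sum to 1. The `prices` values are randomly generated stand-ins, not the dataset above.

```python
import numpy as np

# Hypothetical prices drawn at random, standing in for data['price']
rng = np.random.default_rng(0)
prices = rng.normal(loc=150.0, scale=40.0, size=500)

# density=True rescales the counts so that the total bar area equals 1
density, edges = np.histogram(prices, bins=20, density=True)
widths = np.diff(edges)

# Area of each bar is height * width; together they integrate to 1
area = float(np.sum(density * widths))
print(round(area, 6))
```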

Regression Plots


Reg Plot

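A reg plot draws a scatter plot of two columns and overlays a fitted linear regression line, with a shaded band showing the confidence interval of the fit.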

plt.figure(figsize=(8,4))
axis = sns.regplot(x='Poverty', y='Unemployment', data=census_data)
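
The straight line in a reg plot is an ordinary least-squares fit. As a rough sketch of the same calculation outside seaborn, the snippet below fits a degree-1 polynomial with NumPy. The `poverty` and `unemployment` arrays are hypothetical stand-ins for the census columns, not real values.

```python
import numpy as np

# Hypothetical values standing in for the Poverty and Unemployment columns
poverty = np.array([10.0, 12.0, 15.0, 20.0, 25.0])
unemployment = np.array([5.2, 5.9, 7.4, 10.1, 12.4])

# Least-squares straight line: unemployment ≈ slope * poverty + intercept
slope, intercept = np.polyfit(poverty, unemployment, deg=1)

# Points on the fitted line, one per observation
predicted = slope * poverty + intercept

# Correlation coefficient as a quick measure of how well a line fits
r = np.corrcoef(poverty, unemployment)[0, 1]
print(slope, intercept, r)
```

The slope reads as the expected change in unemployment for a one-point increase in poverty, under the assumption that a linear relationship is reasonable for the data.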

Interactive Shell

Here is an interactive shell for trying out the methods discussed above on a DataFrame.

# this gets executed each time the exercise is initialized
import pandas as pd
data = pd.DataFrame({
    'price': {'35': 126.0, '64': 394.0, '109': 108.0, '117': 74.0, '146': 171.0, '148': 314.0, '165': 74.0, '262': 56.0, '281': 165.0, '311': 102.0, '325': 23.0, '343': 86.0, '356': 191.0, '358': 102.0, '405': 171.0, '414': 86.0, '426': 124.0, '449': 66.0, '483': 80.0, '511': 399.0, '517': 171.0, '529': 63.0, '555': 97.0, '585': 66.0, '586': 257.0, '623': 80.0, '643': 257.0, '649': 365.0, '656': 131.0, '666': 86.0, '726': 285.0, '733': 365.0, '741': 68.0, '747': 113.0, '749': 131.0, '764': 366.0, '768': 371.0, '816': 204.0, '883': 204.0, '914': 799.0, '929': 257.0, '1030': 80.0, '1038': 171.0, '1112': 183.0, '1143': 143.0, '1158': 148.0, '1166': 86.0, '1173': 97.0, '1175': 160.0, '1208': 227.0},
    'host_id': {'35': 85770, '64': 25188, '109': 1092168, '117': 1176995, '146': 1444340, '148': 411715, '165': 2053557, '262': 4480671, '281': 1590532, '311': 848706, '325': 1651480, '343': 2254462, '356': 1444340, '358': 6549153, '405': 7658308, '414': 7673571, '426': 8085264, '449': 6608084, '483': 9287212, '511': 9617517, '517': 6469268, '529': 9663343, '555': 1037913, '585': 2621102, '586': 10750832, '623': 4232125, '643': 10250257, '649': 12220952, '656': 5573822, '666': 3875741, '726': 12985903, '733': 13929879, '741': 14050476, '747': 324630, '749': 1362285, '764': 25188, '768': 50866, '816': 3629953, '883': 13325723, '914': 10502779, '929': 7629859, '1030': 18165984, '1038': 8812693, '1112': 9258077, '1143': 7000428, '1158': 1805868, '1166': 1480518, '1173': 1805868, '1175': 2830216, '1208': 21063555},
    'minstay': {'35': 1, '64': 4, '109': 2, '117': 2, '146': 2, '148': 1, '165': 3, '262': 7, '281': 2, '311': 1, '325': 1, '343': 1, '356': 1, '358': 1, '405': 1, '414': 2, '426': 1, '449': 1, '483': 5, '511': 2, '517': 2, '529': 1, '555': 2, '585': 1, '586': 1, '623': 2, '643': 3, '649': 1, '656': 1, '666': 1, '726': 1, '733': 2, '741': 2, '747': 1, '749': 1, '764': 7, '768': 2, '816': 1, '883': 2, '914': 4, '929': 2, '1030': 1, '1038': 1, '1112': 1, '1143': 3, '1158': 2, '1166': 1, '1173': 1, '1175': 1, '1208': 2},
    'reviews': {'35': 123, '64': 4, '109': 52, '117': 19, '146': 22, '148': 30, '165': 101, '262': 63, '281': 16, '311': 23, '325': 12, '343': 12, '356': 18, '358': 29, '405': 25, '414': 24, '426': 47, '449': 47, '483': 1, '511': 1, '517': 1, '529': 63, '555': 8, '585': 6, '586': 11, '623': 7, '643': 1, '649': 5, '656': 3, '666': 11, '726': 8, '733': 7, '741': 12, '747': 42, '749': 1, '764': 1, '768': 1, '816': 12, '883': 7, '914': 3, '929': 13, '1030': 4, '1038': 2, '1112': 5, '1143': 1, '1158': 1, '1166': 5, '1173': 2, '1175': 3, '1208': 2},
    'room_id': {'35': 22354, '64': 54215, '109': 211921, '117': 225834, '146': 349347, '148': 350205, '165': 447826, '262': 856876, '281': 935554, '311': 1067184, '325': 1106555, '343': 1166808, '356': 1197857, '358': 1198779, '405': 1422837, '414': 1471308, '426': 1514227, '449': 1584362, '483': 1767672, '511': 1840255, '517': 1860782, '529': 1884045, '555': 1977951, '585': 2088320, '586': 2108738, '623': 2268196, '643': 2376518, '649': 2392404, '656': 2426468, '666': 2473997, '726': 2694019, '733': 2722165, '741': 2754149, '747': 2776143, '749': 2777752, '764': 2821921, '768': 2831504, '816': 2979108, '883': 3244362, '914': 3351728, '929': 3377100, '1030': 3678429, '1038': 3704801, '1112': 3894320, '1143': 3938428, '1158': 3969867, '1166': 3987926, '1173': 3997572, '1175': 4004152, '1208': 4061059},
    'bedrooms': {'35': 1.0, '64': 2.0, '109': 1.0, '117': 1.0, '146': 1.0, '148': 1.0, '165': 1.0, '262': 1.0, '281': 1.0, '311': 1.0, '325': 1.0, '343': 1.0, '356': 1.0, '358': 1.0, '405': 0.0, '414': 1.0, '426': 1.0, '449': 1.0, '483': 1.0, '511': 2.0, '517': 0.0, '529': 1.0, '555': 1.0, '585': 1.0, '586': 0.0, '623': 1.0, '643': 1.0, '649': 2.0, '656': 0.0, '666': 1.0, '726': 1.0, '733': 2.0, '741': 1.0, '747': 0.0, '749': 1.0, '764': 2.0, '768': 2.0, '816': 1.0, '883': 1.0, '914': 5.0, '929': 1.0, '1030': 1.0, '1038': 1.0, '1112': 1.0, '1143': 1.0, '1158': 2.0, '1166': 1.0, '1173': 1.0, '1175': 1.0, '1208': 0.0},
    'room_type': {'35': 'Private room', '64': 'Entire home/apt', '109': 'Entire home/apt', '117': 'Private room', '146': 'Entire home/apt', '148': 'Entire home/apt', '165': 'Private room', '262': 'Shared room', '281': 'Private room', '311': 'Private room', '325': 'Shared room', '343': 'Shared room', '356': 'Entire home/apt', '358': 'Private room', '405': 'Entire home/apt', '414': 'Private room', '426': 'Private room', '449': 'Private room', '483': 'Private room', '511': 'Entire home/apt', '517': 'Entire home/apt', '529': 'Private room', '555': 'Private room', '585': 'Private room', '586': 'Entire home/apt', '623': 'Private room', '643': 'Entire home/apt', '649': 'Entire home/apt', '656': 'Entire home/apt', '666': 'Private room', '726': 'Entire home/apt', '733': 'Entire home/apt', '741': 'Private room', '747': 'Entire home/apt', '749': 'Private room', '764': 'Entire home/apt', '768': 'Entire home/apt', '816': 'Entire home/apt', '883': 'Private room', '914': 'Entire home/apt', '929': 'Entire home/apt', '1030': 'Shared room', '1038': 'Entire home/apt', '1112': 'Entire home/apt', '1143': 'Private room', '1158': 'Entire home/apt', '1166': 'Private room', '1173': 'Private room', '1175': 'Entire home/apt', '1208': 'Entire home/apt'},
    'accommodates': {'35': 1, '64': 4, '109': 5, '117': 2, '146': 4, '148': 5, '165': 2, '262': 2, '281': 2, '311': 2, '325': 16, '343': 2, '356': 4, '358': 3, '405': 2, '414': 2, '426': 2, '449': 3, '483': 1, '511': 4, '517': 2, '529': 2, '555': 4, '585': 1, '586': 2, '623': 2, '643': 4, '649': 6, '656': 2, '666': 3, '726': 3, '733': 4, '741': 2, '747': 4, '749': 4, '764': 4, '768': 6, '816': 2, '883': 2, '914': 8, '929': 2, '1030': 1, '1038': 4, '1112': 2, '1143': 2, '1158': 3, '1166': 2, '1173': 2, '1175': 2, '1208': 6},
    'neighborhood': {'35': 'South End', '64': 'Fenway', '109': 'Roslindale', '117': 'Roslindale', '146': 'South End', '148': 'Fenway', '165': 'Jamaica Plain', '262': 'Fenway', '281': 'South End', '311': 'Jamaica Plain', '325': 'South End', '343': 'South End', '356': 'Roxbury', '358': 'South Boston', '405': 'South End', '414': 'Mission Hill', '426': 'East Boston', '449': 'Dorchester', '483': 'Jamaica Plain', '511': 'Fenway', '517': 'Longwood Medical Area', '529': 'Roxbury', '555': 'Roxbury', '585': 'Jamaica Plain', '586': 'Back Bay', '623': 'East Boston', '643': 'North End', '649': 'Charlestown', '656': 'Beacon Hill', '666': 'Roxbury', '726': 'South End', '733': 'Chinatown', '741': 'Mission Hill', '747': 'Beacon Hill', '749': 'West Roxbury', '764': 'Downtown', '768': 'North End', '816': 'Fenway', '883': 'South End', '914': 'Beacon Hill', '929': 'Beacon Hill', '1030': 'Dorchester', '1038': 'Back Bay', '1112': 'South Boston', '1143': 'Beacon Hill', '1158': 'Jamaica Plain', '1166': 'Jamaica Plain', '1173': 'Jamaica Plain', '1175': 'South End', '1208': 'Allston'},
    'overall_satisfaction': {'35': 4.5, '64': 4.0, '109': 5.0, '117': 5.0, '146': 4.5, '148': 5.0, '165': 5.0, '262': 4.5, '281': 5.0, '311': 4.5, '325': 5.0, '343': 4.5, '356': 4.5, '358': 4.0, '405': 4.5, '414': 4.0, '426': 5.0, '449': 5.0, '483': 5.0, '511': 5.0, '517': 5.0, '529': 4.5, '555': 4.5, '585': 5.0, '586': 4.5, '623': 5.0, '643': 5.0, '649': 5.0, '656': 4.5, '666': 4.5, '726': 5.0, '733': 4.5, '741': 5.0, '747': 4.0, '749': 4.0, '764': 5.0, '768': 5.0, '816': 5.0, '883': 4.5, '914': 5.0, '929': 5.0, '1030': 5.0, '1038': 4.5, '1112': 4.5, '1143': 4.0, '1158': 5.0, '1166': 5.0, '1173': 5.0, '1175': 4.5, '1208': 5.0},
})
# dataframe name is data and only the first dataframe is available
# not needed right now