Physical Address
Haryana ,India

When analyzing real-world datasets, understanding the distribution of continuous variables is crucial. The histogram plot is one of the most powerful visual tools for this purpose. In this advanced guide, we will show you how to create, customize, and enhance histogram plots using Seaborn’s built-in penguins dataset. We will also explore all key parameters of the sns.histplot() function to take your visualizations to a professional and publication-ready level.
Let’s begin by loading Seaborn’s famous penguins dataset and preparing it for plotting.
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
# Load the built-in penguins dataset
df = sns.load_dataset("penguins")
# Drop missing values for clean plotting
df = df.dropna()
# Set a modern Seaborn theme
sns.set_theme(style="whitegrid", palette="deep", font_scale=1.1)
The penguins dataset contains measurements for different species of penguins including:
species (Adelie, Gentoo, Chinstrap)
island
bill_length_mm
bill_depth_mm
flipper_length_mm
body_mass_g
sex
We’ll use the continuous variables (bill length, bill depth, flipper length, and body mass) to create advanced histograms.
Let’s start with a simple histogram showing the distribution of penguin body masses.
sns.histplot(data=df, x="body_mass_g")
plt.title("Distribution of Penguin Body Mass")
plt.xlabel("Body Mass (g)")
plt.ylabel("Count")
plt.show()
This basic plot provides an overview of how penguin body mass is distributed, but we can do much more to refine and enrich the visualization.
Seaborn’s histplot() function is incredibly flexible.
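An abridged signature, omitting binrange and the options that only apply to bivariate histograms (thresh, pthresh, pmax, cbar, cbar_ax, cbar_kws), is: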
seaborn.histplot(
data=None, *,
x=None, y=None, hue=None, weights=None, stat='count', bins='auto',
binwidth=None, discrete=None, cumulative=False, common_bins=True,
common_norm=True, multiple='layer', element='bars', fill=True,
shrink=1, kde=False, kde_kws=None, line_kws=None, color=None,
palette=None, hue_order=None, hue_norm=None, log_scale=None,
legend=True, ax=None, **kwargs
)
Let’s build a comprehensive example using most of these parameters in a meaningful way.
We will visualize how penguin body mass is distributed across species and sex, adding a KDE curve, custom bins, and style enhancements.
plt.figure(figsize=(12,6))
sns.histplot(
data=df,
x="body_mass_g", # Continuous variable for the x-axis
hue="species", # Different species shown by color
multiple="stack", # Overlay style: stack, layer, dodge, fill
bins=25, # Number of bins
binwidth=None, # Can specify exact bin width instead of count
stat="density", # Options: count, frequency, probability, percent, density
kde=True, # Show smooth density curve
line_kws={"linewidth":2, "alpha":0.7}, # KDE line appearance (kde_kws only controls how the KDE is computed)
element="bars", # Draw as bars (can be 'bars', 'step', 'poly')
fill=True, # Fill bars with color
common_bins=True, # Use same bins for all subsets
common_norm=False, # Normalize densities independently for each hue
palette="Set2", # Color palette
shrink=0.9, # Reduce bar width for spacing
alpha=0.8, # Transparency
legend=True, # Display legend
linewidth=1.2, # Border width of bars
edgecolor="white", # Edge color for clean separation
)
plt.title("Distribution of Penguin Body Mass by Species", fontsize=14, fontweight="bold")
plt.xlabel("Body Mass (grams)")
plt.ylabel("Density")
sns.move_legend(plt.gca(), "upper left", bbox_to_anchor=(1.05, 1), title="Species") # plt.legend() would discard the hue legend that histplot draws
plt.tight_layout()
plt.show()
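To make the stat options in the call above more concrete, here is a minimal sketch of the arithmetic behind them, written in plain Python rather than taken from Seaborn's source. The function name normalize and the counts are hypothetical and chosen only for illustration. With common_norm=False, the same normalization would be applied to each species' counts separately.

```python
def normalize(counts, edges):
    """Convert raw bin counts into the other histogram statistics."""
    widths = [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]
    total = sum(counts)
    return {
        "count": list(counts),
        # Observations per unit of x
        "frequency": [c / w for c, w in zip(counts, widths)],
        # Bar heights sum to 1
        "probability": [c / total for c in counts],
        # Bar heights sum to 100
        "percent": [100 * c / total for c in counts],
        # Bar areas (height * width) sum to 1
        "density": [c / (total * w) for c, w in zip(counts, widths)],
    }

# Hypothetical counts for three 1000 g wide body-mass bins (not from the dataset)
edges = [3000, 4000, 5000, 6000]
counts = [5, 3, 2]
stats = normalize(counts, edges)

# Total area under the density bars
area = sum(d * (hi - lo) for d, lo, hi in zip(stats["density"], edges[:-1], edges[1:]))
print(stats["probability"])  # [0.5, 0.3, 0.2]
print(round(area, 10))       # 1.0
```

Because density divides by the bin width, its values look small when the x-axis is in grams. That is expected: it is the area of the bars, not their height, that sums to 1.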
hue="species" → Each species is drawn in its own color.
multiple="stack" → Bars are stacked on top of each other (use "dodge" for side-by-side comparison).
bins=25 → The histogram is divided into 25 bins.
stat='density' → Normalizes bar heights so that the bar areas sum to 1, showing the relative distribution rather than raw counts.
kde=True → Adds a smooth density estimate curve.
element='bars' → Bars as the visual element (alternatives: 'poly', 'step').
palette='Set2' → Soft color palette for a professional look.
fill=True → Filled bars for visual clarity.
shrink=0.9 → Slightly narrows bars to add spacing.
common_norm=False → Each species is normalized separately.
edgecolor='white' → Separates bars clearly for cleaner visuals.
linewidth=1.2 → Slight outline for better contrast.
Seaborn’s FacetGrid lets you draw a histogram for each combination of categories (e.g., species and sex) in one figure.
g = sns.FacetGrid(df, col="species", row="sex", margin_titles=True, height=3.5, aspect=1.3)
g.map_dataframe(
sns.histplot,
x="flipper_length_mm",
bins=20,
color="#1f77b4",
kde=True,
stat="count",
edgecolor="black",
alpha=0.8
)
g.set_axis_labels("Flipper Length (mm)", "Count")
g.figure.suptitle("Distribution of Flipper Length by Species and Sex", fontsize=15, fontweight="bold", y=1.05)
plt.show()
If your counts span several orders of magnitude, a logarithmic scale keeps the small bars visible, and custom bin edges let you place bin boundaries at meaningful values.
plt.figure(figsize=(10,5))
sns.histplot(
data=df,
x="flipper_length_mm",
hue="species",
bins=[170,180,190,200,210,220,230,240],
log_scale=(False, True), # Log scale for y-axis
stat="count",
multiple="dodge",
palette="coolwarm",
element="bars",
linewidth=1,
edgecolor="black"
)
plt.title("Flipper Length Distribution with Custom Bins", fontsize=13, fontweight="bold")
plt.xlabel("Flipper Length (mm)")
plt.ylabel("Count (Log Scale)")
plt.tight_layout()
plt.show()
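When you pass explicit bin edges, it helps to know how observations are assigned to them. The following plain-Python sketch mirrors the convention used by numpy.histogram (on which Seaborn's binning is built, to our knowledge): bins are half-open, the last bin also includes its right edge, and values outside the edges are silently left out. The helper bin_counts and the sample values are hypothetical.

```python
from bisect import bisect_right

def bin_counts(values, edges):
    """Count values per bin; return (counts, number of values outside the edges)."""
    counts = [0] * (len(edges) - 1)
    outside = 0
    for v in values:
        if v < edges[0] or v > edges[-1]:
            outside += 1  # falls outside the edges, so it is not counted in any bin
            continue
        # Bins are half-open [lo, hi); the last bin also includes its right edge
        i = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[i] += 1
    return counts, outside

edges = [170, 180, 190, 200, 210, 220, 230, 240]
# Hypothetical flipper lengths in mm (illustrative values, not the real data)
flippers = [168, 172, 181, 189, 190, 195, 210, 229, 240, 243]

counts, outside = bin_counts(flippers, edges)
print(counts)   # [1, 2, 2, 0, 1, 1, 1]
print(outside)  # 2
```

Two practical points follow from this. First, check that your edges cover the full data range, or some observations will be missing from the plot. Second, an empty bin such as 200 to 210 above has no bar on a log-scaled y-axis, because the logarithm of zero is undefined.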
bins=[...] → Defines exact bin edges manually.
log_scale=(False, True) → Applies log scale to y-axis only.
multiple='dodge' → Bars for different species shown side-by-side.
To further emphasize individual data points, overlay a rug plot.
plt.figure(figsize=(10,5))
sns.histplot(
data=df,
x="bill_depth_mm",
bins=20,
color="skyblue",
stat="density",
kde=True,
edgecolor="black",
alpha=0.7
)
sns.rugplot(data=df, x="bill_depth_mm", height=0.05, color="darkblue")
plt.title("Bill Depth Distribution with KDE and Rug Plot", fontsize=13, fontweight="bold")
plt.xlabel("Bill Depth (mm)")
plt.ylabel("Density")
plt.show()
This adds a subtle layer of granularity: each rug mark represents one observation along the x-axis.
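The rug marks also give a useful way to think about the KDE curve: it is the sum of one small bell curve centered on each observation. Here is a minimal plain-Python sketch of a Gaussian KDE under that reading. It assumes Scott's rule for the bandwidth, which we believe matches the default Seaborn inherits from SciPy. The function gaussian_kde below is our own illustrative helper, not the SciPy class of the same name, and its bw_adjust argument only mimics the Seaborn option.

```python
import math
import statistics

def gaussian_kde(data, bw_adjust=1.0):
    """Return a function that evaluates a Gaussian KDE of `data` at a point x."""
    n = len(data)
    # Scott's rule for 1-D data: bandwidth = sample std * n ** (-1/5)
    h = statistics.stdev(data) * n ** (-1 / 5) * bw_adjust
    norm = 1.0 / (n * h * math.sqrt(2 * math.pi))

    def density(x):
        # Sum of Gaussian bumps, one per observation
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return density

# Hypothetical bill depths in mm (illustrative values, not the real data)
bill_depths = [13.2, 14.1, 14.5, 15.0, 17.3, 17.9, 18.4, 18.6, 19.0, 20.2]
kde = gaussian_kde(bill_depths)

# Numerically integrate the curve on a grid from 5 mm to 30 mm
step = 0.01
grid = [5 + step * i for i in range(2501)]
area = sum(kde(x) for x in grid) * step
print(round(area, 3))  # 1.0
```

Since the curve integrates to 1, it lines up naturally with stat="density". A larger bw_adjust gives a smoother curve, and a smaller one follows the individual observations more closely.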
Let’s consider a wildlife research scenario:
You are studying how body mass distributions differ among penguin species to estimate feeding habits and energy needs.
The advanced histogram we created can help biologists:
By using hue, density normalization, and KDE overlays, we can easily interpret data that would otherwise be overwhelming in tabular format.
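Always save your final figure in a publication-quality format. Call plt.savefig() before plt.show(); once the figure window has been closed, savefig() may write an empty image.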
plt.savefig("penguins_histogram.png", dpi=300, bbox_inches='tight')
Use .png for high-resolution raster images or .svg for scalable vector graphics (ideal for reports).
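If you want both formats, one option is to keep a reference to the Figure object and save it in a loop before showing or closing it. This is a small sketch using only Matplotlib. The placeholder plot, the temporary output directory, and the values are assumptions made for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Placeholder plot standing in for your final histogram (values are invented)
fig, ax = plt.subplots(figsize=(6, 4))
ax.hist([3200, 3400, 3700, 4100, 4650, 5200], bins=3)
ax.set_xlabel("Body Mass (g)")
ax.set_ylabel("Count")

# Save the same figure as a raster image and as a vector graphic
out_dir = tempfile.mkdtemp()
paths = []
for ext in ("png", "svg"):
    path = os.path.join(out_dir, f"penguins_histogram.{ext}")
    fig.savefig(path, dpi=300, bbox_inches="tight")  # dpi matters for the PNG
    paths.append(path)

plt.close(fig)  # release the figure only after saving
```

Saving through the Figure object rather than plt.savefig() avoids depending on which figure happens to be current.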
✅ sns.histplot() combines power and simplicity for advanced histogram visualizations.
✅ Use hue, multiple, kde, and element for rich, multi-dimensional plots.
✅ Faceting is perfect for comparisons across multiple groups.
✅ Custom bins and log scales make your plots precise and insightful.
✅ Combine KDE and rug plots for distributional storytelling.
✅ Always export plots in high resolution for dashboards or publications.
By mastering these techniques, you’ll be able to create publication-ready, data-rich histogram plots that are both visually stunning and analytically powerful, well suited to professional analytics, academic research, or executive dashboards.