How to Manually Create Legend in Pandas

When working with data visualization in Python, especially using Pandas with Matplotlib, one of the most important elements for clarity is the legend. A legend helps us label and identify different data categories or plot elements, ensuring the visualization is understandable. While Pandas provides automatic legends when plotting with DataFrame.plot(), sometimes we need to manually create legends for more control and customization.

In this article, we will explore step-by-step techniques to manually create legends in Pandas, providing examples, detailed explanations, and advanced customization methods.


Why Manually Create a Legend in Pandas?

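Although Pandas integrates well with Matplotlib, the default legend does not always meet our requirements. Manually creating a legend becomes necessary when we want display labels that differ from the column names, when several DataFrames share one axes, when the legend needs custom styling or placement, or when its entries should be built from handles such as patches rather than from the plotted lines.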


Basic Setup: Pandas with Matplotlib

Before diving into manual legends, let us set up a simple dataset and visualization environment.

import pandas as pd
import matplotlib.pyplot as plt

# Sample dataset
data = {
    "Month": ["Jan", "Feb", "Mar", "Apr", "May"],
    "Sales_A": [120, 150, 180, 200, 220],
    "Sales_B": [100, 130, 160, 190, 210]
}

df = pd.DataFrame(data)

# Plotting with Pandas
ax = df.plot(x="Month", y=["Sales_A", "Sales_B"], marker="o")
plt.show()

By default, Pandas will generate a legend showing Sales_A and Sales_B. But what if we want to manually control this legend?
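Before replacing the automatic legend, it can help to see what Pandas put in it. The following is a minimal sketch that reads the legend back from the axes. The variable names auto_legend and auto_labels are illustrative, and the Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit when plotting interactively
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May"],
    "Sales_A": [120, 150, 180, 200, 220],
    "Sales_B": [100, 130, 160, 190, 210],
})

ax = df.plot(x="Month", y=["Sales_A", "Sales_B"], marker="o")

# The automatic legend is an ordinary Matplotlib Legend attached to the axes
auto_legend = ax.get_legend()
auto_labels = [text.get_text() for text in auto_legend.get_texts()]
print(auto_labels)  # ['Sales_A', 'Sales_B']
```

The entries are the raw column names, which is what the manual methods below replace.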


Method 1: Adding a Custom Legend Manually

We can suppress the automatic legend with legend=False and then call Matplotlib’s plt.legend() function ourselves.

ax = df.plot(x="Month", y=["Sales_A", "Sales_B"], marker="o", legend=False)

# Manually creating legend
plt.legend(["Product A", "Product B"], loc="upper left")
plt.title("Monthly Sales Comparison")
plt.xlabel("Month")
plt.ylabel("Sales")
plt.show()

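Here, legend=False stops Pandas from drawing its own legend, and plt.legend() assigns the given labels to the plotted lines in the order they were drawn, so the label order must match the column order.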
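Since the labels are matched to the lines by position, one way to keep them aligned is to derive both the plot and the labels from the same column list. This is only a sketch: the display_names mapping and the columns list are hypothetical names introduced for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit when plotting interactively
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May"],
    "Sales_A": [120, 150, 180, 200, 220],
    "Sales_B": [100, 130, 160, 190, 210],
})

# Hypothetical mapping from column names to display labels
display_names = {"Sales_A": "Product A", "Sales_B": "Product B"}
columns = ["Sales_A", "Sales_B"]

ax = df.plot(x="Month", y=columns, marker="o", legend=False)

# Build the labels in the same order as the plotted columns
labels = [display_names[col] for col in columns]
legend = ax.legend(labels, loc="upper left")
legend_texts = [text.get_text() for text in legend.get_texts()]
```

If the column list is later reordered, the labels follow automatically instead of silently pointing at the wrong line.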


Method 2: Using Line Handles for More Control

Sometimes, we want to link specific line objects to custom labels. This gives us finer control.

ax = df.plot(x="Month", y=["Sales_A", "Sales_B"], marker="o", legend=False)

# Extract line objects
lines = ax.get_lines()

# Create legend with custom labels
plt.legend([lines[0], lines[1]], ["Product A", "Product B"], loc="best")
plt.show()

This method ensures the legend directly references the plotted line objects, making it easier to style or reorder later.
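As an illustration of the reordering mentioned above, the sketch below lists Product B first in the legend by passing the handles in reversed order. Nothing about the plot itself changes, only the order of the legend entries.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit when plotting interactively
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May"],
    "Sales_A": [120, 150, 180, 200, 220],
    "Sales_B": [100, 130, 160, 190, 210],
})

ax = df.plot(x="Month", y=["Sales_A", "Sales_B"], marker="o", legend=False)
lines = ax.get_lines()  # lines[0] is Sales_A, lines[1] is Sales_B

# Each handle is paired explicitly with its label, so the order is ours to choose
legend = ax.legend([lines[1], lines[0]], ["Product B", "Product A"], loc="best")
legend_texts = [text.get_text() for text in legend.get_texts()]
```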


Method 3: Adding Legends for Multiple DataFrames on One Axes

When several DataFrames are plotted on the same axes, each plot call redraws the automatic legend, and the result may not have the labels or order we want. A manual legend solves this problem.

# Additional dataset
df2 = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May"],
    "Sales_C": [90, 120, 150, 170, 200]
})

ax = df.plot(x="Month", y="Sales_A", marker="o", legend=False)
df2.plot(x="Month", y="Sales_C", marker="s", ax=ax, legend=False)

# Manual legend for both
plt.legend(["Product A", "Product C"], loc="upper left")
plt.show()

Here, we merged multiple plots into one legend by controlling labels manually.
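A variation on the same idea, sketched under the assumption that both plots share one axes: ax.get_legend_handles_labels() returns every labeled artist on the axes, so the legend can be built from explicit handles instead of relying on label order. The display_names mapping is a hypothetical helper introduced here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit when plotting interactively
import matplotlib.pyplot as plt
import pandas as pd

months = ["Jan", "Feb", "Mar", "Apr", "May"]
df = pd.DataFrame({"Month": months, "Sales_A": [120, 150, 180, 200, 220]})
df2 = pd.DataFrame({"Month": months, "Sales_C": [90, 120, 150, 170, 200]})

ax = df.plot(x="Month", y="Sales_A", marker="o", legend=False)
df2.plot(x="Month", y="Sales_C", marker="s", ax=ax, legend=False)

# Collect the handles and default labels from everything drawn on the axes
handles, labels = ax.get_legend_handles_labels()

# Hypothetical mapping; unknown labels fall back to the original name
display_names = {"Sales_A": "Product A", "Sales_C": "Product C"}
new_labels = [display_names.get(label, label) for label in labels]

legend = ax.legend(handles, new_labels, loc="upper left")
legend_texts = [text.get_text() for text in legend.get_texts()]
```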


Method 4: Customizing Legend Appearance

We can adjust the font size, title, frame, shadow, and background color of the legend for a professional look.

ax = df.plot(x="Month", y=["Sales_A", "Sales_B"], marker="o", legend=False)

plt.legend(
    ["Product A", "Product B"],
    loc="upper left",
    fontsize=12,
    title="Legend",
    title_fontsize=14,
    frameon=True,
    shadow=True,
    facecolor="lightgray"
)
plt.show()

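Key options: fontsize sets the size of the entry labels, title and title_fontsize add a heading and set its size, frameon draws the box around the legend, shadow adds a drop shadow behind it, and facecolor sets the background color of the box.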
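The same settings can also be inspected or adjusted after the legend exists, because the legend call returns a Legend object. The sketch below keeps that object and tweaks it. The bold title and black edge are illustrative choices, not part of the example above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit when plotting interactively
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May"],
    "Sales_A": [120, 150, 180, 200, 220],
    "Sales_B": [100, 130, 160, 190, 210],
})

ax = df.plot(x="Month", y=["Sales_A", "Sales_B"], marker="o", legend=False)

legend = ax.legend(
    ["Product A", "Product B"],
    loc="upper left",
    fontsize=12,
    title="Legend",
    title_fontsize=14,
    frameon=True,
    shadow=True,
    facecolor="lightgray",
)

# Adjust individual parts of the legend after it has been created
legend.get_title().set_fontweight("bold")   # heading text
legend.get_frame().set_edgecolor("black")   # border of the legend box

title_text = legend.get_title().get_text()
title_size = legend.get_title().get_fontsize()
entry_size = legend.get_texts()[0].get_fontsize()
```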


Method 5: Using Patches for Category Legends

When plotting categorical data such as bars or groups, we can create legends using matplotlib.patches.Patch.

import matplotlib.patches as mpatches

ax = df.plot(x="Month", y=["Sales_A", "Sales_B"], kind="bar", legend=False)

# Custom patches
patch1 = mpatches.Patch(color="tab:blue", label="Product A")
patch2 = mpatches.Patch(color="tab:orange", label="Product B")

plt.legend(handles=[patch1, patch2], loc="upper left")
plt.show()

This approach is particularly useful for bar charts, pie charts, or stacked plots, where manual labeling ensures clarity.
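Hard-coded colors only match the bars while the default color cycle is in use. A possible alternative, shown here as a sketch, reads each color from the bars Pandas actually drew via ax.containers, which holds one bar container per plotted column. The names list is a hypothetical set of display labels.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit when plotting interactively
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May"],
    "Sales_A": [120, 150, 180, 200, 220],
    "Sales_B": [100, 130, 160, 190, 210],
})

ax = df.plot(x="Month", y=["Sales_A", "Sales_B"], kind="bar", legend=False)

names = ["Product A", "Product B"]  # hypothetical display labels

# One patch per bar group, colored like the first bar of that group
patches = [
    mpatches.Patch(color=container.patches[0].get_facecolor(), label=name)
    for container, name in zip(ax.containers, names)
]

legend = ax.legend(handles=patches, loc="upper left")
legend_texts = [text.get_text() for text in legend.get_texts()]
```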


Method 6: Creating a Legend Outside the Plot

Sometimes, we need the legend outside the plotting area to avoid overlapping with data.

ax = df.plot(x="Month", y=["Sales_A", "Sales_B"], marker="o", legend=False)

plt.legend(
    ["Product A", "Product B"],
    bbox_to_anchor=(1.05, 1),
    loc="upper left",
    borderaxespad=0.0
)
plt.show()

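Here, bbox_to_anchor=(1.05, 1) combined with loc="upper left" pins the legend's upper-left corner just to the right of the axes, so it no longer overlaps the data. Because the legend now sits outside the axes, it may be cut off unless the figure is widened or saved with bbox_inches="tight".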
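When a figure with an outside legend is written to a file, the legend can fall outside the saved area. One common remedy, sketched below, is to pass bbox_inches="tight" to savefig so the saved region expands to include it. The in-memory buffer is used only to keep the example self-contained; a file path works the same way.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit when plotting interactively
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May"],
    "Sales_A": [120, 150, 180, 200, 220],
    "Sales_B": [100, 130, 160, 190, 210],
})

ax = df.plot(x="Month", y=["Sales_A", "Sales_B"], marker="o", legend=False)
ax.legend(
    ["Product A", "Product B"],
    bbox_to_anchor=(1.05, 1),
    loc="upper left",
    borderaxespad=0.0,
)

# bbox_inches="tight" grows the saved region to include the outside legend
buffer = io.BytesIO()
ax.figure.savefig(buffer, format="png", bbox_inches="tight")
png_bytes = buffer.getvalue()
```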


Best Practices for Manual Legends in Pandas

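To make the most out of manual legends, pass explicit handles when the order or styling of entries matters, keep labels short and descriptive, keep handle colors and markers consistent with the plotted data, and place the legend where it does not cover the data.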


Advanced Example: Fully Customized Multi-Plot Legend

fig, ax = plt.subplots()

df.plot(x="Month", y="Sales_A", marker="o", ax=ax, legend=False, color="blue")
df.plot(x="Month", y="Sales_B", marker="s", ax=ax, legend=False, color="red")
df2.plot(x="Month", y="Sales_C", marker="^", ax=ax, legend=False, color="green")

# Custom legend handles
line1, = ax.plot([], [], color="blue", marker="o", label="Product A")
line2, = ax.plot([], [], color="red", marker="s", label="Product B")
line3, = ax.plot([], [], color="green", marker="^", label="Product C")

plt.legend(handles=[line1, line2, line3], title="Sales by Product", loc="upper left")
plt.title("Customized Legend Example")
plt.show()

This method allows full control over markers, colors, and legend formatting, making it highly flexible for professional reporting and dashboards.
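A variant of this example, offered as a sketch rather than a replacement: matplotlib.lines.Line2D objects can serve as legend handles without being added to the axes, so ax.get_lines() still contains only the real data lines. The styles dictionary is a hypothetical helper that keeps each product's color and marker in one place.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit when plotting interactively
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.lines import Line2D

months = ["Jan", "Feb", "Mar", "Apr", "May"]
df = pd.DataFrame({
    "Month": months,
    "Sales_A": [120, 150, 180, 200, 220],
    "Sales_B": [100, 130, 160, 190, 210],
})
df2 = pd.DataFrame({"Month": months, "Sales_C": [90, 120, 150, 170, 200]})

# Hypothetical style table: label -> (source frame, column, color, marker)
styles = {
    "Product A": (df, "Sales_A", "blue", "o"),
    "Product B": (df, "Sales_B", "red", "s"),
    "Product C": (df2, "Sales_C", "green", "^"),
}

fig, ax = plt.subplots()
handles = []
for name, (frame, column, color, marker) in styles.items():
    frame.plot(x="Month", y=column, marker=marker, ax=ax, legend=False, color=color)
    # Proxy handle: describes the legend entry but is never drawn on the axes
    handles.append(Line2D([], [], color=color, marker=marker, label=name))

legend = ax.legend(handles=handles, title="Sales by Product", loc="upper left")
legend_texts = [text.get_text() for text in legend.get_texts()]
```

Because the plot calls and the proxy handles read from the same table, a color or marker change only has to be made once.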


Conclusion

Manually creating legends in Pandas with Matplotlib gives us the flexibility to build clear, customizable data visualizations. By controlling labels, handles, placement, and styles, we ensure that our plots communicate insights without confusion, whether we are working with simple line plots or complex multi-plot figures.
