
Scatter plot with different text at each data point

python
matplotlib
scatter-plot
data-labeling
by Alex Kataev · Aug 28, 2024
TLDR

Need to whip up an annotated scatter plot? Piece of cake! Here's a quick recipe 🧑‍🍳:

import matplotlib.pyplot as plt

x, y = [1, 2, 3, 4], [10, 15, 20, 25]  # The secret ingredients 🍎🍊🍋🍒
labels = ['A', 'B', 'C', 'D']  # Spicing things up with some variety 🌶️

plt.scatter(x, y)  # Voilà, the blueprints for our masterpiece

# The icing on the cake: text labels for each data point
for x_coord, y_coord, label in zip(x, y, labels):
    plt.text(x_coord, y_coord, label)

plt.show()  # End result: a delicious spectacle 😋

Bon appétit, folks! You've cooked up a scatter plot with distinctive labels for each data point.

Styling annotations

Labels are more than identifiers: their size, color, and alignment decide how readable the plot is. plt.text accepts keyword arguments such as fontsize, color, and ha (horizontal alignment) to "prettify" them, so let's mess around (it's like dress-up, but for coders):

for x_coord, y_coord, label in zip(x, y, labels):
    plt.text(x_coord, y_coord, label,
             fontsize=9, color='purple', ha='right')  # Flexing our dressing skills

Boom! Our guests (data points) are not only identifiable, but also stylish. They now say, "We've got names and we aren't afraid to show them".
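Since the whole point is different text at each data point, the styling can differ per point too. Here is a minimal sketch, assuming a made-up rule (not from the recipe above) that the point with the largest y value should stand out. The names peak and texts are just for illustration:

```python
import matplotlib.pyplot as plt

x, y = [1, 2, 3, 4], [10, 15, 20, 25]
labels = ['A', 'B', 'C', 'D']

fig, ax = plt.subplots()
ax.scatter(x, y)

# Hypothetical rule for illustration: emphasize the point with the largest y value
peak = max(y)

texts = []
for x_coord, y_coord, label in zip(x, y, labels):
    is_peak = y_coord == peak
    text = ax.text(
        x_coord, y_coord, label,
        fontsize=12 if is_peak else 9,
        color='crimson' if is_peak else 'purple',
        fontweight='bold' if is_peak else 'normal',
        ha='right', va='bottom',  # text sits up and to the left of the marker
    )
    texts.append(text)
```

Call plt.show() (or fig.savefig with a file name of your choice) afterwards to see the result. Keeping the returned Text objects in a list is optional, but it makes it easy to restyle a label later.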

Tackling label overlaps

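Occasionally, labels can overlap each other or their markers, especially when they've had too much to drink (aka when the points are densely crowded). Solution: have the labels social-distance from their points. plt.annotate places the text at an offset (xytext, measured in points when textcoords='offset points') and draws an arrow back to the data point; increase the offset if you want a longer arrow: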

for i, label in enumerate(labels):
    plt.annotate(label, (x[i], y[i]),
                 xytext=(5, 5), textcoords='offset points',
                 ha='left', va='bottom',
                 arrowprops=dict(arrowstyle='->', connectionstyle='arc3,rad=0'))

Much like social distancing, this type of formatting keeps the "conversation" between the labels and their points crystal-clear.
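When several points huddle together, a single fixed offset sends every label to the same corner. One simple heuristic, sketched here under the assumption that fanning labels out in four directions is enough for your data, is to cycle through a short list of offsets. The data and the offsets list are hypothetical, and this is not a real collision-avoidance algorithm:

```python
import matplotlib.pyplot as plt

# Hypothetical data: a tight cluster of four points plus one outlier
x = [1.00, 1.02, 1.04, 1.06, 3.00]
y = [10.00, 10.05, 10.00, 10.05, 20.00]
labels = ['A', 'B', 'C', 'D', 'E']

# Offsets in points to cycle through: up-right, down-right, up-left, down-left
offsets = [(20, 20), (20, -20), (-20, 20), (-20, -20)]

fig, ax = plt.subplots()
ax.scatter(x, y)

annotations = []
for i, label in enumerate(labels):
    dx, dy = offsets[i % len(offsets)]
    ann = ax.annotate(
        label, (x[i], y[i]),
        xytext=(dx, dy), textcoords='offset points',
        # Anchor the text on the side facing away from the point
        ha='left' if dx > 0 else 'right',
        va='bottom' if dy > 0 else 'top',
        arrowprops=dict(arrowstyle='->', connectionstyle='arc3,rad=0'),
    )
    annotations.append(ann)
```

The arrows keep each label tied to its own point even though the text has wandered off. For really crowded plots this heuristic will still produce collisions, so treat it as a starting point.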

Multiple plots, one figure

Need to compare plots? Borrow a leaf from apartment design—maximize your space with subplots:

fig, axs = plt.subplots(2)  # We're going 2-stories! Nice!

axs[0].scatter(x, y)  # First floor scatter plot
axs[1].scatter(y, x)  # Second floor scatter plot (it's kind of upside down)

labels_sub1 = [f'{lab}_1' for lab in labels]  # Labels for the first floor
labels_sub2 = [f'{lab}_2' for lab in labels]  # Labels for the second floor

# Label plots on both floors
for i, (lab1, lab2) in enumerate(zip(labels_sub1, labels_sub2)):
    axs[0].annotate(lab1, (x[i], y[i]))
    axs[1].annotate(lab2, (y[i], x[i]))

plt.show()  # Visualizing our new duplex 🏘️

Two plots under one roof. How convenient is that?
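If you'd rather have the two plots side by side, the same idea can be written as one loop over the axes. This is a rough sketch; the panels list, the titles, and the label suffixes are illustrative choices rather than anything Matplotlib requires:

```python
import matplotlib.pyplot as plt

x, y = [1, 2, 3, 4], [10, 15, 20, 25]
labels = ['A', 'B', 'C', 'D']

# One row, two columns: neighbours instead of floors
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 4))

# Hypothetical per-panel settings: axes, x values, y values, title, label suffix
panels = [
    (ax_left, x, y, 'x vs. y', '_1'),
    (ax_right, y, x, 'y vs. x', '_2'),
]

for ax, xs, ys, title, suffix in panels:
    ax.scatter(xs, ys)
    ax.set_title(title)
    for x_coord, y_coord, label in zip(xs, ys, labels):
        # Nudge the text 4 points up and right so it does not sit on the marker
        ax.annotate(f'{label}{suffix}', (x_coord, y_coord),
                    xytext=(4, 4), textcoords='offset points')

fig.tight_layout()  # keep titles and tick labels from colliding
```

Calling annotate on each Axes object (rather than plt.annotate) makes it explicit which subplot receives which label. Finish with plt.show() to display the figure.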

Behind the scenes: 3D scatter plots

The world isn't flat, and sometimes, neither is your data. 3D scatter plots are here for when you want to go beyond mere X's and O's on a 2D plane:

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # Registers the '3d' projection; recent Matplotlib versions no longer need this import

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

x_3d, y_3d, z_3d = [1, 2, 3, 4], [10, 15, 20, 25], [5, 6, 7, 8]  # Setting our stage in 3D
labels_3d = ['A1', 'B1', 'C1', 'D1']  # 3D labels because we're fancy like that

ax.scatter(x_3d, y_3d, z_3d)  # 3D plot is a go!

# Writing text on a 3D plot? Yep, we can do that
for xyz, label in zip(zip(x_3d, y_3d, z_3d), labels_3d):
    ax.text(*xyz, label)

plt.show()  # 3D-licious!

Note: You must use glasses to fully appreciate the 3D effect (not really, but it would be cool, huh?).

Function-ize data labeling

If you think you'll use this scatter plot in the future with other parties (datasets), it's wise to make a function that can lay down the law regardless of the crowd:

def annotate_scatter(x_data, y_data, labels, ax):
    # We're going by the book here: each point gets a label
    for x_coord, y_coord, label in zip(x_data, y_data, labels):
        ax.text(x_coord, y_coord, label)

# Say labels change with situations: we're ready!
annotate_scatter(x, y, labels, plt.gca())

This way, even if the party theme changes, the function makes sure no one goes without a badge.
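A slightly more defensive take on the same idea might look like the sketch below. The name label_points, the offset parameter, and the length check are assumptions added for illustration; they are not part of the helper above or of Matplotlib itself:

```python
import matplotlib.pyplot as plt


def label_points(x_data, y_data, labels, ax=None, offset=(5, 5), **text_kwargs):
    """Label every (x, y) point on ax and return the created annotations.

    Hypothetical helper: extra keyword arguments (fontsize, color, ...)
    are passed straight through to ax.annotate.
    """
    if not (len(x_data) == len(y_data) == len(labels)):
        raise ValueError('x_data, y_data and labels must have the same length')
    if ax is None:
        ax = plt.gca()  # fall back to the current axes, like plt.text does
    return [
        ax.annotate(str(label), (x_coord, y_coord),
                    xytext=offset, textcoords='offset points', **text_kwargs)
        for x_coord, y_coord, label in zip(x_data, y_data, labels)
    ]


fig, ax = plt.subplots()
ax.scatter([1, 2, 3], [4, 5, 6])
annotations = label_points([1, 2, 3], [4, 5, 6], ['low', 'mid', 'high'],
                           ax=ax, fontsize=9, color='purple')
```

Failing loudly on mismatched lengths matters because zip would otherwise silently drop the extra points or labels, and somebody would go without a badge after all.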

Enhanced annotation formatting

Want to stand out from other graphs at the party? You know what they say: "When in doubt, annotate in style!" 🎀:

for i, label in enumerate(labels):
    plt.annotate(label, (x[i], y[i]),
                 xytext=(-15, 10), textcoords='offset points', ha='center',
                 bbox=dict(boxstyle='round,pad=0.2', fc='yellow', alpha=0.3),
                 arrowprops=dict(arrowstyle='->', connectionstyle='arc3,rad=0.5', color='red'))

This isn't just any scatter plot—it's the scatter plot everyone will be talking about at the party (of plots).