# Week 2D: Matplotlib Fundamentals - Creating Meaningful Visual Insights

## Welcome to Data Visualization with Matplotlib!

Matplotlib is the foundational plotting library for Python, providing a comprehensive framework for creating static, animated, and interactive visualizations. As the saying almost goes, a picture is worth a thousand data points, and Matplotlib helps you create those pictures.

### Why Matplotlib?

Matplotlib has become the cornerstone of data visualization in Python because it offers:

- **Complete Control**: Fine-grained control over every aspect of your plots
- **Publication Quality**: Create figures suitable for academic papers and presentations
- **Extensive Plot Types**: From basic line plots to complex 3D visualizations
- **Integration**: Works seamlessly with NumPy, Pandas, and other scientific libraries
- **Customization**: Endless possibilities for customizing appearance and style

### What We'll Cover

In this comprehensive notebook, we'll explore:

1. **Basic Plotting** - Lines, markers, and simple visualizations
2. **Scatter Plots** - Visualizing relationships and correlations
3. **Histograms** - Understanding data distributions
4. **Bar Charts** - Comparing categorical data
5. **Multiple Subplots** - Creating dashboard-style visualizations
6. **Customization** - Colors, styles, annotations, and more
7. **Pandas Integration** - Plotting DataFrames directly

Let's begin creating meaningful visual insights!

## Part 1: Getting Started with Matplotlib

### Basic Setup and First Plot

Let's start with the fundamentals of creating plots with Matplotlib.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Set random seed for reproducibility
np.random.seed(42)

# Configure matplotlib for inline display in Jupyter
%matplotlib inline

# Optional: Set default figure size
plt.rcParams['figure.figsize'] = [10, 6]
plt.rcParams['figure.dpi'] = 80

print("Matplotlib version:", plt.matplotlib.__version__)
print("NumPy version:", np.__version__)
print("Pandas version:", pd.__version__)

### Your First Plot

In [None]:
# Simple line plot
x = np.linspace(0, 10, 100)
y = np.sin(x)

plt.figure(figsize=(10, 6))
plt.plot(x, y)
plt.title('My First Plot: Sine Wave')
plt.xlabel('X values')
plt.ylabel('Y values')
plt.grid(True, alpha=0.3)
plt.show()

print("Congratulations! You've created your first Matplotlib plot!")

### Understanding the Anatomy of a Plot

In [None]:
# Detailed plot with all major components labeled
fig, ax = plt.subplots(figsize=(12, 8))

# Generate data
x = np.linspace(0, 2*np.pi, 100)
y1 = np.sin(x)
y2 = np.cos(x)

# Plot data
ax.plot(x, y1, label='sin(x)', color='blue', linewidth=2)
ax.plot(x, y2, label='cos(x)', color='red', linewidth=2, linestyle='--')

# Add title and labels
ax.set_title('Anatomy of a Matplotlib Plot', fontsize=16, fontweight='bold')
ax.set_xlabel('X-axis Label', fontsize=12)
ax.set_ylabel('Y-axis Label', fontsize=12)

# Add grid
ax.grid(True, alpha=0.3, linestyle=':')

# Add horizontal and vertical lines
ax.axhline(y=0, color='k', linewidth=0.5)
ax.axvline(x=np.pi, color='green', linewidth=1, linestyle='-.', label='x=π')

# Add legend (after all labeled artists, so the x=π line is included)
ax.legend(loc='upper right', fontsize=11)

# Add text annotation
ax.annotate('Peak', xy=(np.pi/2, 1), xytext=(np.pi/2 + 0.5, 1.2),
            arrowprops=dict(arrowstyle='->', color='black'),
            fontsize=11)

# Set axis limits
ax.set_xlim(0, 2*np.pi)
ax.set_ylim(-1.5, 1.5)

# Add x-axis tick labels
ax.set_xticks([0, np.pi/2, np.pi, 3*np.pi/2, 2*np.pi])
ax.set_xticklabels(['0', 'π/2', 'π', '3π/2', '2π'])

plt.tight_layout()
plt.show()

print("Key components: Figure, Axes, Title, Labels, Legend, Grid, Annotations")
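
The component names printed above correspond to objects that can be inspected directly. The sketch below uses made-up data and a hypothetical helper, `describe_axes` (not part of Matplotlib), to read a few of those parts back from a small figure. It is only meant to illustrate the Figure, Axes, and artist hierarchy.

```python
import matplotlib.pyplot as plt
import numpy as np


def describe_axes(ax):
    """Summarize an Axes (hypothetical helper, for illustration only)."""
    lines = ax.get_lines()
    return {
        "title": ax.get_title(),
        "xlabel": ax.get_xlabel(),
        "ylabel": ax.get_ylabel(),
        "n_lines": len(lines),
        "line_labels": [line.get_label() for line in lines],
    }


fig, ax = plt.subplots()
x = np.linspace(0, 2 * np.pi, 50)
ax.plot(x, np.sin(x), label="sin(x)")
ax.plot(x, np.cos(x), label="cos(x)")
ax.set_title("Demo")
ax.set_xlabel("x")
ax.set_ylabel("y")

summary = describe_axes(ax)
print(summary)
print("Axes belongs to this figure:", ax.figure is fig)
print("Figure holds exactly this Axes:", fig.axes == [ax])
```

Every `ax.set_*` call in the cell above has a matching `ax.get_*` call, which is handy when debugging a plot that does not look as expected.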

### Line Plots with Multiple Series

In [None]:
# Multiple line plots with different styles
x = np.linspace(0, 10, 100)

plt.figure(figsize=(12, 6))

# Different line styles and markers
plt.plot(x, x, label='Linear', linestyle='-', color='blue', linewidth=2)
plt.plot(x, x**2/10, label='Quadratic', linestyle='--', color='red', linewidth=2)
plt.plot(x, np.sqrt(x)*3, label='Square Root', linestyle='-.', color='green', linewidth=2)
plt.plot(x, np.log(x+1)*3, label='Logarithmic', linestyle=':', color='purple', linewidth=2)

# Add markers to one line
x_markers = x[::10]  # Every 10th point
plt.plot(x_markers, np.sin(x_markers)*5 + 5, 'o-', label='Sine with markers', 
         color='orange', markersize=8, linewidth=1.5)

plt.title('Comparison of Different Functions', fontsize=14)
plt.xlabel('X values')
plt.ylabel('Y values')
plt.grid(True, alpha=0.3)

# Add shaded region
plt.fill_between(x, 0, np.sin(x)*5 + 5, where=(x > 4) & (x < 7), 
                 alpha=0.3, color='yellow', label='Highlighted region')

# Create the legend last so it includes the shaded region
plt.legend(loc='best')

plt.show()

### 🎯 Practice Exercise 1: Basic Plotting

Create your own line plot with multiple series:

In [None]:
# Exercise: Create a plot showing temperature variations
# TODO: Create a plot with:
# 1. Days of the month (1-30) on x-axis
# 2. Three cities' temperatures with different line styles
# 3. Proper labels, title, and legend
# 4. Highlight days where temperature > 30°C

# Your code here:
days = np.arange(1, 31)
# TODO: Generate temperature data for 3 cities
# TODO: Create the plot
# TODO: Add highlighting for hot days


# Test your solution (uncomment when ready):
# plt.show()

## Part 2: Scatter Plots - Visualizing Relationships

### Basic Scatter Plot

Scatter plots are essential for exploring relationships between two continuous variables.

In [None]:
# Generate correlated data
np.random.seed(42)
n_points = 100
x = np.random.randn(n_points)
y = 2 * x + np.random.randn(n_points) * 0.5

# Create scatter plot
plt.figure(figsize=(10, 6))
plt.scatter(x, y, alpha=0.6, s=50)

# Add regression line
z = np.polyfit(x, y, 1)
p = np.poly1d(z)
# The '+' format flag prints the intercept's sign, avoiding labels like 'x+-0.05'
plt.plot(np.sort(x), p(np.sort(x)), "r-", linewidth=2, label=f'y={z[0]:.2f}x{z[1]:+.2f}')

plt.xlabel('X Variable')
plt.ylabel('Y Variable')
plt.title('Scatter Plot with Linear Regression')
plt.legend()
plt.grid(True, alpha=0.3)

# Calculate and display correlation
correlation = np.corrcoef(x, y)[0, 1]
plt.text(0.05, 0.95, f'Correlation: {correlation:.3f}', 
         transform=plt.gca().transAxes, fontsize=12,
         bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))

plt.show()
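
The correlation shown in the text box comes from `np.corrcoef`. As a rough check on what that number means, the sketch below computes Pearson's r by hand (covariance divided by the product of the standard deviations) and compares it with NumPy's result. The helper name `pearson_r` and the demo data are assumptions made for illustration, not part of the notebook.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation computed from centered values (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


# Illustrative data with the same structure as the cell above (y = 2x + noise)
rng = np.random.default_rng(0)
x_demo = rng.normal(size=200)
y_demo = 2 * x_demo + rng.normal(scale=0.5, size=200)

r_manual = pearson_r(x_demo, y_demo)
r_numpy = float(np.corrcoef(x_demo, y_demo)[0, 1])
print(f"manual: {r_manual:.6f}  np.corrcoef: {r_numpy:.6f}")
```

The two values should agree to floating-point precision. The smaller the noise term is relative to the slope, the closer r gets to 1.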

### Advanced Scatter Plots with Color and Size Mapping

In [None]:
# Generate data with multiple features
n_points = 200
x = np.random.randn(n_points)
y = np.random.randn(n_points)
colors = np.random.rand(n_points)
sizes = np.random.randint(20, 500, n_points)

# Create figure with subplots
fig, axes = plt.subplots(1, 2, figsize=(16, 6))

# Subplot 1: Color-coded scatter
scatter1 = axes[0].scatter(x, y, c=colors, cmap='viridis', 
                           alpha=0.6, s=100, edgecolors='black', linewidth=0.5)
axes[0].set_title('Scatter Plot with Color Mapping')
axes[0].set_xlabel('X Variable')
axes[0].set_ylabel('Y Variable')
axes[0].grid(True, alpha=0.3)
plt.colorbar(scatter1, ax=axes[0], label='Color Value')

# Subplot 2: Size-coded scatter
scatter2 = axes[1].scatter(x, y, c=colors, s=sizes, 
                           alpha=0.6, cmap='plasma', edgecolors='black', linewidth=0.5)
axes[1].set_title('Scatter Plot with Size and Color Mapping')
axes[1].set_xlabel('X Variable')
axes[1].set_ylabel('Y Variable')
axes[1].grid(True, alpha=0.3)
plt.colorbar(scatter2, ax=axes[1], label='Color Value')

# Add size legend
for size in [50, 200, 400]:
    axes[1].scatter([], [], s=size, c='gray', alpha=0.6, 
                   edgecolors='black', linewidth=0.5,
                   label=f'Size: {size}')
axes[1].legend(scatterpoints=1, frameon=True, labelspacing=2, title='Point Size')

plt.tight_layout()
plt.show()

### Bubble Chart - Multi-dimensional Data

In [None]:
# Create bubble chart data (e.g., countries)
countries = ['USA', 'China', 'Japan', 'Germany', 'India', 'UK', 'France', 'Brazil', 'Italy', 'Canada']
gdp_per_capita = [65000, 10500, 40200, 46500, 2100, 42300, 41500, 8700, 34500, 46300]
life_expectancy = [78.9, 76.9, 84.6, 81.3, 69.7, 81.3, 82.7, 75.9, 83.5, 82.4]
population = [331, 1439, 126, 83, 1380, 68, 65, 213, 60, 38]  # in millions

# Create bubble chart
plt.figure(figsize=(12, 8))

# Scale bubble sizes
bubble_sizes = [p * 3 for p in population]

# Create scatter plot with sized bubbles
scatter = plt.scatter(gdp_per_capita, life_expectancy, s=bubble_sizes, 
                     alpha=0.6, c=range(len(countries)), cmap='tab10',
                     edgecolors='black', linewidth=1)

# Add country labels
for i, country in enumerate(countries):
    plt.annotate(country, (gdp_per_capita[i], life_expectancy[i]),
                xytext=(5, 5), textcoords='offset points', fontsize=9)

plt.xlabel('GDP per Capita ($)', fontsize=12)
plt.ylabel('Life Expectancy (years)', fontsize=12)
plt.title('GDP vs Life Expectancy by Country (Bubble size = Population)', fontsize=14)
plt.grid(True, alpha=0.3)

# Add legend for bubble sizes
for pop in [50, 200, 500, 1000]:
    plt.scatter([], [], s=pop*3, c='gray', alpha=0.6, edgecolors='black',
               label=f'{pop}M people')
plt.legend(scatterpoints=1, frameon=True, labelspacing=2, 
          title='Population', loc='lower right')

plt.tight_layout()
plt.show()
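
The `s` argument of `scatter` is a marker area in points squared, so multiplying population by a constant (as above) keeps bubble area proportional to population. The sketch below shows one possible way to choose that constant from a target maximum area instead of hard-coding `3`. The helper name `bubble_areas` and the default `max_area` are assumptions chosen for illustration.

```python
import numpy as np


def bubble_areas(values, max_area=1500.0):
    """Scale values so the largest maps to max_area (points^2).

    Area stays proportional to the value, which keeps visual comparisons fair.
    Hypothetical helper, not part of Matplotlib.
    """
    values = np.asarray(values, dtype=float)
    if values.max() <= 0:
        raise ValueError("values must contain at least one positive number")
    return values / values.max() * max_area


population = [331, 1439, 126, 83, 1380, 68, 65, 213, 60, 38]  # millions, from the cell above
areas = bubble_areas(population)
print(areas.round(1))
```

The result can be passed straight to `plt.scatter(..., s=areas)`. Scaling the radius instead of the area would exaggerate differences between large and small countries.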

### 🎯 Practice Exercise 2: Scatter Plots

Create a scatter plot to analyze relationships in data:

In [None]:
# Exercise: Analyze student performance
# TODO: Create a scatter plot showing:
# 1. Study hours (x-axis) vs Exam scores (y-axis)
# 2. Color points by student group (A, B, C)
# 3. Add a trend line
# 4. Calculate and display correlation

np.random.seed(42)
n_students = 60

# Your code here:
# TODO: Generate study hours (2-10 hours)
# TODO: Generate correlated exam scores
# TODO: Assign student groups
# TODO: Create scatter plot with colors
# TODO: Add trend line and correlation


# Test your solution (uncomment when ready):
# plt.show()

## Part 3: Histograms - Understanding Distributions

### Basic Histogram

In [None]:
# Generate sample data
np.random.seed(42)
data = np.random.normal(100, 15, 1000)

# Create histogram
plt.figure(figsize=(12, 6))

# Plot histogram
n, bins, patches = plt.hist(data, bins=30, edgecolor='black', alpha=0.7, color='steelblue')

# Add statistics lines
mean_val = np.mean(data)
median_val = np.median(data)
std_val = np.std(data)

plt.axvline(mean_val, color='red', linestyle='--', linewidth=2, label=f'Mean: {mean_val:.1f}')
plt.axvline(median_val, color='green', linestyle='-.', linewidth=2, label=f'Median: {median_val:.1f}')
plt.axvline(mean_val - std_val, color='orange', linestyle=':', linewidth=1.5, label=f'±1 Std: {std_val:.1f}')
plt.axvline(mean_val + std_val, color='orange', linestyle=':', linewidth=1.5)

plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Distribution of Values with Statistical Markers')
plt.legend()
plt.grid(True, alpha=0.3, axis='y')

# Add text box with statistics
stats_text = f'Count: {len(data)}\nMean: {mean_val:.2f}\nStd: {std_val:.2f}\nMin: {np.min(data):.2f}\nMax: {np.max(data):.2f}'
plt.text(0.02, 0.98, stats_text, transform=plt.gca().transAxes,
         fontsize=10, verticalalignment='top',
         bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))

plt.show()
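
The `n` and `bins` values returned by `plt.hist` are the bin counts and bin edges. The same numbers can be computed without drawing anything by calling `np.histogram`, as in the minimal sketch below. The sample here is generated separately for illustration, so its values will differ from the notebook's `data`.

```python
import numpy as np

rng = np.random.default_rng(42)
sample = rng.normal(100, 15, 1000)  # illustrative sample, same shape as the cell above

counts, edges = np.histogram(sample, bins=30)
widths = np.diff(edges)
centers = edges[:-1] + widths / 2

print("number of bins:", len(counts))
print("number of edges:", len(edges))  # always one more than the number of bins
print("total count:", counts.sum())
print("fullest bin is centered near", round(float(centers[np.argmax(counts)]), 1))
```

Thirty bins need thirty-one edges, and the counts add up to the sample size because the last bin includes its right edge.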

### Comparing Multiple Distributions

In [None]:
# Generate multiple distributions
np.random.seed(42)
normal_data = np.random.normal(100, 15, 1000)
skewed_data = np.random.gamma(2, 2, 1000) * 10 + 50
uniform_data = np.random.uniform(60, 140, 1000)
bimodal_data = np.concatenate([np.random.normal(80, 10, 500), 
                               np.random.normal(120, 10, 500)])

# Create subplots
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Plot different distributions
distributions = [
    (normal_data, 'Normal Distribution', 'green'),
    (skewed_data, 'Skewed Distribution (Gamma)', 'orange'),
    (uniform_data, 'Uniform Distribution', 'blue'),
    (bimodal_data, 'Bimodal Distribution', 'purple')
]

from scipy import stats  # import once, outside the plotting loop

for ax, (data, title, color) in zip(axes.flat, distributions):
    # Plot histogram
    ax.hist(data, bins=30, edgecolor='black', alpha=0.7, color=color, density=True)
    
    # Overlay normal distribution for comparison
    x = np.linspace(data.min(), data.max(), 100)
    ax.plot(x, stats.norm.pdf(x, data.mean(), data.std()), 
           'r-', linewidth=2, label='Normal fit')
    
    # Add mean line
    ax.axvline(data.mean(), color='red', linestyle='--', linewidth=2)
    
    ax.set_title(title)
    ax.set_xlabel('Value')
    ax.set_ylabel('Density')
    ax.legend()
    ax.grid(True, alpha=0.3)
    
    # Add skewness and kurtosis
    skew = stats.skew(data)
    kurt = stats.kurtosis(data)
    ax.text(0.02, 0.98, f'Skew: {skew:.2f}\nKurtosis: {kurt:.2f}',
           transform=ax.transAxes, fontsize=9, verticalalignment='top',
           bbox=dict(boxstyle='round', facecolor='white', alpha=0.8))

plt.suptitle('Comparison of Different Distribution Types', fontsize=16)
plt.tight_layout()
plt.show()
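
The histograms above use `density=True`, which rescales bar heights so that the total bar area is 1. That is what makes the overlaid normal curve comparable. The sketch below checks this property numerically with `np.histogram` on an illustrative gamma sample; the variable names are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
sample = rng.gamma(2, 2, 1000) * 10 + 50  # illustrative skewed sample

density, edges = np.histogram(sample, bins=30, density=True)
widths = np.diff(edges)

# Area of each bar is height * width; the areas should sum to 1
area = float((density * widths).sum())
print(f"total area under the histogram: {area:.6f}")
```

With `density=False`, the heights would be raw counts and the fitted probability density would look flat against them.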

### 2D Histograms and Hexbin Plots

In [None]:
# Generate 2D data
np.random.seed(42)
x = np.random.randn(5000)
y = 2 * x + np.random.randn(5000)

fig, axes = plt.subplots(1, 3, figsize=(18, 5))

# Regular scatter plot
axes[0].scatter(x, y, alpha=0.3, s=1)
axes[0].set_title('Scatter Plot (5000 points)')
axes[0].set_xlabel('X')
axes[0].set_ylabel('Y')
axes[0].grid(True, alpha=0.3)

# 2D histogram
h = axes[1].hist2d(x, y, bins=30, cmap='YlOrRd')
axes[1].set_title('2D Histogram')
axes[1].set_xlabel('X')
axes[1].set_ylabel('Y')
plt.colorbar(h[3], ax=axes[1], label='Count')

# Hexbin plot
hb = axes[2].hexbin(x, y, gridsize=20, cmap='YlGnBu')
axes[2].set_title('Hexbin Plot')
axes[2].set_xlabel('X')
axes[2].set_ylabel('Y')
plt.colorbar(hb, ax=axes[2], label='Count')

plt.tight_layout()
plt.show()

### 🎯 Practice Exercise 3: Histograms

Analyze age distribution in a population:

In [None]:
# Exercise: Population Age Analysis
# TODO: Create histograms showing:
# 1. Overall age distribution
# 2. Comparison of male vs female age distributions
# 3. Age distribution by region (overlay or subplots)
# 4. Add statistical markers (mean, median)

np.random.seed(42)
n_people = 2000

# Your code here:
# TODO: Generate age data (mix of different distributions)
# TODO: Create gender labels
# TODO: Create region labels
# TODO: Plot histograms with comparisons


# Test your solution (uncomment when ready):
# plt.show()

## Part 4: Bar Charts - Categorical Comparisons

### Basic Bar Charts

In [None]:
# Sample data
categories = ['Product A', 'Product B', 'Product C', 'Product D', 'Product E']
values = [23, 45, 56, 78, 32]
errors = [2, 3, 4, 5, 3]  # Error bars

# Create figure with subplots
fig, axes = plt.subplots(1, 2, figsize=(14, 6))

# Vertical bar chart
bars1 = axes[0].bar(categories, values, color='coral', edgecolor='black', 
                    alpha=0.7, yerr=errors, capsize=5)

# Add value labels on bars
for bar in bars1:
    height = bar.get_height()
    axes[0].text(bar.get_x() + bar.get_width()/2., height,
                f'{height}', ha='center', va='bottom')

axes[0].set_xlabel('Products')
axes[0].set_ylabel('Sales (in thousands)')
axes[0].set_title('Vertical Bar Chart with Error Bars')
axes[0].grid(True, alpha=0.3, axis='y')

# Horizontal bar chart
bars2 = axes[1].barh(categories, values, color='skyblue', edgecolor='black', alpha=0.7)

# Add value labels
for bar, val in zip(bars2, values):
    axes[1].text(val, bar.get_y() + bar.get_height()/2., 
                f'{val}k', ha='left', va='center')

axes[1].set_xlabel('Sales (in thousands)')
axes[1].set_ylabel('Products')
axes[1].set_title('Horizontal Bar Chart')
axes[1].grid(True, alpha=0.3, axis='x')

plt.tight_layout()
plt.show()

### Grouped and Stacked Bar Charts

In [None]:
# Data for multiple groups
categories = ['Q1', 'Q2', 'Q3', 'Q4']
product_a = [20, 35, 30, 35]
product_b = [25, 30, 35, 30]
product_c = [15, 20, 35, 25]

x = np.arange(len(categories))
width = 0.25

fig, axes = plt.subplots(1, 2, figsize=(14, 6))

# Grouped bar chart
bars1 = axes[0].bar(x - width, product_a, width, label='Product A', color='#FF6B6B', edgecolor='black')
bars2 = axes[0].bar(x, product_b, width, label='Product B', color='#4ECDC4', edgecolor='black')
bars3 = axes[0].bar(x + width, product_c, width, label='Product C', color='#45B7D1', edgecolor='black')

axes[0].set_xlabel('Quarter')
axes[0].set_ylabel('Sales (in thousands)')
axes[0].set_title('Grouped Bar Chart - Quarterly Sales by Product')
axes[0].set_xticks(x)
axes[0].set_xticklabels(categories)
axes[0].legend()
axes[0].grid(True, alpha=0.3, axis='y')

# Add value labels
for bars in [bars1, bars2, bars3]:
    for bar in bars:
        height = bar.get_height()
        axes[0].text(bar.get_x() + bar.get_width()/2., height,
                    f'{height}', ha='center', va='bottom', fontsize=9)

# Stacked bar chart
bars1 = axes[1].bar(categories, product_a, label='Product A', color='#FF6B6B', edgecolor='black')
bars2 = axes[1].bar(categories, product_b, bottom=product_a, label='Product B', color='#4ECDC4', edgecolor='black')
bars3 = axes[1].bar(categories, product_c, bottom=np.array(product_a)+np.array(product_b), 
                   label='Product C', color='#45B7D1', edgecolor='black')

axes[1].set_xlabel('Quarter')
axes[1].set_ylabel('Sales (in thousands)')
axes[1].set_title('Stacked Bar Chart - Total Quarterly Sales')
axes[1].legend()
axes[1].grid(True, alpha=0.3, axis='y')

# Add total labels on top
totals = np.array(product_a) + np.array(product_b) + np.array(product_c)
for i, (cat, total) in enumerate(zip(categories, totals)):
    axes[1].text(i, total + 1, f'Total: {total}', ha='center', fontweight='bold')

plt.tight_layout()
plt.show()
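
The grouped chart above places three bars at `x - width`, `x`, and `x + width`. For a different number of series, the offsets can be generated rather than written by hand. The sketch below is one way to do that; `grouped_bar_offsets` and its `total_width` default are hypothetical names chosen for this example.

```python
import numpy as np


def grouped_bar_offsets(n_series, total_width=0.75):
    """Return (offsets, bar_width) so that n_series bars are centered on each tick.

    Hypothetical helper: total_width is the share of each tick interval
    occupied by the whole group of bars.
    """
    bar_width = total_width / n_series
    offsets = (np.arange(n_series) - (n_series - 1) / 2) * bar_width
    return offsets, bar_width


offsets, bar_width = grouped_bar_offsets(3)
print(offsets, bar_width)

offsets4, bar_width4 = grouped_bar_offsets(4)
print(offsets4, bar_width4)
```

With three series this reproduces the layout used above (bar width 0.25, offsets of one bar width either side of the tick). Each series `k` would then be drawn with `ax.bar(x + offsets[k], values_k, bar_width)`.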

### Percentage Stacked Bar Chart

In [None]:
# Convert to percentages
totals = np.array(product_a) + np.array(product_b) + np.array(product_c)
product_a_pct = (np.array(product_a) / totals) * 100
product_b_pct = (np.array(product_b) / totals) * 100
product_c_pct = (np.array(product_c) / totals) * 100

# Create percentage stacked bar chart
fig, ax = plt.subplots(figsize=(10, 6))

bars1 = ax.bar(categories, product_a_pct, label='Product A', color='#FF6B6B', edgecolor='black')
bars2 = ax.bar(categories, product_b_pct, bottom=product_a_pct, 
              label='Product B', color='#4ECDC4', edgecolor='black')
bars3 = ax.bar(categories, product_c_pct, 
              bottom=product_a_pct + product_b_pct,
              label='Product C', color='#45B7D1', edgecolor='black')

# Add percentage labels
for i, cat in enumerate(categories):
    # Product A
    if product_a_pct[i] > 5:  # Only show if segment is large enough
        ax.text(i, product_a_pct[i]/2, f'{product_a_pct[i]:.1f}%', 
               ha='center', va='center', fontweight='bold')
    # Product B
    if product_b_pct[i] > 5:
        ax.text(i, product_a_pct[i] + product_b_pct[i]/2, f'{product_b_pct[i]:.1f}%',
               ha='center', va='center', fontweight='bold')
    # Product C
    if product_c_pct[i] > 5:
        ax.text(i, product_a_pct[i] + product_b_pct[i] + product_c_pct[i]/2, 
               f'{product_c_pct[i]:.1f}%', ha='center', va='center', fontweight='bold')

ax.set_ylabel('Percentage (%)')
ax.set_title('100% Stacked Bar Chart - Market Share by Quarter')
ax.legend(loc='upper left', bbox_to_anchor=(1, 1))
ax.set_ylim(0, 100)
ax.grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.show()
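
The percentage and label arithmetic above is written out per product. With the series stacked into a 2D array, the shares, the `bottom` values, and the label positions all follow from a column sum and a cumulative sum. The sketch below shows that under the assumption that each row is a product and each column a quarter; the array names are illustrative.

```python
import numpy as np

# Rows are products, columns are quarters (values from the cells above)
sales = np.array([
    [20, 35, 30, 35],  # Product A
    [25, 30, 35, 30],  # Product B
    [15, 20, 35, 25],  # Product C
], dtype=float)

shares = sales / sales.sum(axis=0) * 100      # percentage of each quarter's total
bottoms = np.cumsum(shares, axis=0) - shares  # where each segment starts
label_y = bottoms + shares / 2                # vertical center of each segment

print(shares.round(1))
print(bottoms.round(1))
```

Each row `k` can then be drawn with `ax.bar(categories, shares[k], bottom=bottoms[k])`, which scales to any number of products without extra code.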

### 🎯 Practice Exercise 4: Bar Charts

Create a comprehensive sales dashboard:

In [None]:
# Exercise: Sales Dashboard
# TODO: Create bar charts showing:
# 1. Monthly sales for different regions
# 2. Top 5 products by revenue
# 3. Year-over-year comparison
# 4. Customer satisfaction ratings

# Sample data structure
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
regions = ['North', 'South', 'East', 'West']

# Your code here:
# TODO: Generate sales data for each region and month
# TODO: Create product revenue data
# TODO: Create year-over-year comparison data
# TODO: Create satisfaction ratings data
# TODO: Create a 2x2 subplot layout with different bar charts


# Test your solution (uncomment when ready):
# plt.show()

## Part 5: Multiple Subplots - Creating Dashboards

### Basic Subplot Layouts

In [None]:
# Generate sample data
np.random.seed(42)
x = np.linspace(0, 10, 100)
y = np.sin(x)
data = np.random.normal(100, 15, 1000)
categories = ['A', 'B', 'C', 'D', 'E']
values = [23, 45, 56, 78, 32]

# Create 2x2 subplot layout
fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# Subplot 1: Line plot
axes[0, 0].plot(x, y, 'b-', linewidth=2)
axes[0, 0].plot(x, np.cos(x), 'r--', linewidth=2)
axes[0, 0].set_title('Line Plot')
axes[0, 0].set_xlabel('X')
axes[0, 0].set_ylabel('Y')
axes[0, 0].grid(True, alpha=0.3)
axes[0, 0].legend(['sin(x)', 'cos(x)'])

# Subplot 2: Scatter plot
x_scatter = np.random.randn(100)
y_scatter = 2 * x_scatter + np.random.randn(100) * 0.5
axes[0, 1].scatter(x_scatter, y_scatter, alpha=0.6, c=x_scatter, cmap='viridis')
axes[0, 1].set_title('Scatter Plot')
axes[0, 1].set_xlabel('X')
axes[0, 1].set_ylabel('Y')
axes[0, 1].grid(True, alpha=0.3)

# Subplot 3: Histogram
axes[1, 0].hist(data, bins=30, color='green', alpha=0.7, edgecolor='black')
axes[1, 0].axvline(data.mean(), color='red', linestyle='--', linewidth=2)
axes[1, 0].set_title('Histogram')
axes[1, 0].set_xlabel('Value')
axes[1, 0].set_ylabel('Frequency')
axes[1, 0].grid(True, alpha=0.3, axis='y')

# Subplot 4: Bar chart
bars = axes[1, 1].bar(categories, values, color='orange', edgecolor='black', alpha=0.7)
axes[1, 1].set_title('Bar Chart')
axes[1, 1].set_xlabel('Category')
axes[1, 1].set_ylabel('Value')
axes[1, 1].grid(True, alpha=0.3, axis='y')

# Add value labels on bars
for bar in bars:
    height = bar.get_height()
    axes[1, 1].text(bar.get_x() + bar.get_width()/2., height,
                   f'{height}', ha='center', va='bottom')

# Overall title
plt.suptitle('Data Visualization Dashboard', fontsize=16, fontweight='bold')
plt.tight_layout()
plt.show()
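
When every panel shows the same kind of data, indexing `axes[row, col]` by hand becomes repetitive. The sketch below loops over `axes.flat` and uses `sharex`/`sharey` so that all panels use the same scale. The frequencies are illustrative values, not taken from the notebook.

```python
import matplotlib.pyplot as plt
import numpy as np

fig, axes = plt.subplots(2, 2, figsize=(8, 6), sharex=True, sharey=True)
x = np.linspace(0, 10, 100)
frequencies = [1, 2, 3, 4]  # illustrative values

# axes is a 2x2 NumPy array; .flat iterates over it row by row
for ax, freq in zip(axes.flat, frequencies):
    ax.plot(x, np.sin(freq * x))
    ax.set_title(f"sin({freq}x)")
    ax.grid(True, alpha=0.3)

# Label only the outer edges, since shared inner tick labels are hidden
for ax in axes[-1, :]:
    ax.set_xlabel("x")
for ax in axes[:, 0]:
    ax.set_ylabel("y")

fig.tight_layout()
```

Sharing axes makes the panels directly comparable. For panels with unrelated units, as in the dashboard above, independent axes are the better choice.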

### Complex Subplot Layouts with GridSpec

In [None]:
import matplotlib.gridspec as gridspec

# Create figure
fig = plt.figure(figsize=(15, 10))
gs = gridspec.GridSpec(3, 3, figure=fig, hspace=0.3, wspace=0.3)

# Large plot spanning 2x2
ax1 = fig.add_subplot(gs[0:2, 0:2])
x = np.linspace(0, 10, 100)
ax1.plot(x, np.sin(x), label='sin(x)')
ax1.plot(x, np.cos(x), label='cos(x)')
ax1.plot(x, np.sin(x) * np.cos(x), label='sin(x)cos(x)')
ax1.set_title('Main Plot - Trigonometric Functions')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Top right plot
ax2 = fig.add_subplot(gs[0, 2])
ax2.bar(['A', 'B', 'C'], [30, 50, 40], color=['red', 'green', 'blue'])
ax2.set_title('Bar Chart')
ax2.set_ylabel('Value')

# Middle right plot
ax3 = fig.add_subplot(gs[1, 2])
data = np.random.randn(100)
ax3.hist(data, bins=20, color='purple', alpha=0.7)
ax3.set_title('Distribution')
ax3.set_xlabel('Value')

# Bottom plots
ax4 = fig.add_subplot(gs[2, 0])
ax4.pie([30, 25, 20, 25], labels=['Q1', 'Q2', 'Q3', 'Q4'], autopct='%1.1f%%')
ax4.set_title('Quarterly Distribution')

ax5 = fig.add_subplot(gs[2, 1])
x = np.random.randn(50)
y = x + np.random.randn(50) * 0.5
ax5.scatter(x, y, alpha=0.6)
ax5.set_title('Scatter Plot')
ax5.set_xlabel('X')
ax5.set_ylabel('Y')

ax6 = fig.add_subplot(gs[2, 2])
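ax6.boxplot([np.random.normal(100, 10, 100),
            np.random.normal(110, 15, 100),
            np.random.normal(90, 20, 100)])
# Set tick labels separately: the `labels` keyword is deprecated in newer Matplotlib releases
ax6.set_xticklabels(['Group A', 'Group B', 'Group C'])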
ax6.set_title('Box Plot Comparison')
ax6.set_ylabel('Value')

plt.suptitle('Complex Dashboard Layout with GridSpec', fontsize=16, fontweight='bold')
plt.show()
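
A similar layout can also be described with `plt.subplot_mosaic`, which is available in recent Matplotlib releases (3.3 and later, to the best of my knowledge). The sketch below is an alternative to the GridSpec code above, not a replacement for it. The panel names and data are assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Each string names one Axes; repeating a name makes that Axes span several cells
layout = [
    ["main", "main", "bar"],
    ["main", "main", "hist"],
    ["pie", "scatter", "box"],
]
fig, axd = plt.subplot_mosaic(layout, figsize=(12, 8))

rng = np.random.default_rng(0)  # illustrative data
x = np.linspace(0, 10, 100)
axd["main"].plot(x, np.sin(x), label="sin(x)")
axd["main"].legend()
axd["bar"].bar(["A", "B", "C"], [30, 50, 40])
axd["hist"].hist(rng.normal(size=100), bins=20)
axd["pie"].pie([30, 25, 20, 25], labels=["Q1", "Q2", "Q3", "Q4"])
axd["scatter"].scatter(rng.normal(size=50), rng.normal(size=50), alpha=0.6)
axd["box"].boxplot([rng.normal(100, 10, 100), rng.normal(110, 15, 100)])

# The returned dictionary maps each name to its Axes
for name, ax in axd.items():
    ax.set_title(name)
fig.tight_layout()
```

The nested list works as a picture of the dashboard, which can be easier to read than slice expressions such as `gs[0:2, 0:2]`. GridSpec remains the more flexible tool when spacing and ratios need fine control.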

## Part 6: Customization and Styling

### Color Maps and Styles

In [None]:
# Available styles
print("Available styles:")
print(plt.style.available[:10])  # Show first 10 styles

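# Create plots with different styles
# Note: the 'seaborn' style was renamed 'seaborn-v0_8' in Matplotlib 3.6
seaborn_style = 'seaborn-v0_8' if 'seaborn-v0_8' in plt.style.available else 'seaborn'
styles = ['default', seaborn_style, 'ggplot', 'bmh']
fig = plt.figure(figsize=(12, 10))

for i, style in enumerate(styles, start=1):
    # Most style settings are read when an Axes is created,
    # so each subplot is added inside its style context
    with plt.style.context(style):
        ax = fig.add_subplot(2, 2, i)
        x = np.linspace(0, 10, 100)
        ax.plot(x, np.sin(x), label='sin(x)')
        ax.plot(x, np.cos(x), label='cos(x)')
        ax.plot(x, np.sin(x) * 0.5, label='0.5*sin(x)')
        ax.set_title(f'Style: {style}')
        ax.legend()
        ax.grid(True)

plt.suptitle('Comparison of Different Matplotlib Styles', fontsize=14)
plt.tight_layout()
plt.show()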
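
A style sheet is a named bundle of `rcParams` settings. When only one or two settings need to change, `plt.rc_context` applies them temporarily and restores the previous values on exit. The sketch below illustrates this with arbitrary example values.

```python
import matplotlib.pyplot as plt

original_width = plt.rcParams["lines.linewidth"]

# Settings apply only inside the with-block
with plt.rc_context({"lines.linewidth": 4.0, "axes.grid": True}):
    fig, ax = plt.subplots()
    (line,) = ax.plot([0, 1, 2], [0, 1, 4])
    width_inside = plt.rcParams["lines.linewidth"]

width_after = plt.rcParams["lines.linewidth"]

print("inside the context:", width_inside)
print("after the context:", width_after)
print("width of the line drawn inside:", line.get_linewidth())
```

The line keeps the width it was created with, while later plots fall back to the defaults. Axes and artists read these settings when they are created, so they need to be created inside the block.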

### Custom Colors and Colormaps

In [None]:
# Demonstrate different colormaps
fig, axes = plt.subplots(2, 3, figsize=(15, 10))

# Generate data
x = np.linspace(0, 10, 20)
y = np.linspace(0, 10, 20)
X, Y = np.meshgrid(x, y)
Z = np.sin(X) * np.cos(Y)

# Different colormaps
cmaps = ['viridis', 'plasma', 'coolwarm', 'RdYlBu', 'rainbow', 'twilight']

for ax, cmap in zip(axes.flat, cmaps):
    im = ax.contourf(X, Y, Z, levels=20, cmap=cmap)
    ax.set_title(f'Colormap: {cmap}')
    plt.colorbar(im, ax=ax)

plt.suptitle('Different Colormaps for 2D Data', fontsize=14)
plt.tight_layout()
plt.show()
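
Colormaps are not limited to 2D data. A colormap object is callable: given numbers between 0 and 1, it returns RGBA colors, which is a convenient way to give a family of lines related colors. The sketch below samples five colors from `viridis`; the data and labels are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np

cmap = plt.get_cmap("viridis")
positions = np.linspace(0, 1, 5)
colors = cmap(positions)  # array of shape (5, 4): one RGBA row per position

fig, ax = plt.subplots()
x = np.linspace(0, 10, 100)
for k, color in enumerate(colors, start=1):
    ax.plot(x, np.sin(x) / k, color=color, label=f"sin(x)/{k}")
ax.legend()

print(colors.round(2))
```

Sequential maps such as `viridis` suit ordered series. For data that diverges around a midpoint, a diverging map such as `coolwarm` is usually the clearer choice.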

### Annotations and Text

In [None]:
# Create figure with annotations
fig, ax = plt.subplots(figsize=(12, 8))

# Generate data
x = np.linspace(0, 10, 100)
y = np.sin(x) * np.exp(-x/10)

# Plot
ax.plot(x, y, 'b-', linewidth=2, label='Damped Sine Wave')

# Find maximum
max_idx = np.argmax(y)
max_x, max_y = x[max_idx], y[max_idx]

# Annotations
ax.annotate('Maximum', xy=(max_x, max_y), xytext=(max_x+2, max_y+0.2),
            arrowprops=dict(arrowstyle='->', color='red', lw=2),
            fontsize=12, color='red', fontweight='bold')

# Text box
textstr = f'Maximum at x={max_x:.2f}\ny={max_y:.3f}'
props = dict(boxstyle='round', facecolor='wheat', alpha=0.8)
ax.text(0.05, 0.95, textstr, transform=ax.transAxes, fontsize=11,
        verticalalignment='top', bbox=props)

# Shaded region
ax.fill_between(x, 0, y, where=(x > 2) & (x < 4), 
                alpha=0.3, color='green', label='Area of Interest')

# Mathematical expression
ax.text(6, 0.3, r'$y = \sin(x) \cdot e^{-x/10}$', fontsize=14,
        bbox=dict(boxstyle='round', facecolor='lightblue', alpha=0.5))

# Horizontal and vertical lines
ax.axhline(y=0, color='k', linewidth=0.5)
ax.axvline(x=max_x, color='red', linewidth=1, linestyle='--', alpha=0.5)

ax.set_xlabel('Time (s)', fontsize=12)
ax.set_ylabel('Amplitude', fontsize=12)
ax.set_title('Annotated Plot with Text and Shading', fontsize=14, fontweight='bold')
ax.legend(loc='upper right')
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
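
The cell above mixes two coordinate systems. The equation is placed in data coordinates (`6, 0.3`), while the text box uses `transform=ax.transAxes`, where `(0, 0)` is the bottom-left and `(1, 1)` the top-right of the Axes. The sketch below illustrates the difference by changing the x-limits and checking which position moves on screen. The variable names are assumptions for the example.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 10], [0, 1])
ax.set_xlim(0, 10)

ax.text(5, 0.5, "data coords")  # follows the data
ax.text(0.05, 0.95, "axes coords", transform=ax.transAxes, va="top")  # pinned to the corner

# Pixel positions before changing the view
data_px_before = ax.transData.transform((5, 0.5))
axes_px_before = ax.transAxes.transform((0.05, 0.95))

ax.set_xlim(0, 100)  # zoom out horizontally

data_px_after = ax.transData.transform((5, 0.5))
axes_px_after = ax.transAxes.transform((0.05, 0.95))

print("data-coordinate text moved:", data_px_after[0] < data_px_before[0])
print("axes-coordinate text stayed:", bool((axes_px_before == axes_px_after).all()))
```

Axes coordinates suit legends, statistics boxes, and other labels that should stay put. Data coordinates suit anything that marks a specific point on the curve.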

## Part 7: Integration with Pandas

### Plotting DataFrames

In [None]:
# Create sample DataFrame
dates = pd.date_range('2024-01-01', periods=100)
df = pd.DataFrame({
    'Date': dates,
    'Sales': np.random.randn(100).cumsum() + 100,
    'Costs': np.random.randn(100).cumsum() + 80,
    'Profit': np.random.randn(100).cumsum() + 20
})
df.set_index('Date', inplace=True)

# Create comprehensive visualization
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Line plot
df.plot(ax=axes[0, 0], linewidth=2)
axes[0, 0].set_title('Time Series Data')
axes[0, 0].set_ylabel('Value')
axes[0, 0].grid(True, alpha=0.3)

# Area plot
df.plot.area(ax=axes[0, 1], alpha=0.6)
axes[0, 1].set_title('Stacked Area Plot')
axes[0, 1].set_ylabel('Value')

# Box plot
df.plot.box(ax=axes[1, 0])
axes[1, 0].set_title('Distribution Comparison')
axes[1, 0].set_ylabel('Value')

# Correlation heatmap
corr = df.corr()
im = axes[1, 1].imshow(corr, cmap='coolwarm', aspect='auto', vmin=-1, vmax=1)
axes[1, 1].set_xticks(range(len(corr.columns)))
axes[1, 1].set_yticks(range(len(corr.columns)))
axes[1, 1].set_xticklabels(corr.columns)
axes[1, 1].set_yticklabels(corr.columns)
axes[1, 1].set_title('Correlation Heatmap')

# Add correlation values
for i in range(len(corr.columns)):
    for j in range(len(corr.columns)):
        axes[1, 1].text(j, i, f'{corr.iloc[i, j]:.2f}',
                       ha='center', va='center', color='white' if abs(corr.iloc[i, j]) > 0.5 else 'black')

plt.colorbar(im, ax=axes[1, 1])
plt.suptitle('Pandas DataFrame Visualization', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()
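
Pandas plotting methods return ordinary Matplotlib Axes, so everything covered earlier can be applied to them. The sketch below calls `DataFrame.plot` without passing `ax` and then customizes the returned Axes. The DataFrame here is a small illustrative one, not the notebook's `df`.

```python
import matplotlib.axes
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)  # illustrative data
demo = pd.DataFrame(
    {
        "Sales": rng.normal(size=30).cumsum() + 100,
        "Costs": rng.normal(size=30).cumsum() + 80,
    },
    index=pd.date_range("2024-01-01", periods=30),
)

ax = demo.plot(figsize=(8, 4), linewidth=2)  # returns a Matplotlib Axes
ax.set_ylabel("Value")
ax.set_title("Customizing a pandas plot")
ax.axhline(demo["Sales"].mean(), color="gray", linestyle="--", label="Mean sales")
ax.legend()

print(type(ax))
```

Passing `ax=...`, as in the cell above, goes the other way: Matplotlib creates the layout and pandas draws into it.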

## Summary and Next Steps

### What We've Learned

Congratulations! You've mastered the fundamentals of data visualization with Matplotlib:

1. **Basic Plotting**: Creating line plots, understanding plot anatomy
2. **Scatter Plots**: Visualizing relationships, correlations, and patterns
3. **Histograms**: Understanding data distributions and statistics
4. **Bar Charts**: Comparing categorical data with various layouts
5. **Multiple Subplots**: Creating comprehensive dashboards
6. **Customization**: Styling, colors, annotations, and text
7. **Integration**: Working with Pandas DataFrames

### Key Takeaways

- **Choose the right plot**: Different visualizations serve different purposes
- **Less is more**: Avoid cluttering plots with unnecessary elements
- **Color matters**: Use color purposefully to highlight important information
- **Label everything**: Always include titles, axis labels, and legends
- **Consider your audience**: Tailor visualizations to your viewers

### Advanced Topics to Explore

1. **3D Plotting**: Using `mpl_toolkits.mplot3d`
2. **Animations**: Creating dynamic visualizations
3. **Interactive Plots**: Using widgets and event handling
4. **Seaborn**: Statistical plotting built on Matplotlib
5. **Plotly**: Interactive web-based visualizations
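
As a starting point for the first topic, the sketch below draws a simple 3D surface. It assumes a Matplotlib version in which `projection="3d"` can be requested without importing `mpl_toolkits.mplot3d` explicitly (3.2 or later, to the best of my knowledge). The function being plotted is an arbitrary example.

```python
import matplotlib.pyplot as plt
import numpy as np

# Build a grid and evaluate an example function on it
x = np.linspace(-3, 3, 40)
y = np.linspace(-3, 3, 40)
X, Y = np.meshgrid(x, y)
Z = np.sin(np.sqrt(X ** 2 + Y ** 2))

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(projection="3d")  # creates a 3D Axes
surface = ax.plot_surface(X, Y, Z, cmap="viridis")

ax.set_xlabel("X")
ax.set_ylabel("Y")
ax.set_zlabel("Z")
ax.set_title("Example 3D surface")
fig.colorbar(surface, ax=ax, shrink=0.6, label="Z value")
```

The same `meshgrid` arrays are used for `contourf` in the colormap section, so a 3D surface and a filled contour plot are two views of the same gridded data.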

### Final Challenge

Create a comprehensive data analysis dashboard that:
1. Loads a dataset (real or generated)
2. Performs exploratory data analysis
3. Creates at least 6 different plot types
4. Uses custom styling and annotations
5. Tells a story with the data

In [None]:
# Final Challenge: Comprehensive Data Dashboard
# Your implementation here:

def create_analysis_dashboard(data=None):
    """
    Create a comprehensive analysis dashboard.
    
    Requirements:
    1. Generate or load dataset
    2. Create multiple visualizations
    3. Include statistical analysis
    4. Add annotations and insights
    5. Use professional styling
    """
    # TODO: Implement your dashboard
    pass

# Uncomment to test:
# create_analysis_dashboard()

---

## Resources for Further Learning

- **Official Matplotlib Documentation**: https://matplotlib.org/stable/contents.html
- **Matplotlib Gallery**: https://matplotlib.org/stable/gallery/index.html
- **Matplotlib Tutorials**: https://matplotlib.org/stable/tutorials/index.html
- **Python Graph Gallery**: https://python-graph-gallery.com/
- **Seaborn (Statistical Plotting)**: https://seaborn.pydata.org/
- **Plotly (Interactive Plots)**: https://plotly.com/python/

Remember: Visualization is both an art and a science. The more you practice, the better you'll become at creating meaningful visual insights!

Happy Plotting! 📊📈📉