sankey-diagram-creator

安装量: 44
排名: #16737

安装

npx skills add https://github.com/dkyazzentwatwa/chatgpt-skills --skill sankey-diagram-creator

Create interactive Sankey diagrams to visualize flows, transfers, and relationships between categories. Perfect for budget flows, energy transfers, user journeys, and data pipelines.

Quick Start

from scripts.sankey_creator import SankeyCreator

# From dictionary
sankey = SankeyCreator()
sankey.from_dict({
    'source': ['A', 'A', 'B', 'B'],
    'target': ['C', 'D', 'C', 'D'],
    'value': [10, 20, 15, 25]
})
sankey.generate().save_html("flow.html")

# From CSV
sankey = SankeyCreator()
sankey.from_csv("flows.csv", source="from", target="to", value="amount")
sankey.title("Budget Flow").generate().save_html("budget.html")

Features

  • Multiple Input Sources: Dict, DataFrame, or CSV files

  • Interactive Output: Hover tooltips with flow details

  • Node Customization: Colors, labels, positions

  • Link Styling: Colors, opacity, labels

  • Export Formats: HTML (interactive), PNG, SVG

  • Auto-Coloring: Automatic color assignment or custom schemes

API Reference

Initialization

sankey = SankeyCreator()

Data Input Methods

# From dictionary
sankey.from_dict({
    'source': ['A', 'A', 'B'],
    'target': ['X', 'Y', 'X'],
    'value': [10, 20, 15]
})

# From CSV file
sankey.from_csv(
    filepath="data.csv",
    source="source_col",
    target="target_col",
    value="value_col",
    label=None  # Optional: column for link labels
)

# From pandas DataFrame
import pandas as pd
df = pd.DataFrame({
    'from': ['A', 'B'],
    'to': ['C', 'C'],
    'amount': [100, 200]
})
sankey.from_dataframe(df, source="from", target="to", value="amount")

# Add individual flows
sankey.add_flow("Source", "Target", 50)
sankey.add_flow("Source", "Other", 30, label="30 units")

Styling Methods

All methods return self for chaining.

# Node colors
sankey.node_colors({
    'Income': '#2ecc71',
    'Expenses': '#e74c3c',
    'Savings': '#3498db'
})

# Use colormap for automatic colors
sankey.node_colors(colormap='Pastel1')

# Link colors (by source, target, or custom)
sankey.link_colors(by='source')  # Color by source node
sankey.link_colors(by='target')  # Color by target node
sankey.link_colors(opacity=0.6)  # Set link opacity

# Custom link colors
sankey.link_colors(colors={
    ('Income', 'Expenses'): '#ff6b6b',
    ('Income', 'Savings'): '#4ecdc4'
})

# Title
sankey.title("My Flow Diagram")
sankey.title("Budget Overview", font_size=20)

# Layout
sankey.layout(orientation='h')  # Horizontal (default)
sankey.layout(orientation='v')  # Vertical
sankey.layout(pad=20)  # Node padding

Generation and Export

# Generate the diagram
sankey.generate()

# Save as interactive HTML
sankey.save_html("diagram.html")
sankey.save_html("diagram.html", auto_open=True)  # Open in browser

# Save as static image
sankey.save_image("diagram.png")
sankey.save_image("diagram.svg", format='svg')
sankey.save_image("diagram.png", width=1200, height=800)

# Get Plotly figure object for customization
fig = sankey.get_figure()
fig.update_layout(...)

# Show in notebook/browser
sankey.show()

Data Format

CSV Format

source,target,value,label
Income,Housing,1500,Rent
Income,Food,600,Groceries
Income,Transport,400,Car
Income,Savings,500,Emergency Fund
Savings,Investments,300,Stocks

Dictionary Format

data = {
    'source': ['Income', 'Income', 'Income', 'Expenses'],
    'target': ['Rent', 'Food', 'Savings', 'Tax'],
    'value': [1500, 600, 500, 400],
    'label': ['Housing', 'Groceries', 'Emergency', 'Federal']  # Optional
}

DataFrame Format

import pandas as pd
df = pd.DataFrame({
    'from_node': ['A', 'A', 'B', 'C'],
    'to_node': ['B', 'C', 'D', 'D'],
    'flow_value': [100, 200, 150, 180]
})

Color Schemes

Available Colormaps

| Pastel1 | Soft pastel colors

| Pastel2 | Muted pastels

| Set1 | Bold distinct colors

| Set2 | Muted distinct colors

| Set3 | Light distinct colors

| Paired | Paired color scheme

| viridis | Blue-green-yellow

| plasma | Purple-orange-yellow

Custom Colors

# Hex colors
sankey.node_colors({
    'Category A': '#1abc9c',
    'Category B': '#3498db',
    'Category C': '#9b59b6'
})

# Named colors
sankey.node_colors({
    'Revenue': 'green',
    'Costs': 'red',
    'Profit': 'blue'
})

CLI Usage

# Basic usage
python sankey_creator.py --input flows.csv \
    --source from --target to --value amount \
    --output flow.html

# With title and colors
python sankey_creator.py --input budget.csv \
    --source category --target subcategory --value amount \
    --title "Budget Breakdown" \
    --colormap Pastel1 \
    --output budget.html

# PNG output
python sankey_creator.py --input data.csv \
    --source src --target dst --value val \
    --output diagram.png \
    --width 1200 --height 800

CLI Arguments

| --input | Input CSV file | Required

| --source | Source column name | Required

| --target | Target column name | Required

| --value | Value column name | Required

| --output | Output file path | sankey.html

| --title | Diagram title | None

| --colormap | Color scheme | Pastel1

| --link-opacity | Link opacity (0-1) | 0.5

| --orientation | 'h' or 'v' | h

| --width | Image width (PNG/SVG) | 1000

| --height | Image height (PNG/SVG) | 600

Examples

Budget Flow

sankey = SankeyCreator()
sankey.from_dict({
    'source': ['Salary', 'Salary', 'Salary', 'Salary', 'Savings', 'Investments'],
    'target': ['Housing', 'Food', 'Transport', 'Savings', 'Investments', 'Stocks'],
    'value': [2000, 800, 500, 1000, 700, 700]
})
sankey.title("Monthly Budget Flow")
sankey.node_colors({
    'Salary': '#27ae60',
    'Housing': '#e74c3c',
    'Food': '#f39c12',
    'Transport': '#3498db',
    'Savings': '#9b59b6',
    'Investments': '#1abc9c',
    'Stocks': '#34495e'
})
sankey.generate().save_html("budget_flow.html")

Energy Flow

sankey = SankeyCreator()
sankey.add_flow("Coal", "Electricity", 100)
sankey.add_flow("Gas", "Electricity", 80)
sankey.add_flow("Solar", "Electricity", 30)
sankey.add_flow("Electricity", "Industry", 120)
sankey.add_flow("Electricity", "Residential", 60)
sankey.add_flow("Electricity", "Commercial", 30)

sankey.title("Energy Flow")
sankey.node_colors(colormap='Set2')
sankey.link_colors(by='source', opacity=0.4)
sankey.generate().save_html("energy.html")

User Journey

sankey = SankeyCreator()
sankey.from_dict({
    'source': ['Landing', 'Landing', 'Product', 'Product', 'Cart', 'Cart'],
    'target': ['Product', 'Exit', 'Cart', 'Exit', 'Checkout', 'Exit'],
    'value': [1000, 200, 600, 200, 400, 100]
})
sankey.title("User Journey")
sankey.node_colors({
    'Landing': '#3498db',
    'Product': '#2ecc71',
    'Cart': '#f1c40f',
    'Checkout': '#27ae60',
    'Exit': '#e74c3c'
})
sankey.generate().save_html("journey.html")

Multi-Level Flow

# Three-level hierarchy
data = {
    'source': [
        # Level 1 -> Level 2
        'Revenue', 'Revenue', 'Revenue',
        # Level 2 -> Level 3
        'Products', 'Products', 'Services', 'Services'
    ],
    'target': [
        'Products', 'Services', 'Other',
        'Electronics', 'Software', 'Consulting', 'Support'
    ],
    'value': [500, 300, 50, 300, 200, 200, 100]
}

sankey = SankeyCreator()
sankey.from_dict(data)
sankey.title("Revenue Breakdown")
sankey.generate().save_html("revenue.html")

Dependencies

plotly>=5.15.0
pandas>=2.0.0
kaleido>=0.2.0

Limitations

  • Static image export (PNG/SVG) requires kaleido package

  • Very complex diagrams with many nodes may be hard to read

  • Node positions are auto-calculated (limited manual control)

  • Link colors can only be uniform or by source/target

返回排行榜