Automated data visualization for EDA, model performance, and business reporting. Activates for "visualize data", "create plots", "EDA", "exploratory analysis", "confusion matrix", "ROC curve", "feature distribution", "correlation heatmap", "plot results", "dashboard". Generates publication-quality visualizations integrated with SpecWeave increments.
Inherits all available tools
Additional assets for this skill
This skill inherits all available tools. When active, it can use any tool Claude has access to.
Automated visualization generation for exploratory data analysis, model performance reporting, and stakeholder communication. Creates publication-quality plots, interactive dashboards, and business-friendly reports—all integrated with SpecWeave's increment workflow.
Automated EDA Report:
from specweave import EDAVisualizer
visualizer = EDAVisualizer(increment="0042")
# Generates comprehensive EDA report
report = visualizer.generate_eda_report(df)
# Creates:
# - Dataset overview (rows, columns, memory, missing values)
# - Numerical feature distributions (histograms + KDE)
# - Categorical feature counts (bar charts)
# - Correlation heatmap
# - Missing value pattern
# - Outlier detection plots
# - Feature relationships (pairplot for top features)
Individual EDA Plots:
# Distribution plots
visualizer.plot_distribution(
data=df['age'],
title="Age Distribution",
bins=30
)
# Correlation heatmap
visualizer.plot_correlation_heatmap(
data=df[numerical_columns],
method='pearson' # or 'spearman', 'kendall'
)
# Missing value patterns
visualizer.plot_missing_values(df)
# Outlier detection (boxplots)
visualizer.plot_outliers(df[numerical_columns])
Classification Performance:
from specweave import ClassificationVisualizer
viz = ClassificationVisualizer(increment="0042")
# Confusion matrix
viz.plot_confusion_matrix(
y_true=y_test,
y_pred=y_pred,
classes=['Negative', 'Positive']
)
# ROC curve
viz.plot_roc_curve(
y_true=y_test,
y_proba=y_proba
)
# Precision-Recall curve
viz.plot_precision_recall_curve(
y_true=y_test,
y_proba=y_proba
)
# Learning curves (train vs val)
viz.plot_learning_curve(
train_scores=train_scores,
val_scores=val_scores
)
# Calibration curve (are probabilities well-calibrated?)
viz.plot_calibration_curve(
y_true=y_test,
y_proba=y_proba
)
Regression Performance:
from specweave import RegressionVisualizer
viz = RegressionVisualizer(increment="0042")
# Predicted vs Actual
viz.plot_predictions(
y_true=y_test,
y_pred=y_pred
)
# Residual plot
viz.plot_residuals(
y_true=y_test,
y_pred=y_pred
)
# Residual distribution (should be normal)
viz.plot_residual_distribution(
residuals=y_test - y_pred
)
# Error by feature value
viz.plot_error_analysis(
y_true=y_test,
y_pred=y_pred,
features=X_test
)
Feature Importance:
from specweave import FeatureVisualizer
viz = FeatureVisualizer(increment="0042")
# Feature importance (bar chart)
viz.plot_feature_importance(
feature_names=feature_names,
importances=model.feature_importances_,
top_n=20
)
# SHAP summary plot
viz.plot_shap_summary(
shap_values=shap_values,
features=X_test
)
# Partial dependence plots
viz.plot_partial_dependence(
model=model,
features=['age', 'income'],
X=X_train
)
# Feature interaction
viz.plot_feature_interaction(
model=model,
features=('age', 'income'),
X=X_train
)
Time Series Plots:
from specweave import TimeSeriesVisualizer
viz = TimeSeriesVisualizer(increment="0042")
# Time series with trend
viz.plot_timeseries(
data=sales_data,
show_trend=True
)
# Seasonal decomposition
viz.plot_seasonal_decomposition(
data=sales_data,
period=12 # Monthly seasonality
)
# Autocorrelation (ACF, PACF)
viz.plot_autocorrelation(data=sales_data)
# Forecast with confidence intervals
viz.plot_forecast(
actual=test_data,
forecast=forecast,
confidence_intervals=(0.80, 0.95)
)
Compare Multiple Models:
from specweave import ModelComparisonVisualizer
viz = ModelComparisonVisualizer(increment="0042")
# Compare metrics across models
viz.plot_model_comparison(
models=['Baseline', 'XGBoost', 'LightGBM', 'Neural Net'],
metrics={
'accuracy': [0.65, 0.87, 0.86, 0.85],
'roc_auc': [0.70, 0.92, 0.91, 0.90],
'training_time': [1, 45, 32, 320]
}
)
# ROC curves for multiple models
viz.plot_roc_curves_comparison(
models_predictions={
'XGBoost': (y_test, y_proba_xgb),
'LightGBM': (y_test, y_proba_lgbm),
'Neural Net': (y_test, y_proba_nn)
}
)
Plotly Integration:
from specweave import InteractiveVisualizer
viz = InteractiveVisualizer(increment="0042")
# Interactive scatter plot (zoom, pan, hover)
viz.plot_interactive_scatter(
x=X_test[:, 0],
y=X_test[:, 1],
colors=y_pred,
hover_data=df[['id', 'amount', 'merchant']]
)
# Interactive confusion matrix (click for details)
viz.plot_interactive_confusion_matrix(
y_true=y_test,
y_pred=y_pred
)
# Interactive feature importance (sortable, filterable)
viz.plot_interactive_feature_importance(
feature_names=feature_names,
importances=importances
)
Automated ML Report:
from specweave import MLReportGenerator
generator = MLReportGenerator(increment="0042")
# Generate executive summary report
report = generator.generate_report(
model=model,
test_data=(X_test, y_test),
business_metrics={
'false_positive_cost': 5,
'false_negative_cost': 500
}
)
# Creates:
# - Executive summary (1 page, non-technical)
# - Key metrics (accuracy, precision, recall)
# - Business impact ($$ saved, ROI)
# - Model performance visualizations
# - Recommendations
# - Technical appendix
Report Output (HTML/PDF):
# Fraud Detection Model - Executive Summary
## Key Results
- **Accuracy**: 87% (target: >85%) ✅
- **Fraud Detection Rate**: 62% (catching 310 frauds/day)
- **False Positive Rate**: 38% (190 false alarms/day)
## Business Impact
- **Fraud Prevented**: $155,000/day
- **Review Cost**: $950/day (190 transactions × $5)
- **Net Benefit**: $154,050/day ✅
- **Annual Savings**: $56.2M
## Model Performance
[Confusion Matrix Visualization]
[ROC Curve]
[Feature Importance]
## Recommendations
1. ✅ Deploy to production immediately
2. Monitor fraud patterns weekly
3. Retrain model monthly with new data
Real-Time Dashboard:
from specweave import DashboardCreator
creator = DashboardCreator(increment="0042")
# Create Grafana/Plotly dashboard
dashboard = creator.create_dashboard(
title="Model Performance Dashboard",
panels=[
{'type': 'metric', 'query': 'prediction_latency_p95'},
{'type': 'metric', 'query': 'predictions_per_second'},
{'type': 'timeseries', 'query': 'accuracy_over_time'},
{'type': 'timeseries', 'query': 'error_rate'},
{'type': 'heatmap', 'query': 'prediction_distribution'},
{'type': 'table', 'query': 'recent_anomalies'}
]
)
# Exports to Grafana JSON or Plotly Dash app
dashboard.export(format='grafana')
# Set consistent styling
visualizer.set_style(
style='seaborn', # Or 'ggplot', 'fivethirtyeight'
context='paper', # Or 'notebook', 'talk', 'poster'
palette='colorblind' # Accessible colors
)
# High-resolution exports
visualizer.save_figure(
filename='model_performance.png',
dpi=300, # Publication quality
bbox_inches='tight'
)
# Colorblind-friendly palettes
visualizer.use_colorblind_palette()
# Add alt text for accessibility
visualizer.add_alt_text(
plot=fig,
description="Confusion matrix showing 87% accuracy"
)
# High contrast for presentations
visualizer.set_high_contrast_mode()
# Add reference lines
viz.add_reference_line(
y=0.85, # Target accuracy
label='Target',
color='red',
linestyle='--'
)
# Add annotations
viz.annotate_point(
x=optimal_threshold,
y=optimal_f1,
text='Optimal threshold: 0.47'
)
# All visualizations auto-saved to increment folder
visualizer = EDAVisualizer(increment="0042")
# Creates:
# .specweave/increments/0042-fraud-detection/
# ├── visualizations/
# │ ├── eda/
# │ │ ├── distributions.png
# │ │ ├── correlation_heatmap.png
# │ │ └── missing_values.png
# │ ├── model_performance/
# │ │ ├── confusion_matrix.png
# │ │ ├── roc_curve.png
# │ │ ├── precision_recall.png
# │ │ └── learning_curves.png
# │ ├── feature_analysis/
# │ │ ├── feature_importance.png
# │ │ ├── shap_summary.png
# │ │ └── partial_dependence/
# │ └── reports/
# │ ├── executive_summary.html
# │ └── technical_report.pdf
/sw:sync-docs update
Updates:
<!-- .specweave/docs/internal/architecture/ml-model-performance.md -->
## Fraud Detection Model Performance (Increment 0042)
### Model Accuracy

### Key Metrics
- Accuracy: 87%
- Precision: 85%
- Recall: 62%
- ROC AUC: 0.92
### Feature Importance

Top 5 features:
1. amount_vs_user_average (0.18)
2. days_since_last_purchase (0.12)
3. merchant_risk_score (0.10)
4. velocity_24h (0.08)
5. location_distance_from_home (0.07)
# Generate EDA report
/ml:visualize-eda 0042
# Generate model performance report
/ml:visualize-performance 0042
# Create interactive dashboard
/ml:create-dashboard 0042
# Export all visualizations
/ml:export-visualizations 0042 --format png,pdf,html
# Generate full increment report with all visualizations
generator = IncrementReportGenerator(increment="0042")
report = generator.generate_full_report()
# Includes:
# - EDA visualizations
# - Experiment comparisons
# - Best model performance
# - Feature importance
# - Business impact
# - Deployment readiness
# Create reusable templates
template = VisualizationTemplate(name="fraud_analysis")
template.add_panel("confusion_matrix")
template.add_panel("roc_curve")
template.add_panel("top_fraud_features")
template.add_panel("fraud_trends_over_time")
# Apply to any increment
template.apply(increment="0042")
# Track visualization changes across model versions
viz_tracker = VisualizationTracker(increment="0042")
# Compare model v1 vs v2 visualizations
viz_tracker.compare_versions(
version_1="model-v1",
version_2="model-v2"
)
# Shows: Confusion matrix improved, ROC curve comparison, etc.
Data visualization is critical for:
This skill automates visualization generation, ensuring all ML work is visual, accessible, and business-friendly within SpecWeave's increment workflow.