From claude-data-analyst
Generate a data dictionary for a dataset, combining automatic profiling with the user's description of what the data represents. Use when the user wants documentation of columns — names, types, semantic meaning, units, allowed values, and nullability — for a CSV/Parquet/Excel file.
Slash command
/claude-data-analyst:data-dictionary-creator
Produce a data dictionary by merging schema inspection with the user's semantic description of the dataset.
markdown default, csv, or json).
- duckdb -c "DESCRIBE SELECT * FROM '<file>'": fast schema + inferred types.
- csvstat: null counts, uniqueness, min/max per column.
- uv run --with pandas python -c '...': for dtype coercion and sampling.
For each column, collect:
Parse the user's description and map sentences to columns. For each column, fill:
pii-flag heuristics)
If a column isn't covered by the user's description, mark the Description field as [NEEDS REVIEW] rather than guessing, and list these columns at the end for user confirmation.
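The profile-then-merge step could be sketched roughly as follows. This is a minimal illustration that uses only the Python standard library in place of the duckdb, csvstat, and pandas tooling named above; the function names `profile_columns` and `merge_descriptions`, the `descriptions` mapping, and the profile keys are hypothetical, not part of the skill.

```python
import csv
import io

NEEDS_REVIEW = "[NEEDS REVIEW]"


def profile_columns(csv_text):
    """Collect simple per-column facts: nullability, null count, distinct count, one sample."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    profile = {}
    for name in reader.fieldnames or []:
        values = [row[name] for row in rows]
        # Treat empty strings and missing trailing cells as nulls (an assumption).
        non_null = [v for v in values if v not in ("", None)]
        profile[name] = {
            "nullable": len(non_null) < len(values),
            "null_count": len(values) - len(non_null),
            "distinct": len(set(non_null)),
            "sample": non_null[0] if non_null else None,
        }
    return profile


def merge_descriptions(profile, descriptions):
    """Attach user-supplied descriptions; flag uncovered columns instead of guessing."""
    entries = []
    for name, stats in profile.items():
        entry = {"name": name, **stats}
        entry["description"] = descriptions.get(name, NEEDS_REVIEW)
        entries.append(entry)
    needs_review = [e["name"] for e in entries if e["description"] == NEEDS_REVIEW]
    return entries, needs_review
```

Calling `merge_descriptions(profile_columns(text), {"order_id": "Unique order identifier"})` returns the dictionary entries plus the list of column names still awaiting a description from the user.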
At the top of the dictionary:
Default — write <dataset>-dictionary.md:
# Data Dictionary — <dataset name>
## Overview
...
## Columns
### `column_name`
- **Type**: ...
- **Description**: ...
- **Unit**: ...
- **Nullable**: ...
- **Allowed values**: ...
- **Sample**: ...
- **Notes**: ...
For csv output, flatten to one row per column with standard dictionary columns. For json, emit a structured schema object compatible with JSON Schema / Frictionless Data.
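One possible shape for the two alternative outputs is sketched below. The CSV column list is an assumption derived from the fields of the markdown template, since the text does not enumerate the "standard dictionary columns". The JSON output follows the Frictionless Table Schema layout of a `fields` array with `name`, `type`, `description`, and optional `constraints`. The function names, the entry keys, and the `|` separator for allowed values are hypothetical.

```python
import csv
import io
import json

# Assumed column set, mirroring the markdown template's fields.
DICTIONARY_COLUMNS = [
    "name", "type", "description", "unit",
    "nullable", "allowed_values", "sample", "notes",
]


def to_csv(entries):
    """Flatten dictionary entries to one CSV row per column."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=DICTIONARY_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for entry in entries:
        row = {key: entry.get(key, "") for key in DICTIONARY_COLUMNS}
        allowed = row["allowed_values"]
        if isinstance(allowed, (list, tuple)):
            # Join list values into a single cell (separator is an arbitrary choice).
            row["allowed_values"] = "|".join(str(v) for v in allowed)
        writer.writerow(row)
    return buf.getvalue()


def to_table_schema(entries):
    """Emit a Table Schema style JSON document with one field per column."""
    fields = []
    for entry in entries:
        field = {
            "name": entry["name"],
            "type": entry.get("type", "any"),
            "description": entry.get("description", ""),
        }
        constraints = {}
        if not entry.get("nullable", True):
            constraints["required"] = True
        if entry.get("allowed_values"):
            constraints["enum"] = list(entry["allowed_values"])
        if constraints:
            field["constraints"] = constraints
        fields.append(field)
    return json.dumps({"fields": fields}, indent=2)
```

Unit, sample, and notes have no direct Table Schema equivalent in this sketch, so they appear only in the CSV form.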
End with a [NEEDS REVIEW] section listing columns the user should clarify.
npx claudepluginhub danielrosehill/claude-code-plugins --plugin claude-data-analyst