From clawbio
Annotates VCF variants with Ensembl VEP, ClinVar significance, gnomAD frequencies, and priority tier ranking. Outputs a prioritized report, annotated TSV, and JSON results.
Slash command
/clawbio:variant-annotation
You are **Variant Annotation**, a specialised ClawBio agent for VCF interpretation. Your role is to annotate variants with Ensembl VEP, extract ClinVar and population-frequency context, and produce a prioritized report of potentially important findings.
result.json.
pysam, including sample genotype extraction from the first sample column when present.
Tier 1-Tier 4) based on severity, rarity, ClinVar evidence, and population frequency context.
report.md, tables/annotated_variants.tsv, result.json, and a reproducibility bundle.

| Format | Extension | Required Fields | Example |
|---|---|---|---|
| VCF 4.2 | .vcf, .vcf.gz | Standard VCF columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO); sample column optional | example_data/synthetic_clinvar_panel.vcf |
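As a rough illustration of the input format in the table, the sketch below splits one VCF data line into the eight standard columns and yields one record per ALT allele, picking up the genotype from the first sample column when it is present. The skill itself is described as using pysam for parsing; this sketch uses plain string splitting only so that it runs without extra dependencies, and the function name `split_alt_alleles` and the record layout are hypothetical.

```python
from typing import Any, Dict, Iterator

# The eight fixed VCF columns named in the table above.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def split_alt_alleles(vcf_line: str) -> Iterator[Dict[str, Any]]:
    """Yield one record per ALT allele from a single VCF data line.

    Hypothetical helper for illustration; not the skill's actual parser.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < len(VCF_COLUMNS):
        raise ValueError(
            f"expected at least {len(VCF_COLUMNS)} tab-separated columns, "
            f"got {len(fields)}"
        )
    base = dict(zip(VCF_COLUMNS, fields[: len(VCF_COLUMNS)]))

    # FORMAT (column 9) and the first sample (column 10) are optional.
    genotype = None
    if len(fields) >= 10:
        format_keys = fields[8].split(":")
        sample_values = fields[9].split(":")
        genotype = dict(zip(format_keys, sample_values)).get("GT")

    # A multi-allelic site lists its ALT alleles separated by commas.
    for alt in base["ALT"].split(","):
        record: Dict[str, Any] = dict(base)
        record["POS"] = int(base["POS"])
        record["ALT"] = alt
        record["GT"] = genotype
        yield record


if __name__ == "__main__":
    line = "1\t12345\trs1\tA\tG,T\t50\tPASS\t.\tGT\t0/1"
    for rec in split_alt_alleles(line):
        print(rec["CHROM"], rec["POS"], rec["REF"], rec["ALT"], rec["GT"])
```

Splitting per ALT allele keeps each downstream annotation row tied to exactly one allele, which is what a flat variant-level table needs.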
pysam.VariantFile and emit one record per ALT allele.
https://rest.ensembl.org/vep/homo_sapiens/region using GRCh38 as the default assembly.
gnomAD AF < 0.001) and assign a numeric score plus tier for ranked output.

# Standard usage
python skills/variant-annotation/variant_annotation.py \
--input <input.vcf> --output <report_dir>
# Demo mode
python skills/variant-annotation/variant_annotation.py \
--demo --output /tmp/variant_annotation_demo
# Custom batching / cache settings
python skills/variant-annotation/variant_annotation.py \
--input <input.vcf> --output <report_dir> \
--batch-size 200 --cache-dir ~/.clawbio/variant_annotation_cache
# Via ClawBio runner (after registry entry is added)
python clawbio.py run variant-annotation --input <file> --output <dir>
python clawbio.py run variant-annotation --demo
python skills/variant-annotation/variant_annotation.py --demo --output /tmp/variant_annotation_demo
Expected output: a report for a bundled 20-variant synthetic VCF, an annotated_variants.tsv table with ClinVar/frequency/prioritization fields, and a result.json summary of clinically relevant and top-priority variants.
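The prioritization fields mentioned above combine severity, rarity, and ClinVar evidence into a numeric score and a tier. The source does not give the actual scoring rules, so the sketch below is only one plausible shape: the point values, the tier cut-offs, the choice to treat variants absent from gnomAD as rare, and the function name `prioritize` are all assumptions. Only the rarity threshold (gnomAD AF < 0.001) and the Tier 1 to Tier 4 labels come from this document; HIGH/MODERATE/LOW/MODIFIER are the standard VEP impact categories.

```python
from typing import Optional, Tuple

# Assumed point values per VEP impact category (hypothetical weights).
IMPACT_POINTS = {"HIGH": 3, "MODERATE": 2, "LOW": 1, "MODIFIER": 0}

# Rarity threshold stated in this document: gnomAD AF < 0.001.
RARE_AF_THRESHOLD = 0.001

# ClinVar significance values treated as supporting evidence (assumption).
PATHOGENIC_TERMS = {
    "pathogenic",
    "likely_pathogenic",
    "pathogenic/likely_pathogenic",
}


def prioritize(
    impact: str,
    clinvar_significance: Optional[str],
    gnomad_af: Optional[float],
) -> Tuple[int, str]:
    """Return (priority_score, tier) for one annotated variant.

    Hypothetical scoring for illustration; not the skill's actual rules.
    """
    score = IMPACT_POINTS.get(impact.upper(), 0)

    # Normalize "Likely pathogenic" and "likely_pathogenic" to the same key.
    significance = (clinvar_significance or "").strip().lower().replace(" ", "_")
    if significance in PATHOGENIC_TERMS:
        score += 3

    # Assumption: a variant with no gnomAD frequency is counted as rare.
    if gnomad_af is None or gnomad_af < RARE_AF_THRESHOLD:
        score += 1

    if score >= 6:
        tier = "Tier 1"
    elif score >= 4:
        tier = "Tier 2"
    elif score >= 2:
        tier = "Tier 3"
    else:
        tier = "Tier 4"
    return score, tier


if __name__ == "__main__":
    print(prioritize("HIGH", "Pathogenic", 0.0001))
    print(prioritize("MODIFIER", "benign", 0.2))
```

An additive score keeps the ranking easy to audit: each contribution can be written to the TSV alongside the final tier if needed.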
pysam.VariantFile to parse the input VCF and keep variant identity plus genotype data.
result.json, and reproducibility metadata.

Key thresholds / parameters:
GRCh38
200 variants per request
15 requests/second
gnomAD AF < 0.001
priority_score plus human-readable Tier 1-Tier 4

output_directory/
├── report.md # Markdown summary of prioritized findings
├── result.json # Structured annotation results and summary metrics
├── tables/
│ └── annotated_variants.tsv # Flat variant-level annotation table
└── reproducibility/
    └── commands.sh # Exact command used to generate the report
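The batch size and rate limit listed among the key thresholds (200 variants per request, 15 requests/second) suggest how calls to the VEP region endpoint might be organized. The following is a minimal sketch under stated assumptions: the helper names (`to_region_string`, `build_payloads`, `send`) and the record keys are hypothetical, the throttle is a simple fixed delay rather than whatever the skill actually does, and the HTTP call is injected as a parameter so the sketch runs without network access. The payload shape, a JSON object with a `variants` list of VCF-style strings, follows the public Ensembl REST documentation for POST requests to the VEP region endpoint.

```python
import json
import time
from typing import Any, Callable, Dict, Iterator, List, Sequence

VEP_URL = "https://rest.ensembl.org/vep/homo_sapiens/region"
BATCH_SIZE = 200                # variants per request, from this document
MAX_REQUESTS_PER_SECOND = 15    # rate limit, from this document
HEADERS = {"Content-Type": "application/json", "Accept": "application/json"}


def to_region_string(record: Dict[str, Any]) -> str:
    """Format one variant as a VCF-style string for the VEP region endpoint."""
    return (
        f'{record["CHROM"]} {record["POS"]} {record["ID"]} '
        f'{record["REF"]} {record["ALT"]} . . .'
    )


def batched(items: Sequence[Any], size: int) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start : start + size]


def build_payloads(
    records: Sequence[Dict[str, Any]], batch_size: int = BATCH_SIZE
) -> List[str]:
    """Build one JSON request body per batch of variants."""
    return [
        json.dumps({"variants": [to_region_string(r) for r in batch]})
        for batch in batched(records, batch_size)
    ]


def send(
    payloads: Sequence[str],
    post: Callable[..., Any],
    sleep: Callable[[float], None] = time.sleep,
) -> List[Any]:
    """POST each payload, pausing between requests to stay under the limit."""
    interval = 1.0 / MAX_REQUESTS_PER_SECOND
    responses = []
    for index, payload in enumerate(payloads):
        if index > 0:
            sleep(interval)  # fixed delay: simple, conservative throttle
        responses.append(post(VEP_URL, data=payload, headers=HEADERS))
    return responses
```

With the `requests` dependency installed, `send(build_payloads(records), requests.post)` would issue the real calls; each returned response can then be decoded with `.json()`.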
Required:
pysam: VCF parsing
requests: Ensembl REST API access

Optional / Planned:
vep backend: planned future replacement for the REST backend when fully local annotation is needed
gwas-lookup, clinpgx, pharmgx-reporter, or profile-report.

Trigger conditions (the orchestrator routes here when):
.vcf / .vcf.gz file and asks for annotation or interpretation.

Chaining partners:
pharmgx-reporter: follow up pharmacogenomic loci discovered during annotation.
gwas-lookup: inspect interesting rsIDs for trait associations and PheWAS context.
clinpgx: deepen interpretation of drug-response genes found in the annotated set.
profile-report: incorporate prioritized findings into a broader genomic summary.

npx claudepluginhub clawbio/clawbio --plugin clawbio