Accelerate LLM inference with speculative decoding, Medusa multi-head decoding, and lookahead decoding. Use when optimizing inference speed (1.5-3.6x speedup), reducing latency for real-time applications, or deploying models on limited compute. Covers draft models, tree-based attention, Jacobi iteration, and parallel token generation.
/plugin marketplace add zechenzhangAGI/AI-research-SKILLs
/plugin install speculative-decoding@zechenzhangAGI/AI-research-SKILLs
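To illustrate the core draft-and-verify idea behind speculative decoding, here is a minimal sketch of one decoding step. The model interfaces (`draft_probs_fn`, `target_probs_fn`) are hypothetical stand-ins for a small draft model and a large target model that each return a next-token probability distribution; a real deployment would score all drafted tokens with a single batched target forward pass.

```python
import random

def sample(probs):
    """Sample a token from a probability dict {token: prob}."""
    r, acc = random.random(), 0.0
    for tok, p in probs.items():
        acc += p
        if r < acc:
            return tok
    return tok  # fallback for floating-point rounding

def speculative_step(draft_probs_fn, target_probs_fn, prefix, k=4):
    """One draft-and-verify step of speculative decoding.

    draft_probs_fn / target_probs_fn are hypothetical interfaces that
    map a token prefix to a next-token probability dict.
    Returns the tokens accepted this step (between 1 and k+1 tokens).
    """
    # 1. The cheap draft model proposes k tokens autoregressively.
    drafted, ctx = [], list(prefix)
    for _ in range(k):
        q = draft_probs_fn(ctx)
        tok = sample(q)
        drafted.append((tok, q))
        ctx.append(tok)

    # 2. The target model verifies the drafted tokens left to right.
    accepted, ctx = [], list(prefix)
    for tok, q in drafted:
        p = target_probs_fn(ctx)
        # Accept the drafted token with probability min(1, p/q),
        # which preserves the target model's output distribution.
        if random.random() < min(1.0, p.get(tok, 0.0) / q[tok]):
            accepted.append(tok)
            ctx.append(tok)
        else:
            # On rejection, resample from the residual distribution
            # max(0, p - q), renormalised, and stop this step.
            residual = {t: max(0.0, p.get(t, 0.0) - q.get(t, 0.0)) for t in p}
            z = sum(residual.values()) or 1.0
            accepted.append(sample({t: v / z for t, v in residual.items()}))
            return accepted

    # 3. All drafts accepted: take one bonus token from the target.
    accepted.append(sample(target_probs_fn(ctx)))
    return accepted

# Toy usage: both "models" are uniform over a three-token vocabulary.
uniform = lambda ctx: {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}
print(speculative_step(uniform, uniform, prefix=["a"], k=4))
```

Medusa and lookahead decoding replace the separate draft model with extra decoding heads or Jacobi-style parallel guesses, but the verify-and-accept structure above is the same.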