Runs LLM inference on CPU, Apple Silicon, and consumer GPUs without NVIDIA hardware. Use it for edge deployment, M1/M2/M3 Macs, AMD/Intel GPUs, or when CUDA is unavailable. Supports GGUF quantization (1.5- to 8-bit) for reduced memory use and a 4-10x speedup over PyTorch on CPU.
/plugin marketplace add zechenzhangAGI/AI-research-SKILLs
/plugin install llama-cpp@zechenzhangAGI/AI-research-SKILLs
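
As a minimal sketch of the kind of workflow this enables (assuming the `llama-cpp-python` bindings are installed and a GGUF model file has been downloaded; the model path and generation parameters below are illustrative, not part of the skill itself):

```python
# Load a quantized GGUF model and run a prompt with llama-cpp-python
# (pip install llama-cpp-python). The model path is a placeholder;
# any GGUF file, e.g. from Hugging Face, works here.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-3-8b-instruct.Q4_K_M.gguf",  # hypothetical path
    n_ctx=4096,        # context window size
    n_gpu_layers=-1,   # offload all layers to Metal/GPU if available; 0 = CPU only
)

out = llm("Explain GGUF quantization in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```

On Apple Silicon, `n_gpu_layers=-1` offloads computation to the Metal backend; on a CPU-only machine, set it to 0.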