From golang-skills
Optimizes Go code by providing patterns for string conversion, container capacity, and pass-by-value. Use when investigating slow Go code or writing performance-critical sections.
Slash command: `/golang-skills:go-performance`
- `scripts/bench-compare.sh` - Run when comparing benchmark results, saving baselines, or producing JSON benchmark metadata.
- `references/BENCHMARKS.md` - Read when writing benchmarks, using benchstat, or profiling with pprof.
- `references/STRING-OPTIMIZATION.md` - Read when optimizing string conversion, concatenation, or byte/string boundaries.

Performance-specific guidelines apply only to the hot path. Don't optimize prematurely; focus these patterns where they matter most.
When converting primitives to/from strings, strconv is faster than fmt:
s := strconv.Itoa(rand.Int()) // ~2x faster than fmt.Sprint()
| Approach | Speed | Allocations |
|---|---|---|
| fmt.Sprint | 143 ns/op | 2 allocs/op |
| strconv.Itoa | 64.2 ns/op | 1 allocs/op |
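As a minimal sketch of the conversion in both directions, the program below wraps `strconv.Itoa` and `strconv.Atoi`. The helper names `intToString` and `stringToInt` are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// intToString (hypothetical helper) formats an int with strconv.Itoa,
// which takes a concrete int rather than fmt.Sprint's variadic any.
func intToString(n int) string {
	return strconv.Itoa(n)
}

// stringToInt (hypothetical helper) parses a base-10 integer and
// surfaces the parse error instead of silently returning zero.
func stringToInt(s string) (int, error) {
	return strconv.Atoi(s)
}

func main() {
	s := intToString(42)
	n, err := stringToInt(s)
	fmt.Println(s, n, err) // prints: 42 42 <nil>
}
```

For other primitives the same package offers `strconv.FormatFloat`, `strconv.FormatBool`, and their `Parse` counterparts.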
Convert a fixed string to []byte once outside the loop:
data := []byte("Hello world")
for b.Loop() { // Go 1.24+; use b.N loops only for older Go
w.Write(data) // ~7x faster than []byte("...") each iteration
}
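Outside a benchmark, the same hoisting could look like the sketch below. `countingWriter` and `writeGreeting` are hypothetical names introduced for this example; any `io.Writer` would work in place of the counter.

```go
package main

import (
	"fmt"
	"io"
)

// countingWriter is a hypothetical io.Writer that only tallies bytes written.
type countingWriter struct{ n int }

func (c *countingWriter) Write(p []byte) (int, error) {
	c.n += len(p)
	return len(p), nil
}

// writeGreeting writes the same payload `times` times. The string is
// converted to []byte once, before the loop, instead of on every iteration.
func writeGreeting(w io.Writer, times int) error {
	data := []byte("Hello world") // single conversion, hoisted out of the loop
	for i := 0; i < times; i++ {
		if _, err := w.Write(data); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	cw := &countingWriter{}
	if err := writeGreeting(cw, 3); err != nil {
		fmt.Println("write failed:", err)
		return
	}
	fmt.Println(cw.n) // prints: 33 (three writes of an 11-byte payload)
}
```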
Specify container capacity where possible to allocate memory up front. This minimizes subsequent allocations from copying and resizing as elements are added.
Provide capacity hints when initializing maps with make():
m := make(map[string]os.DirEntry, len(files))
Note: Unlike slices, map capacity hints do not guarantee complete preemptive allocation—they approximate the number of hashmap buckets required.
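A small sketch of the map hint, using a hypothetical `wordLengths` helper and plain strings instead of `os.DirEntry` so the program is self-contained. The hint only needs to be a reasonable estimate of the final size.

```go
package main

import "fmt"

// wordLengths (hypothetical helper) maps each word to its length.
// len(words) is an upper bound on the number of keys, so it is passed
// to make() as the capacity hint.
func wordLengths(words []string) map[string]int {
	m := make(map[string]int, len(words))
	for _, w := range words {
		m[w] = len(w)
	}
	return m
}

func main() {
	m := wordLengths([]string{"go", "gopher", "goroutine"})
	fmt.Println(len(m), m["gopher"]) // prints: 3 6
}
```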
Provide capacity hints when initializing slices with make(), particularly when appending:
data := make([]int, 0, size)
Unlike maps, slice capacity is not a hint: make() reserves a backing array for the full requested capacity up front, so subsequent append() operations incur zero allocations until that capacity is reached.
| Approach | Time (100M iterations) |
|---|---|
| No capacity | 2.48s |
| With capacity | 0.21s |
The capacity version is ~12x faster due to zero reallocations during append.
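The effect can be observed directly with `len` and `cap`. The sketch below uses a hypothetical `squares` helper: because the capacity is set up front, the slice never outgrows its original backing array.

```go
package main

import "fmt"

// squares (hypothetical helper) returns the first n square numbers.
// The slice starts with length 0 and capacity n, so the n appends
// below never trigger a reallocation.
func squares(n int) []int {
	out := make([]int, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, i*i)
	}
	return out
}

func main() {
	s := squares(5)
	fmt.Println(s, len(s), cap(s)) // prints: [0 1 4 9 16] 5 5
}
```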
Don't pass pointers as function arguments just to save a few bytes. If a function refers to its argument x only as *x throughout, then the argument shouldn't be a pointer.
func process(s string) { // not *string — strings are small fixed-size headers
fmt.Println(s)
}
Common pass-by-value types: string, io.Reader, small structs.
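To illustrate with a small struct, here is a sketch in which the `point` type and the `translate` and `label` functions are hypothetical. Both functions take their arguments by value, and the caller's copy is left untouched.

```go
package main

import "fmt"

// point is a hypothetical small struct: two machine words, cheap to copy.
type point struct{ x, y int }

// translate receives its own copy of p, so mutating it here
// cannot affect the caller's value.
func translate(p point, dx, dy int) point {
	p.x += dx
	p.y += dy
	return p
}

// label takes the string by value; only the small string header is copied,
// not the underlying bytes.
func label(name string, p point) string {
	return fmt.Sprintf("%s(%d,%d)", name, p.x, p.y)
}

func main() {
	origin := point{}
	moved := translate(origin, 3, 4)
	fmt.Println(label("origin", origin), label("moved", moved))
	// prints: origin(0,0) moved(3,4)
}
```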
Exceptions:
For string concatenation, choose the strategy that matches the complexity of the job:
| Method | Best For |
|---|---|
| + | Few strings, simple concat |
| fmt.Sprintf | Formatted output with mixed types |
| strings.Builder | Loop/piecemeal construction |
| strings.Join | Joining a slice |
| Backtick literal | Constant multi-line text |
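Two rows of the table, sketched side by side. The helper names `numberedList` and `commaJoin` are hypothetical: the first builds its output piecemeal in a loop, which is the case for `strings.Builder`, and the second already has a slice in hand, so `strings.Join` is enough.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// numberedList (hypothetical helper) builds "1. a\n2. b\n" piece by piece.
// strings.Builder appends into one growing buffer instead of creating a
// new string on every += step.
func numberedList(items []string) string {
	var sb strings.Builder
	for i, item := range items {
		sb.WriteString(strconv.Itoa(i + 1))
		sb.WriteString(". ")
		sb.WriteString(item)
		sb.WriteByte('\n')
	}
	return sb.String()
}

// commaJoin (hypothetical helper) joins an existing slice in one call.
func commaJoin(items []string) string {
	return strings.Join(items, ", ")
}

func main() {
	items := []string{"alpha", "beta"}
	fmt.Print(numberedList(items))
	fmt.Println(commaJoin(items)) // prints: alpha, beta
}
```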
Always measure before and after optimizing. Use Go's built-in benchmark framework and profiling tools.
go test -bench=. -benchmem -count=10 ./...
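A minimal sketch of what such benchmarks might look like, comparing the two int-to-string conversions. In a real package the two functions would live in a `_test.go` file as `BenchmarkItoa` and `BenchmarkSprint` and be picked up by `go test -bench`. Here they are driven by `testing.Benchmark` so the sketch runs as a standalone program. The names `benchItoa`, `benchSprint`, and `sink` are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// sink keeps results alive so the compiler cannot discard the calls under test.
var sink string

// benchItoa measures strconv.Itoa using the classic b.N loop.
func benchItoa(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		sink = strconv.Itoa(i)
	}
}

// benchSprint measures fmt.Sprint on the same inputs.
func benchSprint(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		sink = fmt.Sprint(i)
	}
}

func main() {
	// testing.Benchmark runs a benchmark function without "go test".
	itoa := testing.Benchmark(benchItoa)
	sprint := testing.Benchmark(benchSprint)
	fmt.Println("Itoa:  ", itoa.String(), itoa.MemString())
	fmt.Println("Sprint:", sprint.String(), sprint.MemString())
}
```

The numbers printed depend on the machine and Go version, so treat a single run as indicative only. Repeated runs, as with `-count=10`, give more trustworthy comparisons.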
Validation: After applying optimizations, run `bash scripts/bench-compare.sh` to measure the actual impact. Only keep optimizations with measurable improvement.
| Pattern | Bad | Good | Improvement |
|---|---|---|---|
| Int to string | fmt.Sprint(n) | strconv.Itoa(n) | ~2x faster |
| Repeated []byte | []byte("str") in loop | Convert once outside | ~7x faster |
| Map initialization | make(map[K]V) | make(map[K]V, size) | Fewer allocs |
| Slice initialization | make([]T, 0) | make([]T, 0, cap) | ~12x faster |
| Small fixed-size args | *string, *io.Reader | string, io.Reader | No indirection |
| Simple string join | s1 + " " + s2 | (already good) | Use + for few strings |
| Loop string build | Repeated += | strings.Builder | O(n) vs O(n²) |