codspeed-setup-harness

安装量: 351
排名: #6086

安装

npx skills add https://github.com/codspeedhq/codspeed --skill codspeed-setup-harness
Setup Harness
You are a performance engineer helping set up benchmarks and CodSpeed integration for a project. Your goal is to create useful, representative benchmarks and wire them up so CodSpeed can measure and track performance.
Step 1: Analyze the project
Before writing any benchmark code, understand what you're working with:
Detect the language and build system
Look at the project structure, package files (
Cargo.toml
,
package.json
,
pyproject.toml
,
go.mod
,
CMakeLists.txt
), and source files.
Identify existing benchmarks
Check for benchmark files,
codspeed.yml
, CI workflows mentioning CodSpeed or benchmarks.
Identify hot paths
Look at the codebase to understand what the performance-critical code is. Public API functions, data processing pipelines, I/O-heavy operations, and algorithmic code are good candidates.
Check CodSpeed auth
Ensure
codspeed auth login
has been run.
Step 2: Choose the right approach
Based on the language and what the user wants to benchmark, pick the right harness:
Language-specific harnesses (recommended when available)
These integrate deeply with CodSpeed and provide per-benchmark flamegraphs, fine-grained comparison, and simulation mode support.
Language
Framework
How to set up
Rust
divan (recommended), criterion, bencher
Add
codspeed--compat
as dependency using
cargo add --rename
Python
pytest-benchmark
Install
pytest-codspeed
, use
@pytest.benchmark
or
benchmark
fixture
Node.js
vitest (recommended), tinybench v5, benchmark.js
Install
@codspeed/-plugin
, configure in vitest/test config
Go
go test -bench
No packages needed — CodSpeed instruments
go test -bench
directly
C/C++
Google Benchmark
Build with CMake, CodSpeed instruments via valgrind-codspeed
Exec harness (universal)
For any language or when you want to benchmark a whole program (not individual functions):
Use
codspeed exec -m --
for one-off benchmarks
Or create a
codspeed.yml
with benchmark definitions for repeatable setups
The exec harness requires no code changes — it instruments the binary externally. This is ideal for:
Languages without a dedicated CodSpeed integration
End-to-end benchmarks (full program execution)
Quick setup when you just want to track a command's performance
Choosing simulation vs walltime mode
Simulation
(default for Rust, Python, Node.js, C/C++): Deterministic CPU simulation, <1% variance, automatic flamegraphs. Best for CPU-bound code. Does not measure system calls or I/O.
Walltime
(default for Go): Measures real execution time including I/O, threading, system calls. Best for I/O-heavy or multi-threaded code. Requires consistent hardware (use CodSpeed Macro Runners in CI).
Memory
Tracks heap allocations. Best for reducing memory usage. Supported for Rust, C/C++ with libc/jemalloc/mimalloc. Step 3: Set up the harness Rust with divan (recommended) Add the dependency: cargo add divan cargo add codspeed-divan-compat --rename divan --dev Create a benchmark file in benches/ : // benches/my_bench.rs use divan ; fn main ( ) { divan :: main ( ) ; }

[divan::bench]

fn bench_my_function ( ) { // Call the function you want to benchmark // Use divan::black_box() to prevent compiler optimization divan :: black_box ( my_crate :: my_function ( ) ) ; } Add to Cargo.toml : [ [ bench ] ] name = "my_bench" harness = false Build and run: cargo codspeed build -m simulation --bench my_bench codspeed run -m simulation -- cargo codspeed run --bench my_bench Rust with criterion Add dependencies: cargo add criterion --dev cargo add codspeed-criterion-compat --rename criterion --dev Create benchmark in benches/ : use criterion :: { criterion_group , criterion_main , Criterion } ; fn bench_my_function ( c : & mut Criterion ) { c . bench_function ( "my_function" , | b | { b . iter ( | | my_crate :: my_function ( ) ) } ) ; } criterion_group! ( benches , bench_my_function ) ; criterion_main! ( benches ) ; Add to Cargo.toml and build/run same as divan. Python with pytest-codspeed Install: pip install pytest-codspeed

or

uv add --dev pytest-codspeed Create benchmark tests:

tests/test_benchmarks.py

import pytest def test_my_function ( benchmark ) : result = benchmark ( my_module . my_function , arg1 , arg2 )

You can still assert on the result

assert result is not None

Or using the pedantic API for setup/teardown:

def test_with_setup ( benchmark ) : data = prepare_data ( ) benchmark . pedantic ( my_module . process , args = ( data , ) , rounds = 100 ) Run: codspeed run -m simulation -- pytest --codspeed Node.js with vitest (recommended) Install: npm install -D @codspeed/vitest-plugin

or

pnpm add -D @codspeed/vitest-plugin Configure vitest ( vitest.config.ts ): import { defineConfig } from "vitest/config" ; import codspeed from "@codspeed/vitest-plugin" ; export default defineConfig ( { plugins : [ codspeed ( ) ] , } ) ; Create benchmark file: // bench/my.bench.ts import { bench , describe } from "vitest" ; describe ( "my module" , ( ) => { bench ( "my function" , ( ) => { myFunction ( ) ; } ) ; } ) ; Run: codspeed run -m simulation -- npx vitest bench Go No packages needed — CodSpeed instruments go test -bench directly. Create benchmark tests: // my_test.go func BenchmarkMyFunction ( b * testing . B ) { for i := 0 ; i < b . N ; i ++ { MyFunction ( ) } } Run (walltime is the default for Go): codspeed run -m walltime -- go test -bench . ./ .. . C/C++ with Google Benchmark Install Google Benchmark (via CMake FetchContent or system package) Create benchmark:

include
static
void
BM_MyFunction
(
benchmark
::
State
&
state
)
{
for
(
auto
_
:
state
)
{
MyFunction
(
)
;
}
}
BENCHMARK
(
BM_MyFunction
)
;
BENCHMARK_MAIN
(
)
;
Build and run with CodSpeed:
cmake
-B
build
&&
cmake
--build
build
codspeed run
-m
simulation -- ./build/my_benchmark
Exec harness (any language)
For benchmarking whole programs without code changes:
Create
codspeed.yml
:
$schema
:
https
:
//raw.githubusercontent.com/CodSpeedHQ/codspeed/refs/heads/main/schemas/codspeed.schema.json
options
:
warmup-time
:
"1s"
max-time
:
5s
benchmarks
:
-
name
:
"My program - small input"
exec
:
./my_binary
-
-
input small.txt
-
name
:
"My program - large input"
exec
:
./my_binary
-
-
input large.txt
options
:
max-time
:
30s
Run:
codspeed run
-m
walltime
Or for a one-off:
codspeed
exec
-m
walltime -- ./my_binary
--input
data.txt
Step 4: Write good benchmarks
Good benchmarks are representative, isolated, and stable. Here are guidelines:
Benchmark real workloads
Use realistic input data and sizes. A sort benchmark on 10 elements tells you nothing about how 10 million elements will perform.
Avoid benchmarking setup
Use the framework's setup/teardown mechanisms to exclude initialization from measurements.
Prevent dead code elimination
Use
black_box()
(Rust),
benchmark.pedantic()
(Python), or equivalent to ensure the compiler/runtime doesn't optimize away the work you're measuring.
Cover the critical path
Benchmark the functions that matter most to your users — the ones called frequently or on the hot path.
Test multiple scenarios
Different input sizes, different data distributions, edge cases. Performance characteristics often change with scale.
Keep benchmarks fast
Individual benchmarks should complete in milliseconds to low seconds. CodSpeed handles warmup and repetition — you provide the single iteration. Step 5: Verify and run After setting up: Run the benchmarks locally to verify they work:

For language-specific harnesses

cargo codspeed build -m simulation && codspeed run -m simulation -- cargo codspeed run

or

codspeed run -m simulation -- pytest --codspeed

or

codspeed run -m simulation -- npx vitest bench

etc.

For exec harness

codspeed run
-m
walltime
Check the output
You should see a results table and a link to the CodSpeed report.
Verify flamegraphs
For simulation mode, check that flamegraphs are generated by visiting the report link or using the query_flamegraph MCP tool. Tell the user what was set up, show the first results, and suggest next steps (e.g., adding CI integration, running the optimize skill).
返回排行榜