Skip to content

Restore per-transaction .csv.zst output via --full flag#475

Open
russfellows wants to merge 4 commits intominio:masterfrom
russfellows:fix/default-tsv-output
Open

Restore per-transaction .csv.zst output via --full flag#475
russfellows wants to merge 4 commits intominio:masterfrom
russfellows:fix/default-tsv-output

Conversation

@russfellows
Copy link
Copy Markdown

bench: restore per-transaction .csv.zst output via --full flag

Summary

Prior to commit 483f844, warp wrote a per-transaction CSV/TSV file after every
benchmark run, enabling fine-grained post-hoc analysis with warp analyze. That
commit introduced addCollector() but registered the --full flag only in
analyzeFlags — never in benchFlags — so the flag silently had no effect and
the per-transaction file was never written.

This PR restores the intended behavior with a minimal, targeted fix across three
files, adds unit tests, and updates documentation.


Problem

After 483f844, running any benchmark with --full:

warp put --host=... --full

produced only a .json.zst aggregate file. The .csv.zst per-transaction file was
never written, making post-hoc per-operation analysis impossible. The flag appeared
to be accepted but was completely ignored.


Changes

cli/benchmark.go — Core fix (single-node mode)

addCollector() now gates on ctx.Bool("full"):

  • Without --full (default): A LiveCollector only. Live throughput display and
    auto-termination work as before. Output: .json.zst only.
  • With --full: Ops are fanned to both a NewOpsCollector (capturing every
    individual request) and the LiveCollector (for live display). After the benchmark,
    .csv.zst is written first, then .json.zst is also written. --full is strictly
    additive — both files are always produced together.

A duplicate --full flag registration in benchFlags that caused a startup panic
(since --full already lives in analyzeFlags, which is combined with benchFlags
for all benchmark commands) has been removed.

cli/benchserver.go — Distributed mode mirror

  • With --full: Downloads raw ops → writes .csv.zst → downloads aggregate →
    writes .json.zst.
  • Without --full: Downloads aggregate → writes .json.zst only.

cli/analyze.go — User-facing warning

warp analyze file.csv.zst without --full re-aggregates individual operations into
1-second buckets before reporting, producing results equivalent to reading the
.json.zst aggregate. This loses per-operation resolution — latency percentiles and
throughput figures can differ, and per-request filtering flags have no additional
effect.

A warning is now printed in this case:

WARNING: Analyzing .csv.zst without --full produces aggregated results (1-second
         buckets), not accurate per-operation statistics.
         For precise latency percentiles and throughput, use: warp analyze --full <file>

The warning is suppressed when --quiet or --json output is active.

Note: Passing --full to warp analyze on a .json.zst file is silently
ignored — aggregate files contain no individual operations.

cli/addcollector_test.go — New unit tests (7 tests)

Test Validates
TestAddCollectorDefaultNoOps Default mode stores no individual ops
TestAddCollectorFullCollectsOps --full captures all operations
TestAddCollectorFullFansOutToLive --full also feeds the LiveCollector
TestAddCollectorDefaultUpdatesChannel Default mode updates channel is functional
TestAddCollectorDiscardOutput DiscardOutput suppresses collection in both modes
TestAddCollectorFullIsAdditive --full always produces both files
TestAddCollectorOpValues Op field values are preserved through the collector

README.md — Documentation

Added an Analyzing Results subsection under Full Per-Transaction Logging
documenting:

  • Use warp analyze --full file.csv.zst for accurate per-operation stats
  • Default analyze on .csv.zst re-aggregates (same as .json.zst) and now warns
  • --full is silently ignored when analyzing .json.zst

Behavior Before / After

Scenario Before this PR After this PR
warp put (no flags) .json.zst written .json.zst written ✓
warp put --full .json.zst written (bug) .csv.zst and .json.zst written ✓
warp analyze file.csv.zst Silent re-aggregation Re-aggregation with warning
warp analyze --full file.csv.zst Full per-op analysis Full per-op analysis ✓
warp analyze --full file.json.zst Ignored silently Ignored silently (noted in docs) ✓

Testing

Validated against a real MinIO S3 cluster. With --full:

$ ls -lh warp-put-*.{csv,json}.zst
-rw-r--r-- 1 user user 142K warp-put-2026-04-07[162451]-abcd.csv.zst
-rw-r--r-- 1 user user 1.2K warp-put-2026-04-07[162451]-abcd.json.zst

$ warp analyze warp-put-2026-04-07[162451]-abcd.csv.zst
WARNING: Analyzing .csv.zst without --full produces aggregated results...

$ warp analyze --full warp-put-2026-04-07[162451]-abcd.csv.zst
# → exact per-op percentiles, 4x1s throughput windows

All 7 unit tests pass: go test ./cli/ -run TestAddCollector -v

Copy link
Copy Markdown
Collaborator

@klauspost klauspost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. It is overly dramatic, though - and lacks mentioning the main downside.

Comment thread cli/analyze.go
Comment thread README.md
@russfellows
Copy link
Copy Markdown
Author

russfellows commented Apr 16, 2026 via email

Prior to commit 483f844, warp wrote a per-transaction CSV/TSV file
after every benchmark run. That commit introduced addCollector() but
registered --full only in analyzeFlags, never in benchFlags, so the
flag had no effect and the per-transaction file was never written.

This commit restores the intended behavior:

* addCollector() now gates on ctx.Bool("full"). Without --full, only a
  LiveCollector (aggregating only) is used and .json.zst is the sole
  output — same default behavior as pre-483f844.

* With --full, ops are fanned to both a NewOpsCollector (capturing
  every individual request) and the LiveCollector (live display and
  auto-termination). After the benchmark, .csv.zst is written first,
  then .json.zst is also written — so --full is strictly additive;
  both files are always produced together.

* benchserver.go (distributed mode) mirrors this: with --full it
  downloads raw ops, writes .csv.zst, then downloads the aggregate
  and writes .json.zst. Without --full it writes .json.zst only.

* The --full flag already exists in analyzeFlags, which is combined
  with benchFlags for all benchmark commands. A duplicate --full
  registration in benchFlags that caused a startup panic has been
  removed.

* warp analyze now prints a warning when a .csv.zst is analyzed
  without --full: re-aggregating ops into 1-second buckets loses
  per-operation resolution, producing results equivalent to reading
  the .json.zst. Users are directed to add --full for precise latency
  percentiles and throughput figures. The warning is suppressed when
  --quiet or --json output is active.

* README updated with an "Analyzing Results" subsection documenting
  the difference between default and --full analysis paths, and
  noting that --full is silently ignored on .json.zst files (which
  contain no individual operations).

* 7 unit tests added in cli/addcollector_test.go covering: default
  mode stores no ops, --full collects all ops, --full fans out to
  LiveCollector, default mode updates channel is functional,
  DiscardOutput suppresses both modes, --full is additive (both files
  written), and op value preservation.
…ad docs

- cli/analyze.go: soften warning from 'WARNING:' to 'Note:'; remove
  trailing newline from Println arg (fixes govet)
- cli/addcollector_test.go: fix misspell 'behaviour' -> 'behavior';
  remove redundant embedded field selector b.Common.Collector -> b.Collector
  (fixes staticcheck QF1008)
- README.md: replace vague memory/perf caution with measured data —
  table of RSS overhead at three concurrency/duration levels (~310 B/op
  at low load, ~850 B/op at high concurrency), explanation of Go GC
  arena growth, and analysis time comparison (csv.zst --full is ~5x
  slower than json.zst; re-aggregating without --full is ~13x slower)
@russfellows russfellows force-pushed the fix/default-tsv-output branch from e6b62e1 to 8f5a66f Compare April 19, 2026 18:27
@russfellows
Copy link
Copy Markdown
Author

Thanks for the feedback. Pushed an updated commit that:

  • Softens the analyze.go warning from WARNING: to Note: and shortens it to two lines (also fixes the trailing \n flagged by govet)
  • Fixes behaviour → behavior in the test file (misspell) and removes the redundant b.Common.Collector selector (staticcheck QF1008)
  • Replaces the vague memory note in the README with actual measurements from a local MinIO server — the overhead is negligible for short runs (+9 MB at 30s/8c) but grows to +124 MB at 60s/32c and projects to ~8 GB for a 1-hour high-concurrency run, so the caution is warranted at scale

Copy link
Copy Markdown
Collaborator

@klauspost klauspost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Restores the intended --full behavior for benchmarks so per-transaction .csv.zst output is produced again (alongside the aggregate .json.zst), and adds a user-facing warning when analyzing .csv.zst without --full.

Changes:

  • Reworks collector wiring so --full collects per-op data while preserving live aggregation for UI/autoterm.
  • Mirrors --full behavior for remote benchmarks by downloading ops for .csv.zst and still writing .json.zst.
  • Adds an analyze-time warning for .csv.zst without --full, plus new unit tests and README updates.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
cli/benchmark.go Wires --full to collect ops + keep live updates; writes .json.zst alongside .csv.zst in full mode.
cli/benchserver.go In remote mode, downloads ops for .csv.zst when --full and also writes the aggregate .json.zst.
cli/analyze.go Emits a warning when analyzing .csv.zst without --full (unless quiet/JSON).
cli/addcollector_test.go Adds unit tests covering default/full/discard-output collector behavior.
README.md Documents --full outputs and analysis behavior differences for .csv.zst vs .json.zst.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread cli/benchmark.go
Comment thread cli/benchserver.go
Comment thread README.md
russfellows and others added 2 commits April 20, 2026 07:48
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants