Skip to content

nh-nixos: batch nix path-info calls in nh os info#637

Merged
NotAShelf merged 4 commits into
masterfrom
notashelf/push-pqkzzonykuwr
May 9, 2026
Merged

nh-nixos: batch nix path-info calls in nh os info#637
NotAShelf merged 4 commits into
masterfrom
notashelf/push-pqkzzonykuwr

Conversation

@NotAShelf
Copy link
Copy Markdown
Member

@NotAShelf NotAShelf commented May 6, 2026

Previously get_closure_size spawned a separate nix path-info process per generation. On systems with many generations this caused hundreds of sequential nix invocations. Which sucked.

Fixes #636

Change-Id: Ia9f00bc1f53835b026b0c145674761ca6a6a6964

Summary by CodeRabbit

  • New Features

    • Generation listing now shows closure sizes when available, using a single batched query for multiple generations.
    • Accepts precomputed size data when supplied; displays "Unknown" if size is unavailable.
  • Refactor

    • Optimized JSON parsing, size extraction, and human-readable GB formatting for more efficient and clearer generation listings.

Previously `get_closure_size` spawned a separate `nix path-info` process per
generation. On systems with many generations this caused hundreds of sequential
nix invocations. Which sucked.

Fixes #636

Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Ia9f00bc1f53835b026b0c145674761ca6a6a6964
@NotAShelf NotAShelf requested a review from faukah May 6, 2026 07:34
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 6, 2026

Review Change Stack
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 92ee5682-6960-4d3d-a34d-1fa69a19cbc0

📥 Commits

Reviewing files that changed from the base of the PR and between 38b72c0 and f3784d3.

📒 Files selected for processing (1)
  • crates/nh-nixos/src/generations.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/nh-nixos/src/generations.rs

Walkthrough

This PR introduces JSON helpers and a batched nix path-info -Sh --json call to compute closure sizes for multiple generation directories, refactors the single-getter to reuse those helpers, updates describe to accept an optional precomputed size, and wires callers to use the batch API where available.

Changes

Closure Size Batching

Layer / File(s) Summary
Helper Functions
crates/nh-nixos/src/generations.rs
Added closure_size_from_json to parse JSON from nix path-info output (handles both object and array forms) and bytes_to_gb_string to format bytes into human-readable GB strings.
Batch API
crates/nh-nixos/src/generations.rs
Added public get_closure_sizes_batch function that collects all generation paths, runs nix path-info -Sh --json once, parses results using closure_size_from_json, and returns a HashMap<PathBuf, String> of generation dir to size.
Core Refactoring
crates/nh-nixos/src/generations.rs
Refactored get_closure_size to delegate JSON extraction to closure_size_from_json and formatting to bytes_to_gb_string. Updated describe signature to accept closure_size: Option<String> parameter; uses precomputed size if provided, otherwise calls get_closure_size.
Integration
crates/nh-nixos/src/nixos.rs
Updated list_generations to pass None for the new describe parameter. Enhanced OsGenerationsArgs::info to compute all generation closure sizes in batch via get_closure_sizes_batch and pass looked-up values (or None) to describe.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • nix-community/nh#435: Modifies the same functions in crates/nh-nixos/src/generations.rs (refactoring get_closure_size and describe).
  • nix-community/nh#462: Also updates JSON parsing for closureSize (supports object/array nix path-info output) and touches get_closure_size.
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely describes the main change: batching nix path-info calls in the nh os info command to improve performance.
Linked Issues check ✅ Passed The PR successfully addresses issue #636 by batching nix path-info calls instead of spawning separate processes per generation, directly reducing wall-clock time from ~95 seconds to ~7.84 seconds.
Out of Scope Changes check ✅ Passed All changes are directly related to batching closure size computations: refactoring get_closure_size, adding get_closure_sizes_batch, and updating function signatures are all within scope of the issue.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch notashelf/push-pqkzzonykuwr

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
crates/nh-nixos/src/generations.rs (2)

107-116: 💤 Low value

Use Map::get for direct lookup in the object branch.

The object branch iterates all keys to find a match; serde_json::Map supports direct key lookup, which is both simpler and avoids an O(N) scan when many paths share the same JSON document.

♻️ Proposed simplification
-  json.as_array().map_or_else(
-    || {
-      json.as_object().and_then(|obj| {
-        obj
-          .iter()
-          .find(|(path, _)| path.as_str() == store_path_str)
-          .and_then(|(_, value)| value.get("closureSize"))
-          .and_then(serde_json::Value::as_u64)
-      })
-    },
+  json.as_array().map_or_else(
+    || {
+      json.as_object().and_then(|obj| {
+        obj
+          .get(store_path_str)
+          .and_then(|value| value.get("closureSize"))
+          .and_then(serde_json::Value::as_u64)
+      })
+    },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/nh-nixos/src/generations.rs` around lines 107 - 116, The object branch
inside json.as_array().map_or_else currently scans obj.iter().find to match
store_path_str; replace that O(N) scan with a direct lookup using the Map::get
API (e.g., obj.get(store_path_str)) and then chain .and_then(|value|
value.get("closureSize")).and_then(serde_json::Value::as_u64) to extract
closureSize—update the code in the object closure (the one using
json.as_object().and_then(|obj| { ... })) to use obj.get(...) instead of
iter().find(...) for a simpler and faster lookup.

132-188: 💤 Low value

Batching design looks solid; correct fallback semantics.

The single nix path-info invocation with the JSON-keyed lookup is exactly the right shape for the issue described in #636. On batch failure (process or parse error) the empty map causes describe to fall back to per-generation get_closure_size, which preserves correctness at the cost of speed in the error path — acceptable.

One small optional perf note: each call to closure_size_from_json rescans the whole JSON, so the overall cost is O(N²) in the number of generations. At typical generation counts this is negligible, but if you wanted to make it O(N) you could pre-build a HashMap<&str, u64> from the JSON once and look up by store-path string per generation. Not blocking.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/nh-nixos/src/generations.rs` around lines 132 - 188, The current
get_closure_sizes_batch calls closure_size_from_json for each generation which
rescans the full serde_json::Value each time (O(N²)); to make it O(N) parse the
JSON once into a lookup map (e.g. HashMap<String, u64> or HashMap<&str, u64>)
keyed by the store_path string, then in the generation_dirs.iter().zip(...) loop
do a direct lookup in that map and convert found bytes to a string with
bytes_to_gb_string or use "Unknown" when missing; keep closure_size_from_json
for single-shot use but replace per-iteration JSON scanning in
get_closure_sizes_batch with the prebuilt map lookup.
crates/nh-nixos/src/nixos.rs (1)

1227-1233: ⚡ Quick win

list_generations still triggers per-generation nix path-info calls.

Passing None here causes describe to fall back to get_closure_size, which spawns one nix path-info per generation — the exact pattern this PR optimizes away in OsGenerationsArgs::info. list_generations is used by OsRollbackArgs::rollback, which never consumes closure_size from the returned GenerationInfo, so the resulting work is also wasted on the rollback path.

Two reasonable options:

  • Batch here too (mirrors info()): collect the candidate paths first, call generations::get_closure_sizes_batch, then pass Some(size) into describe.
  • Skip computation entirely for callers that don't need sizes: pass a placeholder like Some(String::new()) or Some("Unknown".into()) and avoid the nix invocation altogether. Longer-term, the describe API could be split (e.g. a separate describe_without_size or a "skip" sentinel) to make this distinction explicit.

The first approach keeps behavior identical for any future consumer of closure_size; the second avoids unnecessary process spawns on the rollback path right now.

♻️ Sketch: batch in list_generations
   let mut generations = Vec::new();
-  for entry in fs::read_dir(profiles_dir)? {
+  let entries: Vec<PathBuf> = fs::read_dir(profiles_dir)?
+    .filter_map(|entry| {
+      let entry = entry.ok()?;
+      let path = entry.path();
+      let name = path.file_name()?.to_str()?;
+      (name.starts_with("system-") && name.ends_with("-link")).then_some(path)
+    })
+    .collect();
+
+  let path_refs: Vec<&Path> = entries.iter().map(PathBuf::as_path).collect();
+  let closure_sizes = generations::get_closure_sizes_batch(&path_refs);
+
+  for path in &entries {
+    let size = closure_sizes.get(path).cloned();
+    if let Some(gen_info) = generations::describe(path, size) {
+      generations.push(gen_info);
+    }
+  }
-    let entry = match entry {
-      Ok(e) => e,
-      Err(e) => {
-        warn!("Failed to read entry in profile directory: {}", e);
-        continue;
-      },
-    };
-
-    let path = entry.path();
-    if let Some(name) = path.file_name().and_then(|s| s.to_str())
-      && name.starts_with("system-")
-      && name.ends_with("-link")
-      && let Some(gen_info) = generations::describe(&path, None)
-    {
-      generations.push(gen_info);
-    }
-  }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/nh-nixos/src/nixos.rs` around lines 1227 - 1233, list_generations
currently calls generations::describe(&path, None) which causes per-generation
nix path-info calls; instead collect all candidate generation paths in
list_generations, call generations::get_closure_sizes_batch with that path list
to obtain sizes, and then call generations::describe(path, Some(size)) for each
generation so you avoid spawning one process per generation; update
list_generations to mirror OsGenerationsArgs::info's batching approach (and keep
generations::describe and OsRollbackArgs::rollback behavior unchanged).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@crates/nh-nixos/src/generations.rs`:
- Around line 107-116: The object branch inside json.as_array().map_or_else
currently scans obj.iter().find to match store_path_str; replace that O(N) scan
with a direct lookup using the Map::get API (e.g., obj.get(store_path_str)) and
then chain .and_then(|value|
value.get("closureSize")).and_then(serde_json::Value::as_u64) to extract
closureSize—update the code in the object closure (the one using
json.as_object().and_then(|obj| { ... })) to use obj.get(...) instead of
iter().find(...) for a simpler and faster lookup.
- Around line 132-188: The current get_closure_sizes_batch calls
closure_size_from_json for each generation which rescans the full
serde_json::Value each time (O(N²)); to make it O(N) parse the JSON once into a
lookup map (e.g. HashMap<String, u64> or HashMap<&str, u64>) keyed by the
store_path string, then in the generation_dirs.iter().zip(...) loop do a direct
lookup in that map and convert found bytes to a string with bytes_to_gb_string
or use "Unknown" when missing; keep closure_size_from_json for single-shot use
but replace per-iteration JSON scanning in get_closure_sizes_batch with the
prebuilt map lookup.

In `@crates/nh-nixos/src/nixos.rs`:
- Around line 1227-1233: list_generations currently calls
generations::describe(&path, None) which causes per-generation nix path-info
calls; instead collect all candidate generation paths in list_generations, call
generations::get_closure_sizes_batch with that path list to obtain sizes, and
then call generations::describe(path, Some(size)) for each generation so you
avoid spawning one process per generation; update list_generations to mirror
OsGenerationsArgs::info's batching approach (and keep generations::describe and
OsRollbackArgs::rollback behavior unchanged).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 059dc0e0-f6a2-4e52-99ff-f40afced1e26

📥 Commits

Reviewing files that changed from the base of the PR and between 0ffe7af and 66eba0f.

📒 Files selected for processing (2)
  • crates/nh-nixos/src/generations.rs
  • crates/nh-nixos/src/nixos.rs

@dominikh
Copy link
Copy Markdown

dominikh commented May 6, 2026

New nh os info times on my system:

User time (seconds): 5.23
System time (seconds): 1.32
Percent of CPU this job got: 83%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:07.84

Definitely better than 1.5 minutes!

NotAShelf added 2 commits May 6, 2026 16:44
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I064ab02193b9f4c578ddeac68d4363ad6a6a6964
@NotAShelf NotAShelf force-pushed the notashelf/push-pqkzzonykuwr branch from 38b72c0 to f3784d3 Compare May 8, 2026 20:43
@NotAShelf NotAShelf enabled auto-merge May 9, 2026 17:25
@NotAShelf NotAShelf disabled auto-merge May 9, 2026 17:25
@NotAShelf NotAShelf merged commit 0f6e9a9 into master May 9, 2026
13 checks passed
@NotAShelf NotAShelf deleted the notashelf/push-pqkzzonykuwr branch May 9, 2026 17:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

nh os info can be slow, has no progress bar, and wastes time

2 participants