feat: add 'process-compose analyze critical-chain' subcommand #462
ryantm wants to merge 1 commit into F1bonacc1:main
Adds a new 'analyze' top-level command with a 'critical-chain' subcommand
that prints a tree of processes from top-level (nothing depends on them)
down through their dependencies, annotated with startup timings -- in the
spirit of 'systemd-analyze critical-chain'.
For each process two times are shown:
  @<offset>    Time after the project started that the process became ready.
               For processes without a readiness signal this is the time the
               process was launched.
  +<duration>  Time the process spent between launch and becoming ready.
               Only shown for processes with a readiness probe / liveness
               probe / 'ready_log_line'.
Example output:
app (not started) [Pending]
├─migrations (not started) [Pending]
│ ├─postgres @4min 30.148s +270.145s
│ └─setup @4ms (not ready)
├─postgres @4min 30.148s +270.145s
├─redis @2min 10.446s +130.435s
└─setup @4ms (not ready)
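As a rough illustration of how the two annotations relate to the timestamps, here is a minimal Go sketch. The `timing` type, `annotate`, `at`, and `projectStart` are hypothetical names for this example only, and the durations use Go's default formatting rather than the exact format shown in the output above.

```go
package main

import (
	"fmt"
	"time"
)

// timing is a hypothetical stand-in for the timestamp fields of one process.
type timing struct {
	start              *time.Time // launch time; nil if the process never started
	ready              *time.Time // ready time; nil if the process is not ready yet
	hasReadinessSignal bool       // true when a probe or ready_log_line is configured
}

// projectStart is an arbitrary reference instant used by the demo below.
var projectStart = time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)

// at returns a pointer to the instant d after base.
func at(base time.Time, d time.Duration) *time.Time {
	t := base.Add(d)
	return &t
}

// annotate builds the "@offset +duration" text for one process.
func annotate(projectStart time.Time, t timing) string {
	if t.start == nil {
		return "(not started)"
	}
	if t.ready == nil {
		// Launched but not ready: only the launch offset is known.
		return fmt.Sprintf("@%v (not ready)", t.start.Sub(projectStart))
	}
	out := fmt.Sprintf("@%v", t.ready.Sub(projectStart))
	if t.hasReadinessSignal {
		// The launch-to-ready duration is only meaningful with a readiness signal.
		out += fmt.Sprintf(" +%v", t.ready.Sub(*t.start))
	}
	return out
}

func main() {
	postgres := timing{
		start:              at(projectStart, 3*time.Millisecond),
		ready:              at(projectStart, 270148*time.Millisecond),
		hasReadinessSignal: true,
	}
	setup := timing{start: at(projectStart, 4*time.Millisecond)}
	fmt.Println("postgres", annotate(projectStart, postgres)) // postgres @4m30.148s +4m30.145s
	fmt.Println("setup", annotate(projectStart, setup))       // setup @4ms (not ready)
	fmt.Println("app", annotate(projectStart, timing{}))      // app (not started)
}
```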
To support this, three optional timestamp fields are added to ProcessState:
- process_start_time: when the process first entered a launched/running
state.
- process_ready_time: when the process became ready (readiness probe
succeeded, ready_log_line matched, or -- for processes without any
readiness signal -- equal to process_start_time).
- process_end_time: when the process ended (completed / errored /
terminated / skipped).
These fields are populated in src/app/process.go from the existing
lifecycle hook points (run(), setProcHealth(), handleOutput() ready-log
match, and onProcessEnd()), are surfaced through the existing
GET /processes endpoint, and are JSON-omitempty so existing clients are
unaffected.
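To illustrate why optional pointer fields with omitempty leave the wire format unchanged for processes that have no timestamps yet, here is a small sketch. The `processState` struct is a trimmed, hypothetical stand-in (the `name` field and its tag are assumptions); only the three timestamp JSON names come from this change.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// processState is a trimmed, hypothetical stand-in for ProcessState.
type processState struct {
	Name             string     `json:"name"`
	ProcessStartTime *time.Time `json:"process_start_time,omitempty"`
	ProcessReadyTime *time.Time `json:"process_ready_time,omitempty"`
	ProcessEndTime   *time.Time `json:"process_end_time,omitempty"`
}

// demoStart is an arbitrary instant used by the demo below.
var demoStart = time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)

// encode marshals a state to JSON and returns it as a string.
func encode(s processState) string {
	b, err := json.Marshal(s)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// Nil pointers are omitted entirely, so older clients see the same keys as before.
	fmt.Println(encode(processState{Name: "app"}))
	// A set timestamp appears as an RFC 3339 string.
	fmt.Println(encode(processState{Name: "postgres", ProcessStartTime: &demoStart}))
}
```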
The critical-chain command itself is implemented purely on the client
side -- it composes the existing /project/state, /processes, and /graph
endpoints and walks the dependency graph (sorting siblings by descending
ready-time, like systemd does), so no new server-side endpoints are
introduced.
If process names are passed as positional arguments, only those processes
and their dependency sub-chains are printed; otherwise every top-level
process (plus any isolated processes) is printed.
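The client-side walk described above can be sketched roughly as follows. This is not the code from this change: `node`, `renderTree`, `offsetOf`, and `demoGraph` are hypothetical names, the graph is assumed to be acyclic, and timing annotations are left out so that only the sibling ordering and tree drawing are shown.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"strings"
	"time"
)

// node is a hypothetical stand-in for a dependency-graph node.
type node struct {
	name      string
	dependsOn []*node
}

// unknownOffset sorts processes with no timing data ahead of all others.
const unknownOffset = time.Duration(math.MaxInt64)

// offsetOf returns the ready offset for name, or unknownOffset if none is recorded.
func offsetOf(offsets map[string]time.Duration, name string) time.Duration {
	if d, ok := offsets[name]; ok {
		return d
	}
	return unknownOffset
}

// renderChildren writes the dependencies of n, slowest first, then recurses.
func renderChildren(sb *strings.Builder, n *node, offsets map[string]time.Duration, prefix string) {
	deps := append([]*node(nil), n.dependsOn...) // copy so the graph itself is not reordered
	sort.SliceStable(deps, func(i, j int) bool {
		return offsetOf(offsets, deps[i].name) > offsetOf(offsets, deps[j].name)
	})
	for i, d := range deps {
		branch, next := "├─", prefix+"│ "
		if i == len(deps)-1 {
			branch, next = "└─", prefix+"  "
		}
		fmt.Fprintf(sb, "%s%s%s\n", prefix, branch, d.name)
		renderChildren(sb, d, offsets, next)
	}
}

// renderTree renders root followed by its dependency sub-chains.
func renderTree(root *node, offsets map[string]time.Duration) string {
	var sb strings.Builder
	sb.WriteString(root.name + "\n")
	renderChildren(&sb, root, offsets, "")
	return sb.String()
}

// demoGraph builds a graph shaped like the example output above.
func demoGraph() (*node, map[string]time.Duration) {
	postgres := &node{name: "postgres"}
	redis := &node{name: "redis"}
	setup := &node{name: "setup"}
	migrations := &node{name: "migrations", dependsOn: []*node{setup, postgres}}
	app := &node{name: "app", dependsOn: []*node{setup, redis, migrations, postgres}}
	offsets := map[string]time.Duration{
		"postgres": 270148 * time.Millisecond,
		"redis":    130446 * time.Millisecond,
		"setup":    4 * time.Millisecond,
	}
	return app, offsets
}

func main() {
	root, offsets := demoGraph()
	fmt.Print(renderTree(root, offsets))
}
```

Copying the dependency slice before sorting keeps the walk read-only with respect to the graph, which matters if the same node is reached through several parents.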
F1bonacc1
left a comment
Thanks @ryantm,
This can be a super useful feature for all PC users.
I left a few minor comments.
Please run make testrace after you address those, as there is one potential race condition that I flagged.
Do you think you can add some tests to cover the most critical sections?
BTW, I like it that you reused all the existing APIs, especially the graph!
-    out := ""
-    for i, p := range parts {
-        if i > 0 {
-            out += " "
-        }
-        out += p
-    }
-    return out
+    return strings.Join(parts, " ")
-    Run: func(cmd *cobra.Command, args []string) {
-        if len(args) == 0 {
-            _ = cmd.Help()
-            os.Exit(0)
-        }
-    },
Cobra auto-prints help when a parent command has no Run and no subcommand is provided.
-func readyOffsetForSort(s *types.ProcessState, projectStart time.Time) time.Duration {
-    if s == nil {
-        return time.Duration(1<<62 - 1)
-    }
-    if s.ProcessReadyTime != nil {
-        return s.ProcessReadyTime.Sub(projectStart)
-    }
-    if s.ProcessStartTime != nil {
-        return s.ProcessStartTime.Sub(projectStart)
-    }
-    return time.Duration(1<<62 - 1)
-}
+func readyOffsetForSort(s *types.ProcessState, projectStart time.Time) time.Duration {
+    const unreadySortOrder = time.Duration(math.MaxInt64)
+    if s == nil {
+        return unreadySortOrder
+    }
+    if s.ProcessReadyTime != nil {
+        return s.ProcessReadyTime.Sub(projectStart)
+    }
+    if s.ProcessStartTime != nil {
+        return s.ProcessStartTime.Sub(projectStart)
+    }
+    return unreadySortOrder
+}
    for _, name := range args {
        node, ok := graph.AllNodes[name]
        if !ok {
            fmt.Fprintf(os.Stderr, "unknown process (or process has no dependencies): %s\n", name)
That's true for the graph, but if the user does name an existing isolated process, the CLI could still render a timings line for it (you already do this when len(args) == 0). Consider:
    node, ok := graph.AllNodes[name]
    if !ok {
        if _, exists := stateByName[name]; exists {
            node = &types.DependencyNode{Name: name}
        } else {
            fmt.Fprintf(os.Stderr, "unknown process: %s\n", name)
            os.Exit(1)
        }
    }
    roots = append(roots, node)

We should protect this with stateMtx, as you already do in other places. The easiest fix is to reuse setProcHealth.
-    p.procState.Health = types.ProcessHealthReady
-    if p.procState.ProcessReadyTime == nil {
-        now := time.Now()
-        p.procState.ProcessReadyTime = &now
-    }
+    p.setProcHealth(types.ProcessHealthReady)
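For context, the pattern behind this suggestion might look like the sketch below: every write to health and the ready timestamp goes through one method that holds the mutex, and the timestamp is recorded only the first time. The `proc` type, its fields, and the string health value are hypothetical simplifications, not the project's actual Process type or setProcHealth implementation.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// proc is a hypothetical, trimmed stand-in for the process type in this change.
type proc struct {
	stateMtx  sync.Mutex
	health    string
	readyTime *time.Time
}

// setProcHealth updates health under stateMtx and records the first
// instant at which the process became ready.
func (p *proc) setProcHealth(health string) {
	p.stateMtx.Lock()
	defer p.stateMtx.Unlock()
	p.health = health
	if health == "Ready" && p.readyTime == nil {
		now := time.Now()
		p.readyTime = &now
	}
}

// readyTimeStable marks a process ready once, then again from n goroutines,
// and reports whether the first recorded ready time was preserved.
func readyTimeStable(n int) bool {
	p := &proc{}
	p.setProcHealth("Ready")
	p.stateMtx.Lock()
	first := p.readyTime
	p.stateMtx.Unlock()

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			p.setProcHealth("Ready") // e.g. a probe and a log-line match racing
		}()
	}
	wg.Wait()

	p.stateMtx.Lock()
	defer p.stateMtx.Unlock()
	return first != nil && p.readyTime == first
}

func main() {
	fmt.Println(readyTimeStable(8))
}
```

Running a sketch like this with the race detector enabled (`go run -race`) is one way to check that no unsynchronized write remains.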



Note this code was generated with AI assistance.
Summary
Adds a new top-level 'analyze' command, with a first subcommand 'analyze critical-chain', inspired by 'systemd-analyze critical-chain'.
It prints a tree of processes from the ones nothing depends on, down
through their dependencies, annotated with startup timings.
For each process two times are shown:
- @<offset>: time after the project started that the process became ready (or was launched, for processes with no readiness signal).
- +<duration>: time the process spent between launch and becoming ready. Only shown for processes with a readiness probe / liveness probe / 'ready_log_line'.
Example
You can also restrict the output to a specific process (and its sub-chain):
What changed
- src/types/process.go: three optional *time.Time fields on ProcessState (process_start_time, process_ready_time, process_end_time). All are omitempty, so the JSON wire format is backward-compatible for existing clients.
- src/app/process.go: populates the new timestamps at existing lifecycle hook points:
  - run(): sets ProcessStartTime. For processes with no readiness signal (no readiness probe and no ready_log_line) this is also used as ProcessReadyTime.
  - setProcHealth(Ready): sets ProcessReadyTime when a readiness probe succeeds.
  - handleOutput(): sets ProcessReadyTime when ready_log_line matches.
  - onProcessEnd(): sets ProcessEndTime.
- src/cmd/analyze.go: new parent 'analyze' Cobra command.
- src/cmd/analyze_critical_chain.go: 'critical-chain [process...]' subcommand. Implemented purely on the client side: composes the existing /project/state, /processes, and /graph endpoints, walks the dependency graph, and sorts siblings by descending ready-time (same order systemd-analyze uses). No new API endpoints.
Motivation
When a process-compose project has many processes with deep dependency
chains, it can be hard to see which process is the startup bottleneck.
'process list' shows status and age, but doesn't show how long each process took to reach the state that unblocks its dependents, and
doesn't walk the dependency graph.
'analyze critical-chain' answers the question "what was the slow path?" at a glance.
Test plan
- 'go build -v -o process-compose .' on Go 1.26.
- 'go vet ./...' clean.
- [...] readiness-probe / ready_log_line / no-probe processes; confirmed:
  - [...] systemd-analyze.
  - @ and + values match the expected wall-clock deltas.
  - [...] (not ready) / (not started) and don't crash the output.
  - [Skipped], [Error], [Restarting], [Launching], etc. status annotations are surfaced.
  - [...] sub-chains.
Backward compatibility
The new ProcessState fields are omitempty pointers, so existing API consumers see no difference.
Open questions
Happy to iterate on any of these:
- 'analyze critical-chain' mirrors systemd-analyze. Open to 'analyze chain', 'analyze timing', etc. if preferred.
- [...] fatih/color like other commands ('state', 'list'). Can add a --no-color flag / respect NO_COLOR if desired.
- [...] -o json mode would be a straightforward follow-up if there's interest.