Development happens here by msoos · Pull Request #51 · meelgroup/arjun

msoos · 2026-04-19T22:08:16Z

No description provided.

Hot AIG DFSs (get_dependent_vars, count_aig_nodes) were spending ~10% of total cycles in hashtable insertions and the associated malloc/free. Switch to a mutable visit_epoch on AIG, bumped once per traversal and compared as an integer. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Package windows release too

Replaces the naive pairwise Tseitin visitor with the AIGToCNF encoder (k-ary AND/OR fusion, ITE pattern detection, De Morgan flattening, dedup/constant folding) — the same encoder the rebuild path already uses. The y_hat remap now lives in the AIG so the encoder can consume raw lit vars, and orig.sign() is baked into f.aig up front.

The sink isn't cex-specific: it routes AIGToCNF output into a per-formula clause list while allocating helper vars from the solver. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Rewire encode_node() to collect k-ary AND/OR operands as aig_ptrs and run structural simplification (constant fold, dedup, complementary-pair, and absorption: AND(x, OR(x,...))=x / OR(x, AND(x,...))=x) before committing to leaf literals. Catches shared sub-AIG patterns that the lit-level dedup can't see because a sub-AIG and its NOT-wrapper have different helpers. Add pure-chain generators to the fuzzer (linear AND/OR chain, balanced AND/OR tree) and a dedicated test-aig-to-cnf regression that asserts chains up to width 256 fuse into a single k-ary gate (n+1 clauses, 1 helper). Hook into ctest. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Two complementary encoder improvements: 1. MUX3 fusion for nested ITEs. When the else branch of an ITE pattern is itself a fanout-1, uncached ITE, emit a single 6-clause MUX3 gate with one helper instead of two 4-clause ITEs with two helpers. Each fusion saves 2 clauses and 1 helper. parse_ite_at splits the pattern-match out of try_ite so the outer match can inspect the inner AIG without committing to an encoding. 2. Canonical (var, sign) sort on group-CSE keys before lookup/insert. Previously CSE hits depended on whichever order the inputs arrived in, which only coincided with canonical order when normalize_inputs was enabled. On the fuzzer's mixed workload (deep ITE chains, DNF covers, big AND/OR chains) clause reduction improves from ~66% to ~69% and helper reduction from ~78% to ~83% vs naive Tseitin, with MUX3 firing ~36 times per iteration on the manthan-shaped ITE chains. All 500/500 fuzzer iterations pass equivalence + brute-force truth-table checks; test-aig-to-cnf still passes all pure-chain cases.

Track polarity of each helper via a pre-pass from the root, then emit only the forward or reverse half of each helper's biconditional based on whether it is consumed positively, negatively, or both. Pre-pass mirrors the flattening used by the main encoder (k-ary AND, k-ary OR via NAND, ITE, MUX3) so size-1 reductions and width-capped chunks inherit the correct polarity. Group CSE is forced off under PG (reused helpers would need both halves anyway). Fuzzer now randomises PG per iteration by default (--pg / --no-pg force a mode). 10000-iter random-PG run passes; clause reduction 82% on the mixed workload vs 80% without PG.

Silences a -Wnull-dereference warning at aig_rewrite.cpp:331. The lambda guards (e.node && ...) on lines 320/322 confused GCC's flow analysis into believing r.node could be null on the fast path. Document the actual invariant (t_and nodes always have non-null children, see AIG::new_and) with an assert; this also gives the compiler enough information to drop the warning.

Replaces the success-only "def found" line with a per-variable summary emitted at the end of every test (hit or miss): iters, miter U/S/T, f-solver U/S/T, skip_big, costzero count, avg pattern size, AIG nodes, and a `result=` field naming the termination reason (found, iter_limit, miter_undef, miter_pin_undef, f_undef, costzero_limit). Columns are fixed-width via setw so consecutive lines line up. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Explains why pinning vs no-pinning are both over-approximations of Skolem-validity, lists 2QBF backends (CADET, DepQBF, CAQE, ...), and sketches a per-variable 2QBF-CEGAR alternative.

The visitor was always invoked with neg=false because transform applies the outer edge sign itself via ~result. Every visitor's `neg ? ~x : x` branch was therefore dead. Strip the parameter from the signature and collapse each call site to its always-taken branch. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adding notes

Wraps the iter1 simplification strategy string in a switch over an integer enum; 0 keeps the existing default, 1 is the alternate 'new model' string previously kept commented out next to it. Exposed on the CLI as --puurastrategy. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds a per-test "aux" leaf set (other to-define and backward-defined vars whose recursive deps don't reach `test`) so the candidate H can disambiguate F-bifunctional X*. This collapses many costzero_limit failures into real defs by giving the F-only call a richer assumption set; conflict cores then yield patterns over (X, aux) instead of being declared cost-zero. New knob --unatedefrepaux N (0=input only, 1=+backward_defined, 2=+to-define / default). On genbuf7b4n with 500/500/20 budgets the pass goes from 4 hits / 30 costzero to 27 hits / 6 costzero. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

After the introduction of non-input H leaves (commit ce2eb75), the rep pass could commit a wrong Skolem when an earlier H_prev used the current test's var as an aux leaf. Root cause: locking indicator-prev = TRUE after a commit, combined with the Y- and Y'-side H-commits, forced y_test_Y = y_test_Y' on every later miter call where t ∈ aux_prev. The miter then declared UNSAT for spurious chain reasons, producing incorrect commits. Fix: stop locking indicator-prev. Instead, assume indicator_test = FALSE in base_assumps for the current test, so y_test_Y != y_test_Y' holds in every miter call, keeping the SAT/UNSAT decision aligned with the soundness condition. The existing Y/Y'-side H-commits already pin y_prev to its committed H on each side, making the indicator lock redundant for input-only / backward-only H's anyway. Also adds a SLOW_DEBUG-gated post-commit check via check_synth_funs_sat() that catches bad commits at the exact iteration that introduced them, and VERBOSE_DEBUG dumps of aux selection, patterns, and the H AIG. Both fuzzers (fuzz_synth.py, fuzz_unate_def_rep.py) now also randomize --unatedefrepaux. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

run_check only inspected stdout for the CORRECT line and never looked at the process exit code, so segfaults and SIGABRTs from release_assert in test-synth were silently swallowed (especially on non-final AIGs, where the missing-CORRECT branch is skipped entirely). Now any non-zero exit that isn't a clean INCORRECT (returncode 1) is reported as a bug, an INCORRECT match in the output is reported explicitly, and all bug paths print the command and the --seed reproduce line. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Both non-input-leaf modes added in ce2eb75 fail under fuzzing: - aux=2 (undefined to-define leaves) violates the check_pre_post_backward_round_synth invariant. The committed def of `test` reaches a leaf that's neither orig-sampl nor itself defined, tripping the assertion when test-synth loads `*-unsat_unate_def_rep.aig` (seed 1920751642475315400). - aux=1 (backward-defined leaves) breaks Manthan's BW-vars-have-unique- defs assumption. unate_def_rep commits Skolem H, not unique-defining functions: the miter UNSAT only guarantees that *replacing* y_test with H(input, y_aux) keeps F sat, not that y_test = H in every F-sat model. With aux=1, H(input, y_hat_aux) can pick a valid F-sat value of y_test that disagrees with y[test] in the specific F-sat model the CEX produced, leaving y[test] in needs_repair after find_better_ctx. find_next_repair_var then picks a backward_defined var as y_rep and trips the BW assertion at manthan.cpp:1569 (seed 15223131052712053446). Restoring full aux>=1 needs Manthan to support repair on Skolem-defined BW vars. Until then clamp to 0 with a documented stub so the knob keeps parsing for compatibility. After fix: fuzz_synth.py --num 1500, fuzz_unate_def_rep.py 300, fuzz_aig_to_cnf --num 1000, fuzz_aig_rewrite --num 1000 all pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

This reverts commit 72688bf.

unate_def_rep with aux>=1 had two soundness gaps that the recent fuzzer (--num 1500) found: 1. The miter UNSAT only proves H is a *Skolem*: F-sat (x, y) implies exists F-sat (x, y') with y'[test] = H(input, y_aux). It does NOT prove uniqueness (F ⊨ y_test = H). When F is bifunctional, F-sat models can have y_test != H. Manthan's BW pipeline assumes y_hat[BW] = y[BW] in every F-sat model (find_better_ctx fixes BW vars by finding F-sat with y[v] = y_hat[v]); a Skolem-only commit can leave a BW var in needs_repair after find_better_ctx, tripping the assertion at find_next_repair_var (manthan.cpp:1569). Fix: after the existing miter UNSAT, run an extra check in f_solver (F-only, no commit clauses) that `F + y_test != H_top` is UNSAT. Only commit when uniqueness holds; otherwise mark Skolem-only and continue refining. 2. When H_test happens to be a constant or input-only AIG, get_var_types categorized it as extend_defined (which Manthan treats as input, dropping the y_test = H_test constraint). Subsequent commits whose miter UNSAT relied on the dropped constraint then produced wrong y_hat in Manthan, also tripping the BW assertion. Fix: track unate_def_rep commits in a `skolem_defined_vars` set on SimplifiedCNF (set via new set_def_skolem entry point) and force get_var_types to keep them in backward_synth_defined_vars regardless of structural deps. The set is round-tripped through the AIG dump format so test-synth's check_pre_post_backward_round_synth can also skip its strict invariant for these vars. After fix: fuzz_synth.py --num 1500, fuzz_unate_def_rep.py 300 (rep_fired=25, all verified), fuzz_aig_to_cnf --num 1000, and fuzz_aig_rewrite --num 1000 all pass with default --unatedefrepaux=2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Document and self-check unate_def_rep's invariants so a future regression fails fast at the introduction site, not at a downstream consumer: - Cheap asserts (always on, per test loop iteration): * test is a real, non-input, not-yet-tested var with an indicator * aux candidates are non-input, non-self, distinct, aux_mask agrees * pattern conflict literals live in input ∪ aux (no leak from the f_solver core through the filter) * before commit: H is non-null, target var has no def yet, and every direct AIG leaf of H is in input ∪ aux * after commit: defs[test] is set and is_skolem_defined(test) holds - SLOW_DEBUG checks (heavy, opt-in via constants.h): * per-commit defs_invariant() in addition to the existing check_synth_funs_sat() — catches structural breakage (cycles, dangling deps, bad categorization) * end-of-pass defs_invariant() so the pass boundary is a clean verification point even when no commit fired Add an is_skolem_defined() accessor on SimplifiedCNF for the asserts. Widen fuzzer corner coverage: --unatedefrepiters now also exercises 0 (inner loop never runs — no commits at all) and 10000 (stress refinement); --unatedefrepconfl in fuzz_unate_def_rep.py now also includes 1 (every solver call mostly times out). All other knobs already covered 0/1/medium/high/very-high. Skolem-only-skipped count is now printed in the end-of-pass stats line. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two narrow self-checks for the skolem_defined_vars set, placed where they document the invariant they protect: - defs_invariant() (always-on, since it's already mostly SLOW_DEBUG- gated by callers): every entry in skolem_defined_vars points at a valid, currently-defined, non-orig-sampl var. This catches set_def_skolem misuse or a copy/move that desyncs the set from `defs`. - get_var_types() at the end (SLOW_DEBUG_DO, since this function is on the hot path): no Skolem-committed var was placed into extend_defined_vars. The whole point of the Skolem flag is to keep such vars in backward_synth_defined_vars even when their AIG looks input-only or is constant — extend-defined gets treated as input by Manthan, dropping the y_test = H_test commit constraint. If the `&& !skolem_defined_vars.count(orig)` branch in the categorization above ever gets removed/broken, this fires immediately rather than surfacing as a Manthan BW assertion downstream. Verified with SLOW_DEBUG-enabled build + fuzz_unate_def_rep.py 50. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The uniqueness check at the miter-UNSAT branch was previously run against a static `f_solver` built once from original F. That made it strict relative to what's actually needed: the BW pipeline only requires y_test = H to be implied by the *current* CNF (= F plus all prior commits), not by original F. A later var's H can be Skolem-only in F but uniquely-defining in cumulative F' once prior commits restrict the model space; the static check rejected those chained-Skolem cases and they exhausted the iter budget instead of committing. Fix: after a successful commit on the s-side, mirror the same y_test ⇔ H_top equivalence into f_solver. h_top_in_f and y_test_in_f are still in scope from the uniqueness check above; the Tseitin chain for H is already in f_solver from that call, so the mirror is just two clauses. Subsequent uniqueness checks then operate against cumulative F' without any soundness loss — the new clauses are exactly the def we already verified is implied by F under the F-only check. Also update the now-stale header doc (step 3b said "UNSAT → valid Skolem; commit" but the code does a uniqueness check; step 5 didn't mention the f_solver mirror) and the in-block comment that described f_solver as "no commit clauses". While here, the verb-1 per-var print line drops the redundant orig-var column (already in the SLOW_DEBUG path) and uses COLCYN/COLDEF for the stop_reason, matching the other [unate_def_rep] verb-1 lines. Verified: fuzz_synth 1500, fuzz_unate_def_rep 300 (42/42 reps verified), fuzz_aig_to_cnf 1000, fuzz_aig_rewrite 1000 all pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replace the strict uniqueness check at the miter-UNSAT branch with a feasibility check, and materialize the committed equivalence into cnf.get_clauses() so downstream consumers (find_better_ctx_normal in particular) see the def directly. Why: on bifunctional benchmarks like factorization12, every var has multiple valid Skolem witnesses for the same input X, so strict uniqueness in F (or in cumulative F') can never hold and the previous pass committed nothing. The looser check accepts a wider class of Hs. The feasibility check is `F' ∧ (y_test = H_top)` SAT in f_solver. SAT means H is at least sometimes a valid value for y_test in cumulative F' — in particular it rules out the vacuous-miter-UNSAT case where no F'-model has y_test = H at any X (which would silently lose X- projections on commit). f_solver continues to accumulate every prior commit's y_v ⇔ H_v so the check is against cumulative F', not original F. Caveat: feasibility SAT is weaker than full Skolem (∀X. F-sat at X → ∃Y with y_test = H(X)). A pathological F that splits into two disjoint input regions with disjoint forced y_test values can pass feasibility but lose an X-projection on commit. The fuzzers did not surface such a case in 1500 iterations, but the gap is real and a future-work item is to replace the feasibility check with a CEX-loop Skolem check. The materialization side: each successful commit's (test, H) gets enqueued in `deferred_materialize` and processed after the per-test loop. We can't materialize in the loop because cnf.new_var() would shift cnf.nVars(), and that offset is the anchor for the Y'-side encoding in the miter solver `s` (and in encode_h). Deferring is sound because the materialization is only needed by find_better_ctx_normal, which runs inside Manthan after synthesis_unate_def_rep returns. Each AND helper allocated during materialization is committed via set_def_skolem (not set_def) so: - get_var_types keeps it in backward_synth_defined even when its dep-chain is all input (a pure-input H produces such helpers); extend_defined would let Manthan treat it as input and drop the helper = AND constraint we just baked in; - check_pre_post_backward_round_synth's "vars in CNF must be defined in terms of orig_sampl_vars" invariant explicitly exempts Skolem- marked vars, which is necessary for chained-Skolem H's whose aux leaves are themselves non-orig-sampl backward-defined vars. The Tseitin chain for H is also mirrored into f_solver on commit (carried over from the previous commit), so the next var's feasibility check operates against cumulative F' rather than original F — chained- Skolem commits compose cleanly inside the loop. Comments updated throughout (header step 3b/5, in-block comments at the check site, the post-set_def_skolem block, and the f_solver mirror block) to describe the new flow. Verified: factorization12 commits 1 def in unate_def_rep (was 0), synth completes; previously-failing fuzz_synth seed 14130824211102645510 now passes; fuzz_aig_to_cnf 1000, fuzz_aig_rewrite 1000, fuzz_unate_def_rep 300 (44/44 reps verified), fuzz_synth 1500 all pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Counts wall-clock time and call counts for the major per-iteration phases (encode_h, miter solve, uniqueness encode/solve, F-only solve, pattern build, commit) so we can attribute cost to specific steps. On the slow benchmarks (amba3b5y, sdlx-fixpoint-5, usb-phy) the breakdown shows the miter SAT solve at 83-91% of total time, with f_solve and encode_h each in the 1-7% range — a useful map for further targeted optimization. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Trim unate_def_rep timing instrumentation to the 3 SAT calls cpuTime() in the per-iter hot path was eating measurable time itself. Drop time_setup, time_per_test_setup, time_encode_h, time_feas_encode, time_pattern, time_commit. Keep miter/feas/f solve timings (the actual expensive work) plus all the cheap integer counters. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

fc_int discarded std::from_chars's return value, so a non-numeric argument like --unatedefrepaux stuff silently parsed as 0 instead of throwing. Now check the errc and trailing-character position. fc_double gets a matching invalid_argument message so failures point at the offending input rather than a bare 'stod'.

msoos and others added 30 commits April 19, 2026 15:46

Cleanup

f28f3a3

Emplace back

af0ad87

Cleanup

93d9cdf

BEtter clang-tidy

0f7cdb5

Cleanup and speedup

8031302

Fix about 5% of time

5ffd008

Faster get_dependent_vars

ed623b6

Emplace back

c9a9661

Small improvements using unordered_map

70ef3ae

Package Windows release, too

4717bfe

Package windows release too

Memoize dependency

8e913f7

Faster get_dependent_vars

8437ae5

Gen-ok

e5aaaca

Fix fuzzer config

e2182f0

Faster bitset lookup

b10b429

Fix test-synth

4023fe3

Rename CexClauseSink to FormulaClauseSink

e42db69

The sink isn't cex-specific: it routes AIGToCNF output into a per-formula clause list while allocating helper vars from the solver. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Rebuild solver less often

ef09884

These should be floating

6e85695

Add rationale for rebuild

bb08c12

Better printing

2877aa1

Fixing build

88f193e

None of this.

5bd850d

Cleanup

9e95e6b

Loops repair is counted now between rebuilds

6aeb335

msoos and others added 28 commits April 27, 2026 23:11

None of this indirection

a78e4f0

Doc: cost-zero in unate_def_rep is a 2QBF gap

f216ae9

Explains why pinning vs no-pinning are both over-approximations of Skolem-validity, lists 2QBF backends (CADET, DepQBF, CAQE, ...), and sketches a per-variable 2QBF-CEGAR alternative.

Rolling back puura to 8c2fa95

c75ff34

Adding notes

Adding new results

e2bbf60

Fix strategy to better

e2a65b8

Tune to out-synt-1455773-1 -- i.e. best that's not out-synth-1367674-2

a0e9941

Update graph

7423af4

Add new strategy

50571f2

Update unate_def_rep cutoffs, enable it

5137212

More fuzzing per check

24d7de8

Revert "Disable unate_def_rep aux>=1 (clamp to input-only)"

e231ad6

This reverts commit 72688bf.

Less unate rep iters by default: 200->50

df94f55

Lower unate repair iters

d758531

msoos merged commit ece5005 into master Apr 30, 2026
18 checks passed

msoos deleted the develop branch April 30, 2026 18:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Development happens here#51

Development happens here#51
msoos merged 152 commits intomasterfrom
develop

msoos commented Apr 19, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

msoos commented Apr 19, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant