Conversation
Hot AIG DFSs (get_dependent_vars, count_aig_nodes) were spending ~10% of total cycles in hashtable insertions and the associated malloc/free. Switch to a mutable visit_epoch on AIG, bumped once per traversal and compared as an integer. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Package windows release too
Replaces the naive pairwise Tseitin visitor with the AIGToCNF encoder (k-ary AND/OR fusion, ITE pattern detection, De Morgan flattening, dedup/constant folding) — the same encoder the rebuild path already uses. The y_hat remap now lives in the AIG so the encoder can consume raw lit vars, and orig.sign() is baked into f.aig up front.
The sink isn't cex-specific: it routes AIGToCNF output into a per-formula clause list while allocating helper vars from the solver. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Rewire encode_node() to collect k-ary AND/OR operands as aig_ptrs and run structural simplification (constant fold, dedup, complementary-pair, and absorption: AND(x, OR(x,...))=x / OR(x, AND(x,...))=x) before committing to leaf literals. Catches shared sub-AIG patterns that the lit-level dedup can't see because a sub-AIG and its NOT-wrapper have different helpers. Add pure-chain generators to the fuzzer (linear AND/OR chain, balanced AND/OR tree) and a dedicated test-aig-to-cnf regression that asserts chains up to width 256 fuse into a single k-ary gate (n+1 clauses, 1 helper). Hook into ctest. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two complementary encoder improvements: 1. MUX3 fusion for nested ITEs. When the else branch of an ITE pattern is itself a fanout-1, uncached ITE, emit a single 6-clause MUX3 gate with one helper instead of two 4-clause ITEs with two helpers. Each fusion saves 2 clauses and 1 helper. parse_ite_at splits the pattern-match out of try_ite so the outer match can inspect the inner AIG without committing to an encoding. 2. Canonical (var, sign) sort on group-CSE keys before lookup/insert. Previously CSE hits depended on whichever order the inputs arrived in, which only coincided with canonical order when normalize_inputs was enabled. On the fuzzer's mixed workload (deep ITE chains, DNF covers, big AND/OR chains) clause reduction improves from ~66% to ~69% and helper reduction from ~78% to ~83% vs naive Tseitin, with MUX3 firing ~36 times per iteration on the manthan-shaped ITE chains. All 500/500 fuzzer iterations pass equivalence + brute-force truth-table checks; test-aig-to-cnf still passes all pure-chain cases.
Track polarity of each helper via a pre-pass from the root, then emit only the forward or reverse half of each helper's biconditional based on whether it is consumed positively, negatively, or both. Pre-pass mirrors the flattening used by the main encoder (k-ary AND, k-ary OR via NAND, ITE, MUX3) so size-1 reductions and width-capped chunks inherit the correct polarity. Group CSE is forced off under PG (reused helpers would need both halves anyway). Fuzzer now randomises PG per iteration by default (--pg / --no-pg force a mode). 10000-iter random-PG run passes; clause reduction 82% on the mixed workload vs 80% without PG.
Silences a -Wnull-dereference warning at aig_rewrite.cpp:331. The lambda guards (e.node && ...) on lines 320/322 confused GCC's flow analysis into believing r.node could be null on the fast path. Document the actual invariant (t_and nodes always have non-null children, see AIG::new_and) with an assert; this also gives the compiler enough information to drop the warning.
Replaces the success-only "def found" line with a per-variable summary emitted at the end of every test (hit or miss): iters, miter U/S/T, f-solver U/S/T, skip_big, costzero count, avg pattern size, AIG nodes, and a `result=` field naming the termination reason (found, iter_limit, miter_undef, miter_pin_undef, f_undef, costzero_limit). Columns are fixed-width via setw so consecutive lines line up. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Explains why pinning vs no-pinning are both over-approximations of Skolem-validity, lists 2QBF backends (CADET, DepQBF, CAQE, ...), and sketches a per-variable 2QBF-CEGAR alternative.
The visitor was always invoked with neg=false because transform applies the outer edge sign itself via ~result. Every visitor's `neg ? ~x : x` branch was therefore dead. Strip the parameter from the signature and collapse each call site to its always-taken branch. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adding notes
Wraps the iter1 simplification strategy string in a switch over an integer enum; 0 keeps the existing default, 1 is the alternate 'new model' string previously kept commented out next to it. Exposed on the CLI as --puurastrategy. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds a per-test "aux" leaf set (other to-define and backward-defined vars whose recursive deps don't reach `test`) so the candidate H can disambiguate F-bifunctional X*. This collapses many costzero_limit failures into real defs by giving the F-only call a richer assumption set; conflict cores then yield patterns over (X, aux) instead of being declared cost-zero. New knob --unatedefrepaux N (0=input only, 1=+backward_defined, 2=+to-define / default). On genbuf7b4n with 500/500/20 budgets the pass goes from 4 hits / 30 costzero to 27 hits / 6 costzero. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
After the introduction of non-input H leaves (commit ce2eb75), the rep pass could commit a wrong Skolem when an earlier H_prev used the current test's var as an aux leaf. Root cause: locking indicator-prev = TRUE after a commit, combined with the Y- and Y'-side H-commits, forced y_test_Y = y_test_Y' on every later miter call where t ∈ aux_prev. The miter then declared UNSAT for spurious chain reasons, producing incorrect commits. Fix: stop locking indicator-prev. Instead, assume indicator_test = FALSE in base_assumps for the current test, so y_test_Y != y_test_Y' holds in every miter call, keeping the SAT/UNSAT decision aligned with the soundness condition. The existing Y/Y'-side H-commits already pin y_prev to its committed H on each side, making the indicator lock redundant for input-only / backward-only H's anyway. Also adds a SLOW_DEBUG-gated post-commit check via check_synth_funs_sat() that catches bad commits at the exact iteration that introduced them, and VERBOSE_DEBUG dumps of aux selection, patterns, and the H AIG. Both fuzzers (fuzz_synth.py, fuzz_unate_def_rep.py) now also randomize --unatedefrepaux. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
run_check only inspected stdout for the CORRECT line and never looked at the process exit code, so segfaults and SIGABRTs from release_assert in test-synth were silently swallowed (especially on non-final AIGs, where the missing-CORRECT branch is skipped entirely). Now any non-zero exit that isn't a clean INCORRECT (returncode 1) is reported as a bug, an INCORRECT match in the output is reported explicitly, and all bug paths print the command and the --seed reproduce line. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Both non-input-leaf modes added in ce2eb75 fail under fuzzing: - aux=2 (undefined to-define leaves) violates the check_pre_post_backward_round_synth invariant. The committed def of `test` reaches a leaf that's neither orig-sampl nor itself defined, tripping the assertion when test-synth loads `*-unsat_unate_def_rep.aig` (seed 1920751642475315400). - aux=1 (backward-defined leaves) breaks Manthan's BW-vars-have-unique- defs assumption. unate_def_rep commits Skolem H, not unique-defining functions: the miter UNSAT only guarantees that *replacing* y_test with H(input, y_aux) keeps F sat, not that y_test = H in every F-sat model. With aux=1, H(input, y_hat_aux) can pick a valid F-sat value of y_test that disagrees with y[test] in the specific F-sat model the CEX produced, leaving y[test] in needs_repair after find_better_ctx. find_next_repair_var then picks a backward_defined var as y_rep and trips the BW assertion at manthan.cpp:1569 (seed 15223131052712053446). Restoring full aux>=1 needs Manthan to support repair on Skolem-defined BW vars. Until then clamp to 0 with a documented stub so the knob keeps parsing for compatibility. After fix: fuzz_synth.py --num 1500, fuzz_unate_def_rep.py 300, fuzz_aig_to_cnf --num 1000, fuzz_aig_rewrite --num 1000 all pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This reverts commit 72688bf.
unate_def_rep with aux>=1 had two soundness gaps that the recent fuzzer (--num 1500) found: 1. The miter UNSAT only proves H is a *Skolem*: F-sat (x, y) implies exists F-sat (x, y') with y'[test] = H(input, y_aux). It does NOT prove uniqueness (F ⊨ y_test = H). When F is bifunctional, F-sat models can have y_test != H. Manthan's BW pipeline assumes y_hat[BW] = y[BW] in every F-sat model (find_better_ctx fixes BW vars by finding F-sat with y[v] = y_hat[v]); a Skolem-only commit can leave a BW var in needs_repair after find_better_ctx, tripping the assertion at find_next_repair_var (manthan.cpp:1569). Fix: after the existing miter UNSAT, run an extra check in f_solver (F-only, no commit clauses) that `F + y_test != H_top` is UNSAT. Only commit when uniqueness holds; otherwise mark Skolem-only and continue refining. 2. When H_test happens to be a constant or input-only AIG, get_var_types categorized it as extend_defined (which Manthan treats as input, dropping the y_test = H_test constraint). Subsequent commits whose miter UNSAT relied on the dropped constraint then produced wrong y_hat in Manthan, also tripping the BW assertion. Fix: track unate_def_rep commits in a `skolem_defined_vars` set on SimplifiedCNF (set via new set_def_skolem entry point) and force get_var_types to keep them in backward_synth_defined_vars regardless of structural deps. The set is round-tripped through the AIG dump format so test-synth's check_pre_post_backward_round_synth can also skip its strict invariant for these vars. After fix: fuzz_synth.py --num 1500, fuzz_unate_def_rep.py 300 (rep_fired=25, all verified), fuzz_aig_to_cnf --num 1000, and fuzz_aig_rewrite --num 1000 all pass with default --unatedefrepaux=2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Document and self-check unate_def_rep's invariants so a future regression
fails fast at the introduction site, not at a downstream consumer:
- Cheap asserts (always on, per test loop iteration):
* test is a real, non-input, not-yet-tested var with an indicator
* aux candidates are non-input, non-self, distinct, aux_mask agrees
* pattern conflict literals live in input ∪ aux (no leak from the
f_solver core through the filter)
* before commit: H is non-null, target var has no def yet, and
every direct AIG leaf of H is in input ∪ aux
* after commit: defs[test] is set and is_skolem_defined(test) holds
- SLOW_DEBUG checks (heavy, opt-in via constants.h):
* per-commit defs_invariant() in addition to the existing
check_synth_funs_sat() — catches structural breakage (cycles,
dangling deps, bad categorization)
* end-of-pass defs_invariant() so the pass boundary is a clean
verification point even when no commit fired
Add an is_skolem_defined() accessor on SimplifiedCNF for the asserts.
Widen fuzzer corner coverage: --unatedefrepiters now also exercises 0
(inner loop never runs — no commits at all) and 10000 (stress
refinement); --unatedefrepconfl in fuzz_unate_def_rep.py now also
includes 1 (every solver call mostly times out). All other knobs already
covered 0/1/medium/high/very-high. Skolem-only-skipped count is now
printed in the end-of-pass stats line.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two narrow self-checks for the skolem_defined_vars set, placed where
they document the invariant they protect:
- defs_invariant() (always-on, since it's already mostly SLOW_DEBUG-
gated by callers): every entry in skolem_defined_vars points at a
valid, currently-defined, non-orig-sampl var. This catches
set_def_skolem misuse or a copy/move that desyncs the set from
`defs`.
- get_var_types() at the end (SLOW_DEBUG_DO, since this function is
on the hot path): no Skolem-committed var was placed into
extend_defined_vars. The whole point of the Skolem flag is to keep
such vars in backward_synth_defined_vars even when their AIG looks
input-only or is constant — extend-defined gets treated as input by
Manthan, dropping the y_test = H_test commit constraint. If the
`&& !skolem_defined_vars.count(orig)` branch in the categorization
above ever gets removed/broken, this fires immediately rather than
surfacing as a Manthan BW assertion downstream.
Verified with SLOW_DEBUG-enabled build + fuzz_unate_def_rep.py 50.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The uniqueness check at the miter-UNSAT branch was previously run against a static `f_solver` built once from original F. That made it strict relative to what's actually needed: the BW pipeline only requires y_test = H to be implied by the *current* CNF (= F plus all prior commits), not by original F. A later var's H can be Skolem-only in F but uniquely-defining in cumulative F' once prior commits restrict the model space; the static check rejected those chained-Skolem cases and they exhausted the iter budget instead of committing. Fix: after a successful commit on the s-side, mirror the same y_test ⇔ H_top equivalence into f_solver. h_top_in_f and y_test_in_f are still in scope from the uniqueness check above; the Tseitin chain for H is already in f_solver from that call, so the mirror is just two clauses. Subsequent uniqueness checks then operate against cumulative F' without any soundness loss — the new clauses are exactly the def we already verified is implied by F under the F-only check. Also update the now-stale header doc (step 3b said "UNSAT → valid Skolem; commit" but the code does a uniqueness check; step 5 didn't mention the f_solver mirror) and the in-block comment that described f_solver as "no commit clauses". While here, the verb-1 per-var print line drops the redundant orig-var column (already in the SLOW_DEBUG path) and uses COLCYN/COLDEF for the stop_reason, matching the other [unate_def_rep] verb-1 lines. Verified: fuzz_synth 1500, fuzz_unate_def_rep 300 (42/42 reps verified), fuzz_aig_to_cnf 1000, fuzz_aig_rewrite 1000 all pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the strict uniqueness check at the miter-UNSAT branch with a
feasibility check, and materialize the committed equivalence into
cnf.get_clauses() so downstream consumers (find_better_ctx_normal in
particular) see the def directly.
Why: on bifunctional benchmarks like factorization12, every var has
multiple valid Skolem witnesses for the same input X, so strict
uniqueness in F (or in cumulative F') can never hold and the previous
pass committed nothing. The looser check accepts a wider class of Hs.
The feasibility check is `F' ∧ (y_test = H_top)` SAT in f_solver. SAT
means H is at least sometimes a valid value for y_test in cumulative F'
— in particular it rules out the vacuous-miter-UNSAT case where no
F'-model has y_test = H at any X (which would silently lose X-
projections on commit). f_solver continues to accumulate every prior
commit's y_v ⇔ H_v so the check is against cumulative F', not original F.
Caveat: feasibility SAT is weaker than full Skolem (∀X. F-sat at X →
∃Y with y_test = H(X)). A pathological F that splits into two disjoint
input regions with disjoint forced y_test values can pass feasibility
but lose an X-projection on commit. The fuzzers did not surface such a
case in 1500 iterations, but the gap is real and a future-work item is
to replace the feasibility check with a CEX-loop Skolem check.
The materialization side: each successful commit's (test, H) gets
enqueued in `deferred_materialize` and processed after the per-test
loop. We can't materialize in the loop because cnf.new_var() would
shift cnf.nVars(), and that offset is the anchor for the Y'-side
encoding in the miter solver `s` (and in encode_h). Deferring is sound
because the materialization is only needed by find_better_ctx_normal,
which runs inside Manthan after synthesis_unate_def_rep returns.
Each AND helper allocated during materialization is committed via
set_def_skolem (not set_def) so:
- get_var_types keeps it in backward_synth_defined even when its
dep-chain is all input (a pure-input H produces such helpers);
extend_defined would let Manthan treat it as input and drop the
helper = AND constraint we just baked in;
- check_pre_post_backward_round_synth's "vars in CNF must be defined
in terms of orig_sampl_vars" invariant explicitly exempts Skolem-
marked vars, which is necessary for chained-Skolem H's whose aux
leaves are themselves non-orig-sampl backward-defined vars.
The Tseitin chain for H is also mirrored into f_solver on commit
(carried over from the previous commit), so the next var's feasibility
check operates against cumulative F' rather than original F — chained-
Skolem commits compose cleanly inside the loop.
Comments updated throughout (header step 3b/5, in-block comments at
the check site, the post-set_def_skolem block, and the f_solver mirror
block) to describe the new flow.
Verified: factorization12 commits 1 def in unate_def_rep (was 0),
synth completes; previously-failing fuzz_synth seed 14130824211102645510
now passes; fuzz_aig_to_cnf 1000, fuzz_aig_rewrite 1000,
fuzz_unate_def_rep 300 (44/44 reps verified), fuzz_synth 1500 all pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Counts wall-clock time and call counts for the major per-iteration phases (encode_h, miter solve, uniqueness encode/solve, F-only solve, pattern build, commit) so we can attribute cost to specific steps. On the slow benchmarks (amba3b5y, sdlx-fixpoint-5, usb-phy) the breakdown shows the miter SAT solve at 83-91% of total time, with f_solve and encode_h each in the 1-7% range — a useful map for further targeted optimization. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Trim unate_def_rep timing instrumentation to the 3 SAT calls cpuTime() in the per-iter hot path was eating measurable time itself. Drop time_setup, time_per_test_setup, time_encode_h, time_feas_encode, time_pattern, time_commit. Keep miter/feas/f solve timings (the actual expensive work) plus all the cheap integer counters. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fc_int discarded std::from_chars's return value, so a non-numeric argument like --unatedefrepaux stuff silently parsed as 0 instead of throwing. Now check the errc and trailing-character position. fc_double gets a matching invalid_argument message so failures point at the offending input rather than a bare 'stod'.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.