Global instructions for AI coding agents assisting with Exasim and related scientific computing projects.
This project involves:
• high-order finite element methods
• hybridizable discontinuous Galerkin (HDG) discretizations
• numerical PDE solvers
• GPU acceleration (CUDA / HIP / Kokkos)
• C++ HPC software
Agents must prioritize scientific correctness and numerical stability.
- Correctness
Never trade correctness for performance.
Numerical methods must preserve:
• conservation
• consistency
• stability
• dimensional correctness
If uncertain about a change that may affect correctness, the agent must explicitly state the uncertainty.
- Explicit Assumptions
All reasoning must clearly state:
• mathematical assumptions
• data layout assumptions
• memory ownership assumptions
• solver state assumptions
- GPU-aware implementation
GPU code must consider:
• memory bandwidth
• memory coalescing
• register pressure
• warp divergence
• kernel launch overhead
• host-device transfers
Prefer designs that maximize arithmetic intensity and minimize global memory traffic.
- Minimally invasive changes
Avoid refactoring unrelated code.
Changes must:
• preserve existing interfaces whenever possible
• avoid altering solver structure
• minimize code footprint
• maintain backward compatibility
- Clear verification steps
Every proposed implementation must include:
• edge case analysis
• numerical validation plan
• regression tests
• performance sanity checks
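
A regression test for a numerical change can be as simple as comparing a new solution vector against a stored reference. The sketch below is one possible form, not an existing Exasim utility: the name `max_rel_diff` and the `floor` parameter are hypothetical, and the acceptance tolerance is left to the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Largest relative difference between a computed vector and a reference.
// Entries of the reference smaller than `floor` in magnitude are compared
// against `floor` instead, to avoid dividing by (near) zero.
// Returns +infinity if the computed vector contains a NaN or Inf.
double max_rel_diff(const std::vector<double>& computed,
                    const std::vector<double>& reference,
                    double floor = 1e-14) {
    assert(computed.size() == reference.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < computed.size(); ++i) {
        if (!std::isfinite(computed[i])) {
            return std::numeric_limits<double>::infinity();
        }
        const double scale = std::max(std::fabs(reference[i]), floor);
        worst = std::max(worst, std::fabs(computed[i] - reference[i]) / scale);
    }
    return worst;
}
```

A test would then assert something like `max_rel_diff(u_new, u_ref) < tol`, with `tol` chosen from the expected discretization and round-off error rather than set to zero.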
The codebase implements HDG methods.
Key characteristics:
• element-local solves
• static condensation
• global trace system
• high-order polynomial basis
When modifying algorithms, maintain these structural properties.
The HDG solve typically involves:
- element-local residual and Jacobian assembly
- static condensation eliminating interior DOFs
- global trace system solve
- recovery of element solutions
Agents must not break this structure.
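
To make the four steps concrete, the sketch below applies static condensation to a single toy element with two interior DOFs and one trace DOF. It is an illustration only, not Exasim code: the block names `A`, `B`, `C`, `D`, the struct, and the function name are assumptions chosen for this example, and real element blocks are much larger and processed in batches.

```cpp
#include <cassert>
#include <cmath>

// Solution of one toy element: two interior unknowns and one trace unknown.
struct CondensedSolution {
    double u[2];   // interior DOFs
    double uhat;   // trace DOF
};

// y = M * x for a 2x2 column-major matrix M (entry (i,j) stored at M[i + 2*j]).
inline void apply2(const double M[4], const double x[2], double y[2]) {
    y[0] = M[0] * x[0] + M[2] * x[1];
    y[1] = M[1] * x[0] + M[3] * x[1];
}

// Solve [A B; C D] [u; uhat] = [f; g] by static condensation:
//   (D - C A^{-1} B) uhat = g - C A^{-1} f,   then   u = A^{-1} (f - B uhat).
CondensedSolution condensed_solve(const double A[4], const double B[2],
                                  const double C[2], double D,
                                  const double f[2], double g) {
    // Element-local step: invert the interior block A.
    const double det = A[0] * A[3] - A[2] * A[1];
    assert(det != 0.0);
    const double Ainv[4] = {A[3] / det, -A[1] / det, -A[2] / det, A[0] / det};

    double AinvB[2], Ainvf[2];
    apply2(Ainv, B, AinvB);
    apply2(Ainv, f, Ainvf);

    // Condensation: Schur complement and reduced right-hand side.
    const double S = D - (C[0] * AinvB[0] + C[1] * AinvB[1]);
    const double r = g - (C[0] * Ainvf[0] + C[1] * Ainvf[1]);
    assert(S != 0.0);

    CondensedSolution sol;
    // Trace solve (a scalar equation in this toy case).
    sol.uhat = r / S;
    // Recovery of the interior solution.
    sol.u[0] = Ainvf[0] - AinvB[0] * sol.uhat;
    sol.u[1] = Ainvf[1] - AinvB[1] * sol.uhat;
    return sol;
}
```

Substituting the result back into the uncondensed block system and checking that both residuals vanish to round-off is a cheap way to confirm that a change has not broken the condensation or recovery step.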
Follow Exasim conventions:
• arrays are typically column-major
• indexing is zero-based
• flattened arrays are common
• batched operations are preferred
Avoid introducing new data layouts unless necessary.
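
As an illustration of these conventions, the sketch below indexes a flattened, zero-based, column-major 3D array. The helper names and the dimension names `npe`, `ncu`, `ne` (nodes per element, solution components, elements) are assumptions for this example; check the actual array shapes in the code being modified before relying on them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Flat offset of entry (i, j, k) in a column-major array of shape (n1, n2, n3).
// The first index varies fastest in memory; all indices are zero-based.
inline std::size_t idx3(std::size_t i, std::size_t j, std::size_t k,
                        std::size_t n1, std::size_t n2) {
    return i + n1 * (j + n2 * k);
}

// Build a hypothetical (npe, ncu, ne) array in which every entry stores its
// own flat offset, which makes layout mistakes easy to spot in a debugger.
std::vector<double> make_offsets(std::size_t npe, std::size_t ncu, std::size_t ne) {
    std::vector<double> u(npe * ncu * ne);
    for (std::size_t k = 0; k < ne; ++k) {
        for (std::size_t j = 0; j < ncu; ++j) {
            for (std::size_t i = 0; i < npe; ++i) {
                const std::size_t p = idx3(i, j, k, npe, ncu);
                assert(p < u.size());
                u[p] = static_cast<double>(p);
            }
        }
    }
    return u;
}
```

With this layout, the innermost loop over `i` walks contiguous memory, which is also the access pattern that coalesces well when `i` maps to consecutive GPU threads.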
When implementing GPU kernels:
Prefer:
• batched dense linear algebra
• BLAS operations (cuBLAS / hipBLAS)
• data-parallel loops
• kernel fusion when beneficial
Avoid:
• excessive kernel launches
• uncoalesced memory access
• host-device synchronization
Consider using:
• batched GEMM
• batched triangular solves
• shared memory for small dense blocks
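
When a loop of small matrix products is replaced by a batched GEMM call, a plain CPU reference is useful for validating the GPU result. The sketch below is such a reference under assumed conventions (column-major blocks stored back to back with a fixed stride); the function name `batched_gemm_ref` is hypothetical and this is not a substitute for the cuBLAS or hipBLAS routines themselves.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference batched product C[b] = A[b] * B[b] for b = 0 .. batch-1.
// A[b] is m x k, B[b] is k x n, C[b] is m x n. Every block is column-major
// and blocks are stored contiguously, one batch entry after another.
void batched_gemm_ref(std::size_t m, std::size_t n, std::size_t k, std::size_t batch,
                      const std::vector<double>& A,
                      const std::vector<double>& B,
                      std::vector<double>& C) {
    assert(A.size() == m * k * batch);
    assert(B.size() == k * n * batch);
    C.assign(m * n * batch, 0.0);
    for (std::size_t b = 0; b < batch; ++b) {
        const double* Ab = A.data() + b * m * k;
        const double* Bb = B.data() + b * k * n;
        double* Cb = C.data() + b * m * n;
        // Loop order j, p, i keeps the innermost accesses contiguous.
        for (std::size_t j = 0; j < n; ++j) {
            for (std::size_t p = 0; p < k; ++p) {
                const double bpj = Bb[p + k * j];
                for (std::size_t i = 0; i < m; ++i) {
                    Cb[i + m * j] += Ab[i + m * p] * bpj;
                }
            }
        }
    }
}
```

Comparing the GPU output against this reference on a handful of elements, with a tolerance that accounts for different summation order, covers the most common layout and stride mistakes.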
Before modifying solver code, the agent must:
- summarize the proposed design
- list affected files
- explain how the change interacts with the HDG solver
- confirm assumptions about solver state
Only then propose code edits.
All code proposals must include:
- Design summary
- Assumptions
- Affected files
- Implementation approach
- Code diff or patch
- Verification plan
When analyzing performance:
Check:
• arithmetic intensity
• memory bandwidth limits
• roofline position
• kernel launch counts
• GPU occupancy
Use profiler outputs (Nsight, rocprof, etc.) when available.
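
The first three checks reduce to simple arithmetic once FLOP and byte counts are known. The sketch below shows the standard roofline estimate; the function names are hypothetical, and the peak and bandwidth figures must come from the target device's specifications or from measurement, not from this example.

```cpp
#include <algorithm>
#include <cassert>

// Arithmetic intensity in FLOP per byte of global memory traffic.
inline double arithmetic_intensity(double flops, double bytes_moved) {
    assert(bytes_moved > 0.0);
    return flops / bytes_moved;
}

// Roofline upper bound on throughput in GFLOP/s: the smaller of the compute
// peak (GFLOP/s) and intensity (FLOP/byte) times memory bandwidth (GB/s).
inline double roofline_bound_gflops(double intensity, double peak_gflops,
                                    double bandwidth_gbs) {
    return std::min(peak_gflops, intensity * bandwidth_gbs);
}

// True when the kernel lies left of the ridge point, i.e. is bandwidth-bound.
inline bool is_memory_bound(double intensity, double peak_gflops,
                            double bandwidth_gbs) {
    return intensity * bandwidth_gbs < peak_gflops;
}
```

For example, a kernel at 2 FLOP/byte on a device with a 1000 GFLOP/s peak and 100 GB/s bandwidth is bounded at 200 GFLOP/s, so reducing memory traffic matters more there than reducing instruction count.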
When debugging numerical issues:
Check:
• boundary conditions
• flux consistency
• Jacobian correctness
• conservation violations
• NaN or negative density/energy states
Always identify likely root causes before proposing fixes.
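
One common way to check Jacobian correctness is to compare the analytic Jacobian against a central finite-difference approximation of the residual. The sketch below assumes a square system and a dense column-major Jacobian; the names `Vec` and `jacobian_fd_error` and the default step `h` are illustrative choices, not part of Exasim.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;

// Largest absolute difference between an analytic Jacobian J (n x n,
// column-major, entry (i,j) at J[i + n*j]) and a central finite-difference
// approximation of dR/du evaluated at u with step h.
double jacobian_fd_error(const std::function<Vec(const Vec&)>& R,
                         const Vec& J, const Vec& u, double h = 1e-6) {
    const std::size_t n = u.size();
    assert(J.size() == n * n);
    double err = 0.0;
    for (std::size_t j = 0; j < n; ++j) {
        Vec up = u, um = u;
        up[j] += h;
        um[j] -= h;
        const Vec rp = R(up);
        const Vec rm = R(um);
        assert(rp.size() == n && rm.size() == n);
        for (std::size_t i = 0; i < n; ++i) {
            const double fd = (rp[i] - rm[i]) / (2.0 * h);
            err = std::max(err, std::fabs(fd - J[i + n * j]));
        }
    }
    return err;
}
```

A mismatch that does not shrink as `h` is reduced points to an error in the analytic Jacobian. One that scales with `h` squared is ordinary truncation error.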
When assisting with manuscripts:
• maintain mathematical rigor
• clearly distinguish assumptions
• avoid unsupported claims
• use consistent notation
Prefer structured explanations.
All responses must follow this structure:
- Summary
- Assumptions
- Design reasoning
- Proposed implementation
- Risks or numerical concerns
- Verification plan
If the agent lacks sufficient information to proceed safely:
• state the missing information
• propose diagnostic steps
• avoid speculative changes
END OF AGENTS.md