feat(amdgpu): add AMD GPU ISA support (GFX11/GFX950/GFX12/GFX1250) #286

Open
longknown-amd wants to merge 7 commits into bergercookie:master from longknown-amd:feature/amdgpu-isa-support

@longknown-amd
Summary

Adds hover/completion Language Server support for four AMD GPU ISA generations:

  • amdgpu-gfx11 — RDNA3 (RX 7000 series) and CDNA2/3 (MI200/MI300): 1,373 mnemonics
  • amdgpu-gfx950 — CDNA4 / MI350 (ISA 9.5.0, GFX9-family encoding): 1,478 mnemonics
  • amdgpu-gfx12 — RDNA4 (RX 9000 series): 1,429 mnemonics
  • amdgpu-gfx1250 — CDNA4 / MI400: 1,644 mnemonics

Each generation exposes completion and hover for instructions and registers (VGPR v0–v255, AGPR a0–a255, SGPR s0–s105, TTMP, and special registers like exec, vcc, m0, scc).
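For a sense of the size of the shared register set, here is a minimal sketch that enumerates the names listed above. The helper name is hypothetical, the ttmp0–ttmp15 range is taken from the first commit message, and only the four special registers named in this summary are included; the real register file may define more.

```python
# Hypothetical helper, not part of the PR: enumerate the register names
# described in the summary.
SPECIAL_REGISTERS = ["exec", "vcc", "m0", "scc"]  # only the four named above


def amdgpu_register_names() -> list[str]:
    names = []
    names += [f"v{i}" for i in range(256)]    # VGPR v0..v255
    names += [f"a{i}" for i in range(256)]    # AGPR a0..a255
    names += [f"s{i}" for i in range(106)]    # SGPR s0..s105
    names += [f"ttmp{i}" for i in range(16)]  # TTMP ttmp0..ttmp15
    names += SPECIAL_REGISTERS
    return names
```

Under these assumptions the list holds 256 + 256 + 106 + 16 + 4 = 638 names.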

Data pipeline

Instruction data is derived from LLVM's AMDGPU TableGen definitions using llvm-tblgen --dump-json. The new scripts/gen_amdgpu_docs.py script converts the raw dump into asm-lsp's JSON/XML formats:

  • Keeps only real instructions (not pseudo, not codegen-only): isPseudo=0, isCodeGenOnly=0, and a non-empty AsmString
  • Deduplicates by mnemonic and merges variant encodings into asm_templates
  • GFX950: merges GFX9/VI base instructions + HasGFX950Insts-predicated additions
  • GFX1250: merges GFX12 base + GFX1250-specific CDNA4 instructions (superset)

The script is included in scripts/ so the data files can be regenerated against any future LLVM version.
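The filter and deduplication steps can be sketched roughly as below. This is an illustration only, not the code in scripts/gen_amdgpu_docs.py: the function name is hypothetical, the input is assumed to be the already-parsed `--dump-json` output (a dict with an `!instanceof` index and one record per def), and taking the mnemonic as the text before the first space or `$` is an assumption.

```python
from collections import defaultdict


def collect_mnemonics(dump: dict) -> dict[str, list[str]]:
    """Map each mnemonic to the sorted list of its distinct AsmString templates."""
    templates = defaultdict(set)
    for name in dump.get("!instanceof", {}).get("Instruction", []):
        rec = dump[name]
        asm = rec.get("AsmString") or ""
        # Keep only real instructions: not pseudo, not codegen-only, printable.
        if rec.get("isPseudo") or rec.get("isCodeGenOnly") or not asm.strip():
            continue
        # Assumption: the mnemonic is the text before the first space or '$'.
        parts = asm.replace("$", " ").split()
        if not parts:
            continue
        templates[parts[0].lower()].add(asm)
    # One entry per mnemonic; variant encodings collapse into a template list.
    return {m: sorted(t) for m, t in templates.items()}
```

Merging a base generation with predicated additions (as for GFX950 and GFX1250) would then amount to running this over both record sets and combining the resulting dicts.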

Usage

# .asm-lsp.toml
[default_config]
assembler = "gas"
instruction_set = "amdgpu-gfx12"

Valid instruction_set values: amdgpu-gfx11, amdgpu-gfx950, amdgpu-gfx12, amdgpu-gfx1250.

Files changed

| File | Change |
| --- | --- |
| asm-lsp/types.rs | 4 new Arch variants + setup_registers/setup_instructions match arms + Display impl |
| asm-lsp/parser.rs | New populate_amdgpu_instructions(arch, json) function |
| asm_docs_parsing/src/main.rs | Import + match arms for 4 AMD variants |
| xtask/src/regnerate_docs.rs | AMD opcode + register serialization entries |
| asm-lsp/config_builder.rs | ARCH_LIST extended from 11 → 15 entries |
| asm-lsp_config_schema.json | 4 new instruction_set enum values |
| asm-lsp/test.rs | Serialization freshness tests for all 4 generations |
| scripts/gen_amdgpu_docs.py | New: tblgen JSON → asm-lsp JSON/XML converter |
| docs_store/opcodes/amdgpu-gfx{11,950,12,1250}.json | New: instruction data |
| docs_store/registers/amdgpu.xml | New: shared register definitions |
| asm-lsp/serialized/opcodes/amdgpu-gfx* | New: pre-serialized bincode |
| asm-lsp/serialized/registers/amdgpu | New: pre-serialized bincode |

Test plan

  • All existing tests continue to pass (cargo test — 147 tests, 0 failures)
  • 4 new serialization freshness tests added (one per generation)
  • Manually verified hover and completion on v_add_f32, s_mov_b32, global_load_b128 with instruction_set = "amdgpu-gfx12"

🤖 Generated with Claude Code

thomas and others added 7 commits March 27, 2026 13:44
Adds Language Server Protocol support for four AMD GPU ISA generations:

- amdgpu-gfx11  — RDNA3 (RX 7000 series) and CDNA2/3 (MI200/MI300)
- amdgpu-gfx950 — CDNA3.5 (MI350, ISA 9.5.0)
- amdgpu-gfx12  — RDNA4 (RX 9000 series)
- amdgpu-gfx1250 — CDNA4 (MI400)

## Data pipeline

Instruction data is extracted from LLVM's AMDGPU TableGen definitions
via `llvm-tblgen --dump-json`. The new `scripts/gen_amdgpu_docs.py`
script converts the raw dump into asm-lsp's JSON/XML formats:

- Filters real (non-pseudo, non-codegen-only) instructions
- Deduplicates by mnemonic, merges variant encodings into asm_templates
- GFX950: merges GFX9/VI base + HasGFX950Insts-predicated instructions
- GFX1250: merges GFX12 base + GFX1250-specific CDNA4 additions
- Generates one shared register XML (VGPR v0-v255, AGPR a0-a255,
  SGPR s0-s105, TTMP ttmp0-ttmp15, and special registers)

Instruction counts per generation:
- GFX11:   1373 unique mnemonics
- GFX950:  1478 unique mnemonics (superset of GFX9/VI)
- GFX12:   1429 unique mnemonics
- GFX1250: 1644 unique mnemonics (superset of GFX12)

## Changes

- `asm-lsp/types.rs`: four new `Arch` variants with strum/serde attrs,
  `setup_registers()` and `setup_instructions()` match arms, Display impl
- `asm-lsp/parser.rs`: `populate_amdgpu_instructions(arch, json)` function
- `asm_docs_parsing/src/main.rs`: import + match arms for 4 AMD variants
- `xtask/src/regnerate_docs.rs`: AMD opcode + register serialization entries
- `asm-lsp/config_builder.rs`: ARCH_LIST extended from 11 to 15 entries
- `asm-lsp_config_schema.json`: 4 new enum values for `instruction_set`
- `asm-lsp/test.rs`: serialization freshness tests for all 4 generations

## Usage

```toml
# .asm-lsp.toml
[default_config]
assembler = "gas"
instruction_set = "amdgpu-gfx12"
```

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add scripts/parse_amd_isa_pdf.py that extracts per-instruction
description, operation pseudocode, and notes from AMD ISA reference
PDFs using pdftotext. Commit the four resulting annotation JSON files
(gfx11/gfx950/gfx12/gfx1250, ~1100-1260 instructions each).

Update scripts/gen_amdgpu_docs.py to load these annotations and embed
them as multi-section markdown in each instruction's `summary` field:

  <prose description>

  **Operation:**
  ```
  <pseudocode>
  ```

  **Notes:**
  <notes text>

Regenerate the four amdgpu opcode JSON files and their serialized
bincode. Instructions without PDF coverage retain the auto-generated
brief summary unchanged.
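
The summary layout described above could be assembled along these lines. This is a sketch, not the code in gen_amdgpu_docs.py: the function name and the annotation keys `description`, `operation`, and `notes` are assumptions about the annotation JSON.

```python
FENCE = "`" * 3  # built at runtime so this example nests cleanly in markdown


def build_summary(brief, annotation=None):
    """Return the multi-section markdown summary, or the brief one if unannotated."""
    if not annotation:
        return brief  # no PDF coverage: keep the auto-generated summary
    sections = [annotation.get("description") or brief]
    if annotation.get("operation"):
        sections.append(f"**Operation:**\n{FENCE}\n{annotation['operation']}\n{FENCE}")
    if annotation.get("notes"):
        sections.append(f"**Notes:**\n{annotation['notes']}")
    return "\n\n".join(sections)
```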

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… and cross-PDF fallback

Two improvements to the annotation pipeline:

1. MFMA name normalisation in parse_amd_isa_pdf.py:
   AMD ISA PDFs use names like V_MFMA_F32_16x16x1_4b_F32 (with an optional
   _Nb lane-multiplier suffix and an underscore before the dtype component)
   that differ from LLVM tblgen names (v_mfma_f32_16x16x1f32). A regex
   normalisation step now maps PDF keys to tblgen keys at parse time, so
   all GFX9-era MFMA variants are correctly annotated.

2. Merged cross-PDF annotation pool in gen_amdgpu_docs.py:
   All four PDF annotation dicts are merged into a single pool. Each
   generation first tries its own PDF, then falls back to the pool. This
   lets GFX950 borrow from the CDNA4 PDF (which documents shared MFMA
   variants like V_MFMA_F32_16x16x128_F8F6F4 in detail) and vice-versa.
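
Both steps can be sketched as follows. The regex, the function names, and the "first PDF to document a key wins" merge order are assumptions for illustration; the actual patterns in parse_amd_isa_pdf.py and gen_amdgpu_docs.py may differ. The normaliser only rewrites a name when the result is a known tblgen key, so names that already match are left alone.

```python
import re

# v_mfma_<dtype>_<MxNxK>[_<N>b]_<dtype>  ->  v_mfma_<dtype>_<MxNxK><dtype>
_MFMA = re.compile(r"^(v_mfma_[a-z0-9]+_\d+x\d+x\d+)(?:_\d+b)?_([a-z0-9]+)$")


def normalise_mfma(pdf_name, tblgen_keys):
    """Map a PDF instruction name to a tblgen key where a rewrite is known to fit."""
    lower = pdf_name.lower()
    if lower in tblgen_keys:
        return lower
    candidate = _MFMA.sub(r"\1\2", lower)
    return candidate if candidate in tblgen_keys else lower


def merge_pool(per_generation):
    """Merge all per-PDF annotation dicts into one fallback pool."""
    pool = {}
    for annotations in per_generation.values():
        for key, value in annotations.items():
            pool.setdefault(key, value)  # assumption: first PDF seen wins
    return pool


def annotate(mnemonic, own, pool):
    """Prefer the generation's own PDF, then fall back to the merged pool."""
    return own.get(mnemonic) or pool.get(mnemonic)
```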

Coverage improvement (unannotated %):
  gfx11:   20% → 15%  (-68 instructions)
  gfx950:  26% → 17%  (-132 instructions)
  gfx12:   12% → 11%  (-8 instructions)
  gfx1250: 22% → 22%  (CDNA4 swmmac instructions not detailed in PDFs)

147/147 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AMD markets the Instinct MI350 (gfx950) as a CDNA4 product, not CDNA3.5.
The ISA version is 9.5.0 and uses GFX9-family encoding, but the marketing
generation is CDNA4.

Update all human-readable labels in:
- scripts/gen_amdgpu_docs.py (description strings and comments)
- docs_store/opcodes/amdgpu-gfx950.json (regenerated)
- asm-lsp/serialized/opcodes/amdgpu-gfx950 (regenerated)
- AMD_ISA_SUPPORT.md (documentation)

The internal identifier "amdgpu-gfx950" and all technical data are unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AMDGPU mnemonics such as v_addc_u32_e64 and v_mac_f32_dpp carry encoding
suffixes (_e32, _e64, _dpp, _sdwa, _dpp8, _dpp16), and the suffixed forms
do not exist as separate entries in the LLVM TableGen database. When an
exact lookup fails on an AMDGPU arch, strip the encoding suffix and retry
against the base mnemonic.
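
The retry logic is roughly the following, sketched in Python for brevity; the change itself lives in the Rust server and this is not a transcription of it. The function name and the dict-based table are hypothetical, and the sketch strips at most one suffix.

```python
ENCODING_SUFFIXES = ("_e32", "_e64", "_dpp", "_sdwa", "_dpp8", "_dpp16")


def lookup_instruction(mnemonic, table):
    """Exact lookup first; on a miss, strip one encoding suffix and retry."""
    key = mnemonic.lower()
    if key in table:
        return table[key]
    for suffix in ENCODING_SUFFIXES:
        if key.endswith(suffix):
            # Retry against the base mnemonic; None if that is unknown too.
            return table.get(key[: -len(suffix)])
    return None
```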

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
Add 40+ AMDGPU-specific assembler directives sourced from
https://llvm.org/docs/AMDGPUUsage.html, covering:

- Code object V2: .hsa_code_object_version, .hsa_code_object_isa,
  .amdgpu_hsa_kernel, .amd_kernel_code_t / .end_amd_kernel_code_t
- Code object V3+: .amdgcn_target, .amdhsa_code_object_version,
  .amdhsa_kernel / .end_amdhsa_kernel and all sub-directives
  (.amdhsa_next_free_vgpr/sgpr, .amdhsa_group_segment_fixed_size,
  all float/exception mode knobs, workgroup/system SGPR flags, etc.)
- .amdgpu_metadata / .end_amdgpu_metadata
- Predefined symbols: .amdgcn.gfx_generation_number,
  .amdgcn.next_free_vgpr/sgpr, etc.

A new internal Assembler::Amdgpu variant keys these directives.
Config::is_assembler_enabled and effective_assemblers() automatically
include Amdgpu when instruction_set is any amdgpu-* arch — no user
config change required (assembler = "gas" still works).
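
In spirit, the auto-enable rule looks like the sketch below. It is written in Python as an illustration of the behaviour described above, not as the Rust implementation; the string-based representation of assemblers and the `amdgpu-` prefix check are assumptions.

```python
def effective_assemblers(assembler, instruction_set):
    """Return the assemblers whose directives should be offered."""
    enabled = {assembler}
    # Any amdgpu-* instruction set implicitly enables the internal Amdgpu
    # assembler, so the user's `assembler = "gas"` keeps working unchanged.
    if instruction_set.startswith("amdgpu-"):
        enabled.add("amdgpu")
    return enabled
```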

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>