feat(amdgpu): add AMD GPU ISA support (GFX11/GFX950/GFX12/GFX1250) #286
Open
longknown-amd wants to merge 7 commits into bergercookie:master
Adds Language Server Protocol support for four AMD GPU ISA generations:

- amdgpu-gfx11: RDNA3 (RX 7000 series) and CDNA2/3 (MI200/MI300)
- amdgpu-gfx950: CDNA3.5 (MI350, ISA 9.5.0)
- amdgpu-gfx12: RDNA4 (RX 9000 series)
- amdgpu-gfx1250: CDNA4 (MI400)

## Data pipeline

Instruction data is extracted from LLVM's AMDGPU TableGen definitions via `llvm-tblgen --dump-json`. The new `scripts/gen_amdgpu_docs.py` script converts the raw dump into asm-lsp's JSON/XML formats:

- Filters real (non-pseudo, non-codegen-only) instructions
- Deduplicates by mnemonic, merges variant encodings into asm_templates
- GFX950: merges GFX9/VI base + HasGFX950Insts-predicated instructions
- GFX1250: merges GFX12 base + GFX1250-specific CDNA4 additions
- Generates one shared register XML (VGPR v0-v255, AGPR a0-a255, SGPR s0-s105, TTMP ttmp0-ttmp15, and special registers)

Instruction counts per generation:

- GFX11: 1373 unique mnemonics
- GFX950: 1478 unique mnemonics (superset of GFX9/VI)
- GFX12: 1429 unique mnemonics
- GFX1250: 1644 unique mnemonics (superset of GFX12)

## Changes

- `asm-lsp/types.rs`: four new `Arch` variants with strum/serde attrs, `setup_registers()` and `setup_instructions()` match arms, Display impl
- `asm-lsp/parser.rs`: `populate_amdgpu_instructions(arch, json)` function
- `asm_docs_parsing/src/main.rs`: import + match arms for 4 AMD variants
- `xtask/src/regnerate_docs.rs`: AMD opcode + register serialization entries
- `asm-lsp/config_builder.rs`: ARCH_LIST extended from 11 to 15 entries
- `asm-lsp_config_schema.json`: 4 new enum values for `instruction_set`
- `asm-lsp/test.rs`: serialization freshness tests for all 4 generations

## Usage

```toml
# .asm-lsp.toml
[default_config]
assembler = "gas"
instruction_set = "amdgpu-gfx12"
```

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
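The filter and deduplication steps of the data pipeline can be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/gen_amdgpu_docs.py`: it assumes the `llvm-tblgen --dump-json` output has been loaded into a dict of record name to record, that the `isPseudo`, `isCodeGenOnly`, and `AsmString` fields are present on instruction records, and that the mnemonic is the first whitespace-separated token of `AsmString`. The function names and the output entry shape are hypothetical.

```python
def is_real_instruction(record):
    """Keep only non-pseudo, non-codegen-only records with an asm string."""
    if not isinstance(record, dict):
        return False  # skips non-record top-level values in the dump
    asm = str(record.get("AsmString") or "").strip()
    return (
        record.get("isPseudo") == 0
        and record.get("isCodeGenOnly") == 0
        and asm != ""
    )


def collect_instructions(records):
    """Deduplicate by mnemonic, merging variant encodings into asm_templates.

    Assumption: the mnemonic is the first token of AsmString.
    """
    by_mnemonic = {}
    for record in records.values():
        if not is_real_instruction(record):
            continue
        asm = record["AsmString"].strip()
        mnemonic = asm.split()[0].lower()
        entry = by_mnemonic.setdefault(
            mnemonic, {"name": mnemonic, "asm_templates": []}
        )
        if asm not in entry["asm_templates"]:
            entry["asm_templates"].append(asm)
    return by_mnemonic
```

Keying on the mnemonic means each completion or hover item appears once, while the different operand forms stay visible through `asm_templates`.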
Add scripts/parse_amd_isa_pdf.py that extracts per-instruction description, operation pseudocode, and notes from AMD ISA reference PDFs using pdftotext. Commit the four resulting annotation JSON files (gfx11/gfx950/gfx12/gfx1250, ~1100-1260 instructions each).

Update scripts/gen_amdgpu_docs.py to load these annotations and embed them as multi-section markdown in each instruction's `summary` field:

    <prose description>

    **Operation:**
    ```
    <pseudocode>
    ```

    **Notes:**
    <notes text>

Regenerate the four amdgpu opcode JSON files and their serialized bincode. Instructions without PDF coverage retain the auto-generated brief summary unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
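A sketch of how an annotation might be folded into the `summary` field, following the template in this commit message. The annotation keys (`description`, `operation`, `notes`) and the function name are assumptions for illustration; the actual JSON layout produced by `parse_amd_isa_pdf.py` is not shown on this page.

```python
FENCE = "`" * 3  # built at runtime so this example can sit inside a fence


def build_summary(brief, annotation=None):
    """Return the markdown summary for one instruction.

    Without an annotation the auto-generated brief summary is returned
    unchanged, as the commit message describes.
    """
    if not annotation:
        return brief

    # Assumed keys; fall back to the brief text if no description exists.
    description = (annotation.get("description") or "").strip() or brief
    operation = (annotation.get("operation") or "").strip()
    notes = (annotation.get("notes") or "").strip()

    parts = [description]
    if operation:
        parts.append(f"**Operation:**\n{FENCE}\n{operation}\n{FENCE}")
    if notes:
        parts.append(f"**Notes:**\n{notes}")
    return "\n\n".join(parts)
```

Empty sections are skipped here so that hover text does not show a bare "Operation" or "Notes" heading; whether the real script does the same is not stated.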
… and cross-PDF fallback

Two improvements to the annotation pipeline:

1. MFMA name normalisation in parse_amd_isa_pdf.py: AMD ISA PDFs use names like V_MFMA_F32_16x16x1_4b_F32 (with an optional _Nb lane-multiplier suffix and an underscore before the dtype component) that differ from LLVM tblgen names (v_mfma_f32_16x16x1f32). A regex normalisation step now maps PDF keys to tblgen keys at parse time, so all GFX9-era MFMA variants are correctly annotated.
2. Merged cross-PDF annotation pool in gen_amdgpu_docs.py: All four PDF annotation dicts are merged into a single pool. Each generation first tries its own PDF, then falls back to the pool. This lets GFX950 borrow from the CDNA4 PDF (which documents shared MFMA variants like V_MFMA_F32_16x16x128_F8F6F4 in detail) and vice-versa.

Coverage improvement (unannotated %):

- gfx11: 20% → 15% (-68 instructions)
- gfx950: 26% → 17% (-132 instructions)
- gfx12: 12% → 11% (-8 instructions)
- gfx1250: 22% → 22% (CDNA4 swmmac instructions not detailed in PDFs)

147/147 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
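Both improvements can be illustrated with a short sketch. The regex below is an assumption fitted only to the single example quoted in this commit message (`V_MFMA_F32_16x16x1_4b_F32` to `v_mfma_f32_16x16x1f32`); the real script may restrict it further, for instance to GFX9-era names. The helper names, and the first-entry-wins rule used when building the pool, are likewise hypothetical.

```python
import re

# <prefix>_<MxNxK>, optional _Nb lane-multiplier, then _<dtype>
MFMA_PDF_NAME = re.compile(
    r"^(v_mfma_[a-z0-9]+_\d+x\d+x\d+)(?:_\d+b)?_([a-z0-9]+)$"
)


def normalise_mfma_name(pdf_name):
    """Map a PDF instruction name to a tblgen-style key."""
    key = pdf_name.lower()
    match = MFMA_PDF_NAME.match(key)
    if match is None:
        return key  # non-MFMA names are only lowercased
    return match.group(1) + match.group(2)


def build_pool(per_generation):
    """Merge every generation's annotation dict into one fallback pool.

    Assumption: on a key collision the first generation seen wins.
    """
    pool = {}
    for annotations in per_generation.values():
        for mnemonic, annotation in annotations.items():
            pool.setdefault(mnemonic, annotation)
    return pool


def find_annotation(mnemonic, own, pool):
    """Try the generation's own PDF first, then the merged pool."""
    if mnemonic in own:
        return own[mnemonic]
    return pool.get(mnemonic)
```

Looking up the generation's own annotations before the pool keeps generation-specific wording where it exists and only borrows text when nothing else is available.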
AMD markets the Instinct MI350 (gfx950) as a CDNA4 product, not CDNA3.5. The ISA version is 9.5.0 and uses GFX9-family encoding, but the marketing generation is CDNA4.

Update all human-readable labels in:

- scripts/gen_amdgpu_docs.py (description strings and comments)
- docs_store/opcodes/amdgpu-gfx950.json (regenerated)
- asm-lsp/serialized/opcodes/amdgpu-gfx950 (regenerated)
- AMD_ISA_SUPPORT.md (documentation)

The internal identifier "amdgpu-gfx950" and all technical data are unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AMDGPU instructions such as v_addc_u32_e64 and v_mac_f32_dpp carry encoding suffixes (_e32, _e64, _dpp, _sdwa, _dpp8, _dpp16) that don't exist as separate entries in the LLVM TableGen database. When exact lookup fails on an AMDGPU arch, strip the encoding suffix and retry against the base mnemonic.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
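The fallback lookup can be sketched as below. The server itself is written in Rust; this Python version only illustrates the logic described in the commit message. The function name, the `table` dict, and the choice to strip a single suffix (rather than stacked ones) are assumptions.

```python
# Suffixes listed in the commit message.
ENCODING_SUFFIXES = ("_e32", "_e64", "_dpp", "_sdwa", "_dpp8", "_dpp16")


def lookup_instruction(mnemonic, table, is_amdgpu):
    """Exact lookup first; on AMDGPU archs retry without the suffix."""
    key = mnemonic.lower()
    entry = table.get(key)
    if entry is not None or not is_amdgpu:
        return entry

    for suffix in ENCODING_SUFFIXES:
        if key.endswith(suffix):
            # Assumption: only one trailing suffix is removed.
            return table.get(key[: -len(suffix)])
    return None


# Usage: the table only knows the base mnemonic.
table = {"v_addc_u32": "Add with carry-in"}
print(lookup_instruction("v_addc_u32_e64", table, is_amdgpu=True))
```

Trying the exact name first means a mnemonic that happens to end in one of these suffixes and does have its own entry is never rewritten.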
Add 40+ AMDGPU-specific assembler directives sourced from https://llvm.org/docs/AMDGPUUsage.html, covering:

- Code object V2: .hsa_code_object_version, .hsa_code_object_isa, .amdgpu_hsa_kernel, .amd_kernel_code_t / .end_amd_kernel_code_t
- Code object V3+: .amdgcn_target, .amdhsa_code_object_version, .amdhsa_kernel / .end_amdhsa_kernel and all sub-directives (.amdhsa_next_free_vgpr/sgpr, .amdhsa_group_segment_fixed_size, all float/exception mode knobs, workgroup/system SGPR flags, etc.)
- .amdgpu_metadata / .end_amdgpu_metadata
- Predefined symbols: .amdgcn.gfx_generation_number, .amdgcn.next_free_vgpr/sgpr, etc.

A new internal Assembler::Amdgpu variant keys these directives. Config::is_assembler_enabled and effective_assemblers() automatically include Amdgpu when instruction_set is any amdgpu-* arch, so no user config change is required (assembler = "gas" still works).

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
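The auto-enable rule can be pictured with a small sketch. The real logic lives in the Rust `Config` type; this Python function is a hypothetical stand-in that uses plain strings for assemblers and assumes the check is a prefix match on `instruction_set`.

```python
def effective_assemblers(assembler, instruction_set):
    """Return the assemblers whose directives should be offered.

    Assumption: any instruction_set starting with "amdgpu-" adds the
    internal AMDGPU directive set on top of the configured assembler.
    """
    enabled = [assembler]
    if instruction_set.startswith("amdgpu-") and "amdgpu" not in enabled:
        enabled.append("amdgpu")
    return enabled


print(effective_assemblers("gas", "amdgpu-gfx12"))
```

With this rule a project that already sets `assembler = "gas"` gets the AMDGPU directives as soon as it selects an `amdgpu-*` instruction set.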
## Summary

Adds hover/completion Language Server support for four AMD GPU ISA generations: `amdgpu-gfx11`, `amdgpu-gfx950`, `amdgpu-gfx12`, and `amdgpu-gfx1250`.

Each generation exposes completion and hover for instructions and registers (VGPR v0–v255, AGPR a0–a255, SGPR s0–s105, TTMP, and special registers like `exec`, `vcc`, `m0`, `scc`).

## Data pipeline

Instruction data is derived from LLVM's AMDGPU TableGen definitions using `llvm-tblgen --dump-json`. The new `scripts/gen_amdgpu_docs.py` script converts the raw dump into asm-lsp's JSON/XML formats:

- Filters real instructions: `isPseudo=0`, `isCodeGenOnly=0`, and non-empty `AsmString`
- Deduplicates by mnemonic and merges variant encodings into `asm_templates`
- GFX950: merges the GFX9/VI base with `HasGFX950Insts`-predicated additions

The script is included in `scripts/` so the data files can be regenerated against any future LLVM version.

## Usage

Valid `instruction_set` values: `amdgpu-gfx11`, `amdgpu-gfx950`, `amdgpu-gfx12`, `amdgpu-gfx1250`.

## Files changed

- `asm-lsp/types.rs`: `Arch` variants + `setup_registers`/`setup_instructions` match arms + `Display` impl
- `asm-lsp/parser.rs`: `populate_amdgpu_instructions(arch, json)` function
- `asm_docs_parsing/src/main.rs`: import + match arms for the 4 AMD variants
- `xtask/src/regnerate_docs.rs`: AMD opcode + register serialization entries
- `asm-lsp/config_builder.rs`: `ARCH_LIST` extended from 11 → 15 entries
- `asm-lsp_config_schema.json`: `instruction_set` enum values
- `asm-lsp/test.rs`: serialization freshness tests for all 4 generations
- `scripts/gen_amdgpu_docs.py`
- `docs_store/opcodes/amdgpu-gfx{11,950,12,1250}.json`
- `docs_store/registers/amdgpu.xml`
- `asm-lsp/serialized/opcodes/amdgpu-gfx*`
- `asm-lsp/serialized/registers/amdgpu`

## Test plan

- `cargo test` (147 tests, 0 failures)
- Hover/completion for `v_add_f32`, `s_mov_b32`, `global_load_b128` with `instruction_set = "amdgpu-gfx12"`

🤖 Generated with Claude Code