Bug: discovery uses incorrect Intel resource & label names #1104

@eero-t

Description

WVA device discovery and kind simulation use incorrect Intel extended resource and label names, as well as mismatched product names:

  • intel.com/gpu.product + intel.com/gpu.memory GPU node labels
  • Intel-Gaudi-2-96GB product name for a GPU resource
  • intel.com/gpu GPU resource

The extended resource names actually provided by the Intel device plugins are the following:

  • habana.ai/gaudi for Gaudi AI products
  • gpu.intel.com/i915 for legacy GPU products
  • gpu.intel.com/xe for recent GPU products
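
To illustrate what discovery could match against instead, here is a minimal Python sketch (not WVA's actual code) that filters a node's allocatable resources using the three names above. The `intel_devices` helper and the shape of its input are assumptions made for illustration only:

```python
# Extended resource names as listed in this issue.
INTEL_RESOURCES = {
    "habana.ai/gaudi": "Gaudi AI products",
    "gpu.intel.com/i915": "legacy GPU products",
    "gpu.intel.com/xe": "recent GPU products",
}


def intel_devices(allocatable: dict[str, str]) -> dict[str, int]:
    """Return the Intel device counts found in a node's allocatable resources.

    `allocatable` is assumed to map resource names to quantity strings,
    similar to `status.allocatable` on a Kubernetes Node object.
    """
    return {
        name: int(qty)
        for name, qty in allocatable.items()
        # Only known Intel resource names are parsed as integer counts.
        if name in INTEL_RESOURCES and int(qty) > 0
    }


node = {"cpu": "8", "memory": "32Gi", "gpu.intel.com/xe": "2", "intel.com/gpu": "1"}
print(intel_devices(node))
```

With this input, only `gpu.intel.com/xe` is picked up; the `intel.com/gpu` name that discovery uses today is not something the Intel device plugins provide, so it is ignored.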

The node labels they provide are of the form:

  • habana.ai/product.name
  • gpu.intel.com/product

There's currently no automatic node labeling for device memory amounts, so those would need to be set manually, using the same pattern as the rest of their labels:

  • habana.ai/device.memory
  • gpu.intel.com/memory
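
As a rough sketch of how the label keys could be derived from the resource name, assuming a hypothetical `memory_label` helper and a `<number>GB` value format (neither is defined by the device plugins or by WVA):

```python
# Product and (manually set) memory label keys per extended resource,
# using the names listed in this issue.
LABEL_KEYS = {
    "habana.ai/gaudi": ("habana.ai/product.name", "habana.ai/device.memory"),
    "gpu.intel.com/i915": ("gpu.intel.com/product", "gpu.intel.com/memory"),
    "gpu.intel.com/xe": ("gpu.intel.com/product", "gpu.intel.com/memory"),
}


def memory_label(resource_name: str, memory_gb: int) -> dict[str, str]:
    """Build the memory label to set by hand on a node exposing `resource_name`.

    The "<number>GB" value format is an assumption; nothing sets or
    standardizes this label automatically today.
    """
    if resource_name not in LABEL_KEYS:
        raise ValueError(f"unknown Intel resource name: {resource_name}")
    _product_key, memory_key = LABEL_KEYS[resource_name]
    return {memory_key: f"{memory_gb}GB"}


print(memory_label("habana.ai/gaudi", 96))
```

The resulting key/value pair could then be applied by hand, for example with `kubectl label node <node-name> habana.ai/device.memory=96GB`.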

References:

(Faulty info seems to originate at least partially from #580?)
