Skip to content

feat: artifact-root resolution for external plugins#1264

Open
gwarmstrong wants to merge 2 commits intoNVIDIA-NeMo:mainfrom
gwarmstrong:georgea/gym-benchmark-path-resolution
Open

feat: artifact-root resolution for external plugins#1264
gwarmstrong wants to merge 2 commits intoNVIDIA-NeMo:mainfrom
gwarmstrong:georgea/gym-benchmark-path-resolution

Conversation

@gwarmstrong
Copy link
Copy Markdown
Contributor

feat: artifact-root resolution for external plugins

Summary

Adds three environment variables that let resources servers, agents, model
servers, configs, env.yaml, prompts, rollout inputs, and benchmarks live in
any repo with the conventional <root>/{resources_servers,benchmarks,...}
layout — without forking Gym or merging every iteration upstream.

Env var Purpose Default
NEMO_GYM_EXTRA_ROOTS Colon-separated list of extra roots searched between cwd and the install location. unset
NEMO_GYM_ROOT Overrides the final fallback root. Gym install directory
NEMO_GYM_ALLOW_ROOT_OVERRIDE When set, an artifact in multiple roots resolves to the earliest match instead of raising. unset (strict — collisions raise)

Why

Today every benchmark, resources server, agent, and config has to live in
Gym/{resources_servers,benchmarks,...}/<name>/. Five call sites
(cli.py, global_config.load_extra_config_paths, env.yaml,
rollout_collection.py, prompt.py) implemented partial cwd-fallback in
slightly different ways, and benchmark discovery had no fallback at all —
so a plugin benchmark in another repo was invisible to ng_list_benchmarks
and ng_prepare_benchmark.

This PR replaces those five fallback sites with a single resolve_artifact()
helper, threads the same search list through benchmark discovery, and
exposes the search list via env vars so plugin repos can opt in without
patching Gym.

How it resolves

For every artifact reference Gym searches in this order:

  1. Current working directory
  2. Each entry of NEMO_GYM_EXTRA_ROOTS, in listed order
  3. NEMO_GYM_ROOT (defaults to the install location)

Each root uses the same internal layout:

<root>/
├── resources_servers/<name>/
├── responses_api_models/<name>/
├── responses_api_agents/<name>/
└── benchmarks/<name>/

Each artifact is resolved independently against the same list, so plugins
from different repos compose:

export NEMO_GYM_EXTRA_ROOTS=~/repo-A:~/repo-B

ng_run "+config_paths=[benchmarks/turing_eval/config.yaml]"
# benchmarks/turing_eval/config.yaml      → ~/repo-A
# resources_servers/algebra_judge/...     → ~/repo-B
# responses_api_models/vllm_model/...     → Gym install (final fallback)

Strict by default; opt-in override

When the same artifact (or the same benchmark name) is defined in more than
one root, Gym raises ArtifactCollisionError and names every conflicting
path:

nemo_gym.ArtifactCollisionError: Artifact 'resources_servers/my_judge' is defined in multiple roots:
  /home/user/my-plugins/resources_servers/my_judge
  /opt/Gym/resources_servers/my_judge
Set NEMO_GYM_ALLOW_ROOT_OVERRIDE=1 to use the earliest
(cwd > NEMO_GYM_EXTRA_ROOTS > NEMO_GYM_ROOT), or remove the duplicate.

This catches accidental shadowing (for example, a stale local copy of a
server renamed upstream). When you want a plugin to take precedence
(testing a local fork in place of a shipped server), set
NEMO_GYM_ALLOW_ROOT_OVERRIDE=1 and earliest-wins precedence kicks in.

Roots that resolve to the same canonical filesystem path (for example,
running ng_* from inside the Gym tree where cwd == NEMO_GYM_ROOT, or a
symlinked extra root pointing at Gym) are deduped before the collision
check, so they never false-positive.

Backward compatibility

With no env vars set, get_artifact_roots() returns [cwd, PARENT_DIR]
and the realpath dedup means cwd-equals-Gym (the in-repo case) sees a
single root. Behavior is byte-identical to the pre-Phase-1 cwd-then-PARENT_DIR
fallback that already existed at four of the five call sites. The collision
check is a new protection that only fires when extra roots are
configured, so existing setups are unaffected.

Adds NEMO_GYM_EXTRA_ROOTS, NEMO_GYM_ROOT, and NEMO_GYM_ALLOW_ROOT_OVERRIDE
environment variables so resources servers, agents, model servers, configs,
env.yaml, prompts, rollout inputs, and benchmarks can be hosted in any
plugin repo with the conventional `<root>/{resources_servers,benchmarks,...}`
layout.

A single resolve_artifact() helper unifies the five fallback sites
(cli.py, global_config.load_extra_config_paths, env.yaml, rollout_collection,
prompt.py) and is also used by the new cross-root benchmark discovery in
benchmarks.list_benchmarks. By default, an artifact present in more than
one root raises ArtifactCollisionError; set NEMO_GYM_ALLOW_ROOT_OVERRIDE=1
to fall back to earliest-wins precedence (cwd > extras > NEMO_GYM_ROOT_DIR).
sys.path is augmented so plugin prepare.py modules are importable, with the
same priority order so override mode resolves consistently.

When no env vars are set the resolver collapses to [cwd, PARENT_DIR] and the
realpath-dedup means cwd-equals-Gym (the in-repo case) sees one root,
preserving identical pre-Phase-1 behavior.

Documents the workflow in the Configuration Reference under "Developing
External Artifacts" and adds 56 unit tests covering strict/override modes,
realpath dedup, symlinked roots, validator-filtered collisions, sys.path
ordering, and bool-env-var parsing. 262/262 unit tests pass.

Signed-off-by: gwarmstrong <gwarmstrong@users.noreply.github.com>
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 7, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@gwarmstrong gwarmstrong requested review from bxyu-nvidia and cmunley1 and removed request for cmunley1 May 7, 2026 20:20
@cmunley1 cmunley1 requested a review from ananthsub May 7, 2026 22:40

## Developing External Artifacts

Three environment variables control where Gym searches for resources servers, model servers, agents, configs, `env.yaml`, prompts, rollout inputs, and benchmarks. Together they let you develop plugins in a separate repository without forking Gym or merging every iteration upstream.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth clarifying default behavior, which IIRC is:
local install searches Gym/...
pip install checks cwd first, then fallsback to site-packages install location for builtin environment components

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there is a slight change here--now the resolution order is always cwd -> extras -> install, independent of a local install or pip install. Which I think is fairly ergonomic and is more consistent with how system references (e.g., putting things on PYTHONPATH) is resolved? I think it could be confusing to deviate from that standard. But if you anticipate major backwards incompatibility implications of this change, we can obviously revert it.

In either case, I'm happy to make the docs more clear once we've decided which route to take.

@cmunley1
Copy link
Copy Markdown
Contributor

cmunley1 commented May 7, 2026

I think that environment abstraction like benchmarks/ but for training too will help this too.

@cmunley1
Copy link
Copy Markdown
Contributor

cmunley1 commented May 7, 2026

we should test pypi wheel on this changes if its not in the latest ci

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants