Commit 4a768c5

Model servers
Signed-off-by: Frankie Siino <fsiino@nvidia.com>
1 parent 64e54eb commit 4a768c5

1 file changed

Lines changed: 76 additions & 0 deletions

docs/how-to-faq.md

Lines changed: 76 additions & 0 deletions
@@ -585,6 +585,82 @@ library_judge_math_simple_agent:
# `jsonl_fpath` - same as above.
# Note: This file must be committed to git.
jsonl_fpath: resources_servers/library_judge_math/data/example.jsonl
+
+# ============================================================================
+# MODEL SERVER DEFINITION
+# ============================================================================
+
+# `policy_model` is a special Level 1 name that refers to the policy model being trained.
+# This is the default model for NeMo Gym agents during training.
+# You can also define custom model server instances with unique names (e.g., `judge_model`, `teacher_model`).
+policy_model:
+  # `responses_api_models` indicates that this is a model server.
+  responses_api_models:
+    # The model server type. Common types include:
+    # - openai_model: For OpenAI-compatible endpoints.
+    # - azure_openai_model: For Azure OpenAI endpoints.
+    # - vllm_model: For vLLM-hosted models.
+    # This maps to the folder `responses_api_models/<model-type>/`.
+    openai_model:
+      # `entrypoint` specifies the file that contains your model server implementation.
+      # It is relative to the implementation directory (`responses_api_models/openai_model`).
+      entrypoint: app.py
+
+      # === COMMON PARAMETERS (all model types) ===
+
+      # API endpoint URL. Parameter name varies by model type:
+      # - openai_model: `openai_base_url`
+      # - azure_openai_model: `openai_base_url`
+      # - vllm_model: `base_url`
+      # Examples:
+      # - "https://api.openai.com/v1" (OpenAI)
+      # - "https://my-resource.openai.azure.com" (Azure OpenAI)
+      # - "http://localhost:8000/v1" (self-hosted vLLM)
+      # - ["http://gpu-1:8000/v1", "http://gpu-2:8000/v1"] (vLLM only: list for load balancing)
+      openai_base_url: ${policy_base_url}
+
+      # API authentication key. Parameter name varies by model type:
+      # - openai_model: `openai_api_key`
+      # - azure_openai_model: `openai_api_key`
+      # - vllm_model: `api_key`
+      # Provide the key via a Hydra variable instead of hardcoding it
+      # (see "Best Practices - Keep Secrets in env.yaml" in docs/tutorials/09-configuration-guide.md).
+      openai_api_key: ${policy_api_key}
+
+      # Model identifier. Parameter name varies by model type:
+      # - openai_model: `openai_model`
+      # - azure_openai_model: `openai_model`
+      # - vllm_model: `model`
+      # Examples:
+      # - "gpt-4o-2024-11-20" (OpenAI)
+      # - "my-gpt4-deployment" (Azure OpenAI deployment name)
+      # - "meta-llama/Llama-3.1-8B-Instruct" (vLLM)
+      openai_model: ${policy_model_name}
+
+      # === AZURE OPENAI SPECIFIC PARAMETERS ===
+      # Only used by the `azure_openai_model` type.
+
+      # `default_query` contains Azure-specific query parameters.
+      # Required for Azure OpenAI to specify the API version.
+      default_query:
+        # Azure API version string. New versions are released frequently.
+        # Check the Azure OpenAI documentation for currently supported versions.
+        api-version: "2024-10-21"
+
+      # `num_concurrent_requests` limits the number of parallel requests to Azure OpenAI.
+      num_concurrent_requests: 8
+
+      # === VLLM SPECIFIC PARAMETERS ===
+      # Only used by the `vllm_model` type.
+
+      # `return_token_id_information` controls whether token IDs and log probabilities are returned.
+      # Required for training to calculate token-level rewards.
+      # Set to true for training, false for inference-only scenarios.
+      return_token_id_information: false
+
+      # `uses_reasoning_parser` controls extraction of reasoning traces from model output.
+      # Set to true for models that generate reasoning in <think> tags.
+      # Set to false for standard chat models without explicit reasoning tokens.
+      uses_reasoning_parser: true
```
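
The annotated example above uses the `openai_model` type and, for reference, also lists parameters that apply only to other types. As a rough sketch of a vLLM-backed instance, the fragment below swaps in the `vllm_model` parameter names given in the comments (`base_url`, `api_key`, `model`). The instance name `judge_model` comes from the suggested custom names. The `${judge_*}` Hydra variables and the `app.py` entrypoint for `vllm_model` are assumptions for illustration, so check them against your checkout before use.

```yaml
# Hypothetical second model server instance alongside `policy_model`.
judge_model:
  responses_api_models:
    vllm_model:
      # Assumption: the vLLM implementation directory also exposes `app.py`.
      entrypoint: app.py

      # vLLM uses `base_url`, `api_key`, and `model` instead of the `openai_*` names.
      # The `${judge_*}` variables are placeholders, not names defined by NeMo Gym.
      base_url: ${judge_base_url}
      api_key: ${judge_api_key}
      model: ${judge_model_name}

      # A judge only runs inference, so token IDs and log probabilities are not needed.
      return_token_id_information: false

      # Assumes a standard chat model that does not emit <think> tags.
      uses_reasoning_parser: false
```

For a policy model served by vLLM during training, the same shape would apply with `return_token_id_information: true`, since the comments above state that token-level rewards require it.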
