A DDS topology is an XML file that describes how to distribute and execute your computational tasks across computing nodes in an HPC environment. It defines the structure, requirements, and relationships between your tasks, making it easy to scale from single-node development to large-scale distributed computing.
DDS topology provides a powerful yet simple language for describing distributed computing workflows:
- Tasks: Individual executable processes that form the basic building blocks
- Properties: Communication channels between tasks using key-value pairs
- Collections: Groups of tasks deployed together on the same physical machine
- Groups: Scalable containers that can multiply collections and tasks
- Requirements: Constraints for task placement (specific nodes, GPUs, etc.)
- Assets: Files or data that tasks need access to
- Variables: Parameterization and reusable values
- Triggers: Automatic actions based on task conditions
Tasks communicate through properties - key-value pairs managed by DDS's propagation engine. When one task sets a property value, DDS automatically propagates it to all other tasks that depend on that property. This enables patterns like:
- Service discovery: A "server" task publishes its host and port
- Data passing: One task writes results that others consume
- Coordination: Tasks signal completion or status changes
Note
Property values are treated as strings (256 chars max) and can contain any data your tasks need to exchange.
- Collections group tasks that should run on the same physical machine (for shared memory, high-bandwidth communication, or local file access)
- Groups provide scaling by multiplying their contents with a factor
- The main group serves as the entry point and can contain other groups
Here's a simple producer-consumer topology:
<topology name="myTopology">
<!-- Variables for easy parameterization -->
<var name="nWorkers" value="4" />
<var name="appPath" value="$DDS_LOCATION/bin/my-app" />
<!-- Communication properties -->
<property name="dataChannel"/>
<property name="resultChannel"/>
<!-- Producer task -->
<decltask name="producer">
<exe reachable="true">${appPath} --mode=producer</exe>
<properties>
<name access="write">dataChannel</name>
</properties>
</decltask>
<!-- Worker task -->
<decltask name="worker">
<exe reachable="true">${appPath} --mode=worker --id=%taskIndex%</exe>
<properties>
<name access="read">dataChannel</name>
<name access="write">resultChannel</name>
</properties>
</decltask>
<!-- Deploy producer and worker together -->
<declcollection name="workUnit">
<tasks>
<name>producer</name>
<name>worker</name>
</tasks>
</declcollection>
<!-- Scale up with multiple work units -->
<main name="main">
<group name="workers" n="${nWorkers}">
<collection>workUnit</collection>
</group>
</main>
</topology>
This creates 4 work units, each containing a producer and worker on the same node, for a total of 8 processes.
Topology files are validated against the XSD schema at $DDS_LOCATION/share/topology.xsd.
A topology example:
<topology name="myTopology">
<var name="appNameVar" value="app1 -l -n --taskIndex %taskIndex% --collectionIndex %collectionIndex%" />
<var name="nofGroups" value="10" />
<property name="property1" />
<property name="property2" />
<declrequirement name="requirement1" type="hostname" value="+.gsi.de"/>
<decltrigger name="trigger1" condition="TaskCrashed" action="RestartTask" arg="5"/>
<decltask name="task1">
<requirements>
<name>requirement1</name>
</requirements>
<exe reachable="true">${appNameVar}</exe>
<env reachable="false">env1</env>
<properties>
<name access="read">property1</name>
<name access="readwrite">property2</name>
</properties>
<triggers>
<name>trigger1</name>
</triggers>
</decltask>
<decltask name="task2">
<exe>app2</exe>
<properties>
<name access="write">property1</name>
</properties>
</decltask>
<declcollection name="collection1">
<requirements>
<name>requirement1</name>
</requirements>
<tasks>
<name>task1</name>
<name>task2</name>
<name>task2</name>
</tasks>
</declcollection>
<declcollection name="collection2">
<tasks>
<name>task1</name>
<name>task1</name>
</tasks>
</declcollection>
<main name="main">
<task>task1</task>
<collection>collection1</collection>
<group name="group1" n="${nofGroups}">
<task>task1</task>
<collection>collection1</collection>
<collection>collection2</collection>
</group>
<group name="group2" n="15">
<collection>collection1</collection>
</group>
</main>
</topology>
This example demonstrates a complete workflow: 4 processes started directly in main (task1 plus collection1's three tasks), 60 in group1, and 45 in group2 - 109 processes in total, distributed across different groups.
DDS supports powerful variable substitution for parameterization and reusability:
Variables are defined using the <var> tag and referenced with ${variable_name} syntax:
<var name="nWorkers" value="8" />
<var name="appPath" value="$DDS_LOCATION/bin/my-app" />
<var name="logLevel" value="INFO" />
<decltask name="worker">
<exe>${appPath} --workers=${nWorkers} --log=${logLevel}</exe>
</decltask>
Variables can reference:
- Environment variables: $DDS_LOCATION, $HOME, etc.
- Other variables: ${nWorkers}
- Any string values for parameterization
For scaled deployments, DDS provides special index tags that are replaced at runtime:
- %taskIndex% - Zero-based index of the task instance
- %collectionIndex% - Zero-based index of the collection instance
<decltask name="worker">
<exe>my-app --id=%taskIndex% --group=%collectionIndex%</exe>
</decltask>
DDS automatically populates environment variables for each task:
- $DDS_TASK_PATH - Full path to the user task, e.g., main/group1/collection_12/task_3
- $DDS_GROUP_NAME - ID of the parent group
- $DDS_COLLECTION_NAME - ID of the parent collection (if any)
- $DDS_TASK_NAME - ID of the task
- $DDS_TASK_INDEX - Zero-based index of the task instance
- $DDS_COLLECTION_INDEX - Zero-based index of the collection instance
- $DDS_SESSION_ID - DDS session this task belongs to
These can be accessed from within your application code or shell scripts.
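For instance, a task wrapper script might derive a per-instance log path from them. This is an illustrative sketch (the `:-` fallback defaults and `my-app` are hypothetical, not part of DDS):

```shell
#!/usr/bin/env bash
# DDS exports these variables into each task's environment;
# the :-fallbacks only matter when running the script outside a DDS session.
task_name="${DDS_TASK_NAME:-standalone}"
task_index="${DDS_TASK_INDEX:-0}"
session="${DDS_SESSION_ID:-no-session}"

log_file="/tmp/${session}_${task_name}_${task_index}.log"
echo "logging to ${log_file}"

# A real wrapper would now launch the payload, e.g.:
# exec ./my-app --id="${task_index}" > "${log_file}" 2>&1
```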
Properties enable communication between tasks using DDS's key-value propagation engine. When one task sets a property, DDS automatically propagates it to all dependent tasks.
<property name="serviceEndpoint" scope="global"/>
<property name="workerStatus" scope="collection"/>
Attributes:
- name (required) - Unique identifier for the property
- scope (optional) - Either global (default) or collection:
  - global: Property is shared across all tasks in the topology
  - collection: Property is scoped to tasks within the same collection instance
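As an illustration of the difference (the task, property, and executable names below are invented for this sketch), a collection-scoped property gives every collection instance its own isolated key-value exchange, while a global property is visible topology-wide:

```xml
<topology name="scopeExample">
    <!-- visible to every task in the topology
         (assumed to be written by a controller task, omitted here) -->
    <property name="controlEndpoint" scope="global"/>
    <!-- each "pipeline" instance gets its own copy of this key -->
    <property name="stageReady" scope="collection"/>
    <decltask name="stage_a">
        <exe>./stage_a</exe>
        <properties>
            <name access="write">stageReady</name>
        </properties>
    </decltask>
    <decltask name="stage_b">
        <exe>./stage_b</exe>
        <properties>
            <!-- sees only the stageReady value written by stage_a
                 of the same pipeline instance -->
            <name access="read">stageReady</name>
            <name access="read">controlEndpoint</name>
        </properties>
    </decltask>
    <declcollection name="pipeline">
        <tasks>
            <name>stage_a</name>
            <name>stage_b</name>
        </tasks>
    </declcollection>
    <main name="main">
        <group name="pipelines" n="4">
            <collection>pipeline</collection>
        </group>
    </main>
</topology>
```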
Tasks specify which properties they use and how:
<decltask name="server">
<properties>
<name access="write">serviceEndpoint</name>
</properties>
</decltask>
<decltask name="client">
<properties>
<name access="read">serviceEndpoint</name>
<name access="readwrite">workerStatus</name>
</properties>
</decltask>
Access types:
- read - Task can only read the property value
- write - Task can only write the property value
- readwrite (default) - Task can both read and write
Property values are strings (256 chars max) and can contain any data your tasks need to exchange.
Requirements specify constraints for task placement on computing nodes. They help you ensure tasks run on appropriate hardware or specific machines.
<declrequirement name="login_nodes" type="hostname" value="login[0-9]+\.cluster\.local"/>
<declrequirement name="worker_limit" type="maxinstances" value="4"/>
<declrequirement name="compute_group" type="groupname" value="gpu_partition"/>
<declrequirement name="special_node" type="wnname" value="node001"/>
<declrequirement name="custom_check" type="custom" value="has_infiniband"/>
Requirement types:
- hostname - Match compute node hostname (supports regex)
- wnname - Match worker node name from the SSH configuration
- maxinstances - Limit the number of task instances per host
- groupname - Target specific node groups/partitions
- custom - Custom requirement evaluation (parsed but ignored by the scheduler)
Note: The gpu requirement type is defined in the topology schema but is not currently implemented in the DDS scheduler. GPU node selection must be handled through hostname, wnname, or groupname patterns that target GPU-enabled nodes. Custom requirements are parsed but currently ignored during scheduling - they are placeholders for future extensibility.
For hostname and wnname, the value can be a full name or a regular expression.
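For example (hostnames invented for the sketch), an exact name and a regular expression are both valid values:

```xml
<!-- exact hostname match -->
<declrequirement name="exact_host" type="hostname" value="node042.cluster.local"/>
<!-- regex match: node000.cluster.local through node999.cluster.local -->
<declrequirement name="node_range" type="hostname" value="node[0-9]{3}\.cluster\.local"/>
```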
You can submit agents with specific group names and then target them in your topology, enabling fine-grained control over task placement across heterogeneous resources:
# Submit GPU-capable agents with a group tag
dds-submit -r slurm -n 10 --slots 8 --group-name="gpu_workers"
# Submit CPU-only agents with a different tag
dds-submit -r slurm -n 20 --slots 16 --group-name="cpu_workers"
# Submit high-memory agents
dds-submit -r slurm -n 5 --slots 32 --group-name="highmem_workers"
Then target specific agent groups in your topology:
<topology name="heterogeneous_workflow">
<!-- Define requirements for different agent groups -->
<declrequirement name="gpu_req" type="groupname" value="gpu_workers"/>
<declrequirement name="cpu_req" type="groupname" value="cpu_workers"/>
<declrequirement name="highmem_req" type="groupname" value="highmem_workers"/>
<!-- GPU-intensive task -->
<decltask name="gpu_task">
<exe>cuda_app --device=gpu</exe>
<requirements>
<name>gpu_req</name> <!-- Will run only on gpu_workers agents -->
</requirements>
</decltask>
<!-- Standard CPU task -->
<decltask name="cpu_task">
<exe>standard_app</exe>
<requirements>
<name>cpu_req</name> <!-- Will run only on cpu_workers agents -->
</requirements>
</decltask>
<!-- Memory-intensive task -->
<decltask name="memory_task">
<exe>bigdata_app --memory=large</exe>
<requirements>
<name>highmem_req</name> <!-- Will run only on highmem_workers agents -->
</requirements>
</decltask>
<main name="main">
<task>gpu_task</task>
<group name="cpu_workers" n="10">
<task>cpu_task</task>
</group>
<group name="memory_workers" n="3">
<task>memory_task</task>
</group>
</main>
</topology>
This pattern is particularly useful for:
- Heterogeneous clusters: Mix GPU and CPU nodes in the same workflow
- Resource optimization: Ensure memory-intensive tasks get high-memory nodes
- Cost management: Separate expensive GPU resources from cheaper CPU resources
- Multi-tenant environments: Isolate different user groups or projects
Requirements can be applied to tasks or collections:
<decltask name="compute_task">
<exe>cuda_app</exe>
<requirements>
<name>login_nodes</name>
<name>worker_limit</name>
</requirements>
</decltask>
Collection requirements override task requirements when both are specified.
Triggers define automatic actions based on task conditions for fault tolerance and automated recovery.
Important: While triggers can be defined in topology files and are parsed correctly, trigger functionality is not currently implemented in the DDS runtime. Tasks that crash will not be automatically restarted regardless of trigger definitions. This feature exists only in the schema and parsing layer.
<decltrigger name="auto_restart" condition="TaskCrashed" action="RestartTask" arg="3"/>
Attributes:
- name (required) - Unique identifier
- condition (required) - Currently supports: TaskCrashed
- action (required) - Currently supports: RestartTask
- arg (required) - Action parameter (e.g., number of restart attempts)
<decltask name="critical_service">
<exe>my_service</exe>
<triggers>
<name>auto_restart</name>
</triggers>
</decltask>
Note: The trigger will be stored in the topology but will have no runtime effect. For fault tolerance, implement restart logic in your application or use external process supervisors.
Assets allow tasks to access files or data, with DDS handling distribution to compute nodes.
<asset name="config_file" type="inline" visibility="task" value="param1=value1 param2=value2"/>
<asset name="shared_data" type="inline" visibility="global" value="#!/bin/bash export MYVAR=123"/>
Attributes:
- name (required) - Unique identifier
- type (required) - Currently supports: inline
- visibility (required) - task (per-task) or global (session-wide)
- value (required) - Asset content (use HTML entities for special characters)
<decltask name="worker">
<exe>my_app --config=asset1</exe>
<assets>
<name>config_file</name>
</assets>
</decltask>
Tasks can access assets by name in their execution environment.
Tasks are the fundamental building blocks - individual executable processes that form your distributed application.
<decltask name="worker">
<exe reachable="true">$DDS_LOCATION/bin/my-app --mode=worker</exe>
<env reachable="false">setup_environment.sh</env>
<properties>
<name access="read">inputData</name>
<name access="write">results</name>
</properties>
<requirements>
<name>gpu_node</name>
</requirements>
<triggers>
<name>auto_restart</name>
</triggers>
<assets>
<name>config_file</name>
</assets>
</decltask>
- <exe> (required) - Path to the executable with optional arguments. The reachable attribute is true if the executable already exists on worker nodes, false if DDS should deploy it.
- <env> (optional) - Environment setup script to run before the executable. The reachable attribute is true if the script already exists on worker nodes, false if DDS should deploy it.
- <properties> (optional) - List of properties this task uses
- <requirements> (optional) - List of placement requirements
- <triggers> (optional) - List of fault tolerance triggers
- <assets> (optional) - List of files/data this task needs
The <exe> element supports:
- Full paths: /usr/bin/python3 script.py
- Variables: ${appPath} --config=${configFile}
- Built-in indices: my-app --id=%taskIndex%
- Complex arguments with quotes: bash -c "echo 'Hello World'"
DDS automatically parses and handles program arguments at runtime.
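These forms can also be mixed in a single <exe> line. A brief sketch (the application path and option names are placeholders):

```xml
<var name="appPath" value="$DDS_LOCATION/bin/analysis"/>
<var name="configFile" value="$HOME/analysis.conf"/>
<decltask name="analysis">
    <!-- an environment variable, topology variables, and a runtime index
         combined in one command line -->
    <exe reachable="true">${appPath} --config=${configFile} --id=%taskIndex%</exe>
</decltask>
```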
Collections group tasks that should be deployed together on the same physical machine. This is useful for:
- High-bandwidth communication between tasks
- Shared memory access
- Local file system dependencies
- Reducing network latency
<declcollection name="worker_unit">
<requirements>
<name>gpu_nodes</name>
</requirements>
<tasks>
<name>data_loader</name>
<name>processor</name>
<name>output_writer</name>
</tasks>
</declcollection>
Collections can have their own requirements that override individual task requirements. This ensures the entire collection is placed appropriately.
Tasks within a collection are listed in the order they should be considered, but DDS handles the actual scheduling and execution coordination.
Groups provide scaling by multiplying their contents and serve as organizational containers.
<group name="workers" n="8">
<task>standalone_task</task>
<collection>worker_unit</collection>
</group>
The n attribute specifies the multiplication factor. In this example:
- 8 instances of standalone_task
- 8 instances of the worker_unit collection
The main group is the entry point for execution:
<main name="main">
<task>init_task</task>
<group name="workers" n="4">
<collection>processing_unit</collection>
</group>
<task>cleanup_task</task>
</main>
Only the main group can contain other groups, providing a hierarchical structure for complex topologies.
Tags described in this reference: topology, var, property, declrequirement, decltrigger, decltask, declcollection, task, collection, group, main, exe, env, requirements, properties, name
| Parents | Children | Attributes | Description |
|---|---|---|---|
| no | var, property, declrequirement, decltrigger, decltask, declcollection, asset, main | name | Declares a topology. |
Example:
<topology name="myTopology">
[... Definition of tasks,
properties, collections and
groups ...]
</topology>
| Parents | Children | Attributes | Description |
|---|---|---|---|
| topology | no | name, value | Declares a variable which can be used inside the topology file as ${variable_name}. |
Example:
<var name="var1" value="value1"/>
<var name="var2" value="value2"/>
| Parents | Children | Attributes | Description |
|---|---|---|---|
| topology | no | name | Declares a property. |
<property name="property1"/>
<property name="property2"/>
| Parents | Children | Attributes | Description |
|---|---|---|---|
| topology | no | name, type, value | Declares a requirement for tasks and collections. |
<declrequirement name="requirement1" type="hostname" value="+.gsi.de"/>
| Parents | Children | Attributes | Description |
|---|---|---|---|
| topology | no | name, condition, action, arg | Declares a task trigger. |
<decltrigger name="trigger1" condition="TaskCrashed" action="RestartTask" arg="5"/>
| Parents | Children | Attributes | Description |
|---|---|---|---|
| topology | exe, env, requirements, triggers, properties, assets | name | Declares a task. |
<decltask name="task1">
<exe reachable="true">app1 -l -n</exe>
<env reachable="false">env1</env>
<requirements>
<name>requirement1</name>
</requirements>
<triggers>
<name>trigger1</name>
</triggers>
<properties>
<name access="read">property1</name>
<name access="readwrite">property2</name>
</properties>
</decltask>
| Parents | Children | Attributes | Description |
|---|---|---|---|
| topology | tasks | name | Declares a collection. |
<declcollection name="collection1">
<tasks>
<name>task1</name>
<name>task1</name>
</tasks>
</declcollection>
| Parents | Children | Attributes | Description |
|---|---|---|---|
| group, main | no | no | Specifies the unique ID of an already defined task. |
<task>task1</task>
| Parents | Children | Attributes | Description |
|---|---|---|---|
| group, main | no | no | Specifies the unique ID of an already defined collection. |
<collection>collection1</collection>
| Parents | Children | Attributes | Description |
|---|---|---|---|
| main | task, collection | name, n | Declares a group. |
<group name="group1" n="10">
<task>task1</task>
<collection>collection1</collection>
<collection>collection2</collection>
</group>
| Parents | Children | Attributes | Description |
|---|---|---|---|
| topology | task, collection, group | name | Declares the main group. |
<main name="main">
<task>task1</task>
<collection>collection1</collection>
<group name="group1" n="10">
<task>task1</task>
<collection>collection1</collection>
<collection>collection2</collection>
</group>
</main>
| Parents | Children | Attributes | Description |
|---|---|---|---|
| decltask | no | reachable | Defines path to the executable or script for the task. |
<exe reachable="true">app1 -l -n</exe>
Note
Required.
| Parents | Children | Attributes | Description |
|---|---|---|---|
| decltask | no | reachable | Defines the path to script that has to be executed prior to main executable. |
<env reachable="false">setEnv.sh</env>
Note
Optional.
| Parents | Children | Attributes | Description |
|---|---|---|---|
| decltask, declcollection | name | no | Defines a list of requirements. |
<requirements>
<name>requirement1</name>
<name>requirement2</name>
</requirements>
Note
Optional.
| Parents | Children | Attributes | Description |
|---|---|---|---|
| decltask | name | no | Defines a list of dependent properties. |
<properties>
<name>property1</name>
<name>property2</name>
</properties>
Note
Optional.
| Parents | Children | Attributes | Description |
|---|---|---|---|
| properties | no | access | Defines an ID of the already declared property. |
<name>property1</name>
Note
Required.
Attributes described in this reference: name, reachable, n, access, type, scope, visibility, condition, action
| Required | Default | Tags | Restrictions | Description |
|---|---|---|---|---|
| yes | - | topology, property, declrequirement, decltask, declcollection, group, main | A string with a minimum length of 1 character. | Defines identifier (ID) for topology, property, requirement, task, collection and group. ID has to be unique within its scope, i.e. ID for tasks has to be unique only for tasks. |
<topology name="myTopology">
| Required | Default | Tags | Restrictions | Description |
|---|---|---|---|---|
| no | true | exe, env | true|false | Defines if the executable or script is already available on the worker node. |
<exe reachable="true">app -l</exe>
<env>env1</env>
| Required | Default | Tags | Restrictions | Description |
|---|---|---|---|---|
| no | 1 | group | An unsigned integer 32-bit value. Min is 1 | Defines multiplication factor for group. |
<group name="group1" n="10">
<task>task1</task>
<collection>collection1</collection>
<collection>collection2</collection>
</group>
| Required | Default | Tags | Restrictions | Description |
|---|---|---|---|---|
| no | readwrite | name | read|write|readwrite | Defines access type from the user task to properties. |
<name access="read">property1</name>
| Required | Default | Tags | Restrictions | Description |
|---|---|---|---|---|
| yes | - | declrequirement | hostname|wnname|maxinstances|groupname|custom | Defines the type of the requirement. Note: the gpu type exists in the schema but is not implemented in the runtime; the custom type is parsed but ignored during scheduling. |
<declrequirement name="host_req" type="hostname" value="login[0-9]+\.cluster\.local"/>
<declrequirement name="limit_req" type="maxinstances" value="4"/>
| Required | Default | Tags | Restrictions | Description |
|---|---|---|---|---|
| no | global | property | global|collection | Defines property scope: global (across all tasks) or collection (within collection instances). |
<property name="globalData" scope="global"/>
<property name="localState" scope="collection"/>
| Required | Default | Tags | Restrictions | Description |
|---|---|---|---|---|
| yes | - | asset | task|global | Defines asset visibility: task (per-task instance) or global (session-wide). |
<asset name="config" type="inline" visibility="task" value="config_data"/>
| Required | Default | Tags | Restrictions | Description |
|---|---|---|---|---|
| yes | - | decltrigger | TaskCrashed | Defines the trigger condition. |
<decltrigger name="trigger1" condition="TaskCrashed" action="RestartTask" arg="5"/>
| Required | Default | Tags | Restrictions | Description |
|---|---|---|---|---|
| yes | - | decltrigger | RestartTask | Defines the trigger action. |
<decltrigger name="trigger1" condition="TaskCrashed" action="RestartTask" arg="5"/>
Use Requirements Effectively:
<!-- Target specific node patterns for GPU nodes -->
<declrequirement name="gpu_nodes" type="hostname" value="gpu[0-9]+\.cluster\.local"/>
<!-- Limit concurrent tasks per node to avoid resource contention -->
<declrequirement name="task_limit" type="maxinstances" value="2"/>
<!-- Target specific partitions or node groups -->
<declrequirement name="compute_partition" type="groupname" value="gpu_partition"/>
Optimize for Network Topology:
<!-- Group communicating tasks in collections for locality -->
<declcollection name="mpi_unit">
<tasks>
<name>mpi_coordinator</name>
<name>mpi_worker_1</name>
<name>mpi_worker_2</name>
</tasks>
</declcollection>
Parameter Sweep Pattern:
<topology name="parameter_sweep">
<var name="nJobs" value="100"/>
<decltask name="sweep_job">
<exe>./simulation --param=%taskIndex%</exe>
</decltask>
<main name="main">
<group name="jobs" n="${nJobs}">
<task>sweep_job</task>
</group>
</main>
</topology>
Producer-Consumer Pattern:
<topology name="pipeline">
<property name="workQueue" scope="global"/>
<property name="results" scope="global"/>
<decltask name="producer">
<exe>./producer</exe>
<properties>
<name access="write">workQueue</name>
</properties>
</decltask>
<decltask name="consumer">
<exe>./consumer --id=%taskIndex%</exe>
<properties>
<name access="read">workQueue</name>
<name access="write">results</name>
</properties>
</decltask>
<declcollection name="processing_unit">
<tasks>
<name>producer</name>
<name>consumer</name>
</tasks>
</declcollection>
<main name="main">
<group name="workers" n="8">
<collection>processing_unit</collection>
</group>
</main>
</topology>
Current Status of DDS Triggers:
DDS triggers are implemented at the topology library level (parsing, validation, storage) but are not yet supported by the DDS runtime system. The trigger infrastructure is in place and waiting for user demand to justify full implementation. If automatic task restart functionality is important for your use case, please consider contributing to the DDS project or expressing your interest to the development team.
Implement Application-Level Recovery:
<!-- Since DDS triggers aren't implemented, build resilience into your application -->
<decltask name="robust_service">
<exe>./service --retry-on-failure --max-attempts=3</exe>
<requirements>
<name>gpu_nodes</name>
</requirements>
</decltask>
Alternative Approaches:
- Use external process supervisors (systemd, supervisor, etc.)
- Implement retry logic within your applications
- Use container orchestrators (Kubernetes) for automatic restart
- Design applications to be crash-resilient
Use Assets for Configuration:
<asset name="app_config" type="inline" visibility="global" value="
# Application Configuration
max_workers=4
timeout=300
log_level=INFO
"/>
<decltask name="app">
<exe>./app --config=app_config</exe>
<assets>
<name>app_config</name>
</assets>
</decltask>
Handle Dependencies:
<decltask name="ml_training">
<exe reachable="false">python3 train.py --epochs=100</exe>
<env reachable="true">$HOME/setup_ml_env.sh</env>
<requirements>
<name>gpu_nodes</name>
</requirements>
</decltask>
Use Variables for Flexibility:
<topology name="simulation">
<!-- Development: small scale -->
<!-- <var name="scale" value="2"/> -->
<!-- Production: full scale -->
<var name="scale" value="100"/>
<var name="app_path" value="$DDS_LOCATION/bin/simulation"/>
<main name="main">
<group name="workers" n="${scale}">
<task>worker</task>
</group>
</main>
</topology>
- Start Small: Begin with n="1" in groups and increase gradually
- Use reachable="false": During development to avoid deployment issues
- Leverage Environment Variables: Access $DDS_TASK_INDEX and $DDS_TASK_PATH for debugging
- Property Scope: Use scope="collection" for isolated testing, scope="global" for coordination
- Collection Size: Keep collections to 2-8 tasks for optimal placement
- Property Updates: Minimize frequent property updates as they trigger propagation
- Resource Requirements: Be specific to avoid suboptimal placement
- Index Usage: Prefer %taskIndex% in executables over environment variables for performance
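As one way to combine these tips in a single topology (all names are illustrative, not prescriptive):

```xml
<topology name="tuned">
    <!-- at most 2 instances of a task per host, to limit resource contention -->
    <declrequirement name="per_host_limit" type="maxinstances" value="2"/>
    <decltask name="crunch">
        <!-- index passed on the command line rather than read from $DDS_TASK_INDEX -->
        <exe>./crunch --id=%taskIndex%</exe>
        <requirements>
            <name>per_host_limit</name>
        </requirements>
    </decltask>
    <!-- a small collection (2 tasks) keeps placement flexible -->
    <declcollection name="pair">
        <tasks>
            <name>crunch</name>
            <name>crunch</name>
        </tasks>
    </declcollection>
    <main name="main">
        <group name="scale" n="16">
            <collection>pair</collection>
        </group>
    </main>
</topology>
```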
This documentation reflects the current DDS topology library implementation and capabilities.