Distributed Group Membership (DGM)
DGM is a distributed systems project implementing group membership, failure detection, and gossip-based communication. The system is built in Go and designed to run across multiple virtual machines (VMs). It supports dynamic membership, failure detection using ping/ack and gossip heartbeat protocols along with piggyback dissemination of membership information.
- Group Membership: Each VM maintains a membership list and updates it using gossip/pingack protocols.
- Failure Detection: Nodes periodically ping/gossip to each other and use suspicion mechanisms to detect failures.
- Gossip/PingACK Protocol: Membership and state changes are propagated efficiently using piggy-back communication.
- Configurable: System parameters are controlled via a central
config.jsonfile and also from command-line flags.
src/ # Source code
pkg/
memberd/ # Main membership daemon
protocol/ # Gossip and ping/ack protocols
store/ # Membership store
suspicion/ # Failure suspicion manager
transport/ # UDP transport layer
utils/ # Utility functions
wire/ # Wire protocol and encoding
go.mod # Go module file
setup/ # Scripts for VM setup and orchestration
setup.bash # Local setup script
vm_setup.bash # VM setup automation
start_server.bash # Start membership server
kill.bash # Kill all running processes
config.json # System configuration file
README.md # Project documentation
To get started with DGM-G33, follow these steps:
git clone <your-repo-url>
cd dgm-g33
- Go 1.18+
- Bash (for setup scripts)
- Access to multiple VMs or containers (recommended for distributed testing)
cd src
# Install Go dependencies
go mod tidy
Use the provided scripts in the setup/ directory to initialize and start the system:
# Setup VMs and environment
bash setup/vm_setup.bash
# Start the server
bash setup/start_server.bash
On each VM, run the setup script to install dependencies and clone the repo:
cd setup
bash setup.bash
Or, to automate setup and startup across all VMs from one machine:
bash vm_setup.bash setup # Setup all VMs
bash vm_setup.bash start # Start servers on all VMs
If the automation scripts were run, the membership daemon will already be running on each VM. You can attach to the session (if using tmux):
sudo tmux -S /tmp/tmux-cs-425-mp2.sock attach-session -t cs-425-shared-mp2
Or run directly:
cd src/pkg/memberd
--> if introducer:
sudo tmux -S /tmp/tmux-cs-425-mp2.sock new-session -d -s cs-425-shared-mp2 "cd /home/mp2/dgm-g33/src/pkg/memberd && go run main.go -is-introducer"
--> else:
sudo tmux -S /tmp/tmux-cs-425-mp2.sock new-session -d -s cs-425-shared-mp2 "cd /home/mp2/dgm-g33/src/pkg/memberd && go run main.go -introducer=fa25-cs425-3301.cs.illinois.edu:6000"
Membership and failure events are logged and can be monitored via the daemon output. Configuration is managed via config.json and command-line flags.
- Configuration is managed via
config.json. - Main entry point is in
src/pkg/memberd/main.go. - Membership, gossip, and failure detection logic are in
src/pkg/protocol/,src/pkg/store/, andsrc/pkg/suspicion/.
- 5000: Application port (UDP). Each node runs its main membership and gossip protocol on this port.
- 6000: Introducer port (TCP). Used for new nodes joining the group; the introducer listens here for join requests.
- 8080: HTTP server port (TCP). Used for changing configuration parameters at runtime via HTTP requests.
The HTTP server (port 8080) exposes several endpoints for monitoring and controlling the distributed group membership system. Example usage with curl:
--> Example VM address used here: http://fa25-cs425-3301.cs.illinois.edu:8080
- Get current configuration:
curl http://fa25-cs425-3301.cs.illinois.edu:8080/get | python -m json.tool --sort-keys - Set configuration (from deployed config.json file):
curl http://fa25-cs425-3301.cs.illinois.edu:8080/set | python -m json.tool --sort-keys - List all members:
curl http://fa25-cs425-3301.cs.illinois.edu:8080/list_mem | python -m json.tool --sort-keys - List self information:
curl http://fa25-cs425-3301.cs.illinois.edu:8080/list_self | python -m json.tool --sort-keys - Display suspected failed nodes:
curl http://fa25-cs425-3301.cs.illinois.edu:8080/display_suspects | python -m json.tool --sort-keys - Display protocol information:
curl http://fa25-cs425-3301.cs.illinois.edu:8080/display_protocol | python -m json.tool --sort-keys - Leave the group:
curl http://fa25-cs425-3301.cs.illinois.edu:8080/leave | python -m json.tool --sort-keys - Switch protocol/suspicion settings:
To switch to pingack protocol with suspicion disabled:
To switch to gossip protocol with suspicion disabled:
curl -s -X POST http://fa25-cs425-3301.cs.illinois.edu:8080/switch -d '{"protocol":"pingack","suspicion":"disabled"}' -H 'content-type: application/json' | python -m json.tool --sort-keys
To enable suspicion with a custom suspicion time (e.g., 1s) for pingack protocol:curl -s -X POST http://fa25-cs425-3301.cs.illinois.edu:8080/switch -d '{"protocol":"gossip","suspicion":"disabled"}' -H 'content-type: application/json' | python -m json.tool --sort-keys
To enable suspicion with a custom suspicion time (e.g., 1s) for gossip protocol:curl -s -X POST http://fa25-cs425-3301.cs.illinois.edu:8080/switch -d '{"protocol":"pingack","suspicion":"enabled","suspicion_time":"1s"}' -H 'content-type: application/json' | python -m json.tool --sort-keys
curl -s -X POST http://fa25-cs425-3301.cs.illinois.edu:8080/switch -d '{"protocol":"gossip","suspicion":"enabled","suspicion_time":"1s"}' -H 'content-type: application/json' | python -m json.tool --sort-keys
These endpoints allow you to query and modify the system state, view membership and suspicion lists, and control protocol settings at runtime.
Contributions are welcome!