Features Overview

🚇 Environment Features

Multi-Agent Simulation

Boarding agents 🚶‍♂️ start in the platform area and navigate to the tram door
Exiting agents 🚶‍♀️ start inside the tram and navigate to the exit
Dynamic agent management with configurable agent counts
Individual agent tracking with unique identifiers

Basic Collision Handling

🛡️ Same-cell prevention only agents are not allowed to occupy the same grid cell
No yielding/coordination agents do not respect each other explicitly; they can bump and block

Configurable Geometry

🏗️ Customizable tram size adjustable width, length, and position
Flexible door positioning configurable door location and width
Environment scaling variable grid dimensions
Division line customization tram/waiting area boundary

Ray RLlib Compatibility

🚀 MultiAgentEnv API compatible with Ray RLlib interfaces used in examples/tests
Standard gym interface follows OpenAI Gym conventions
Action space support discrete action spaces for all agents
Observation space structured observations for each agent

Rendering Modes

🎨 RGB visualization grid-based rendering for images
ASCII rendering text-based visualization for terminals
Simple coloring different colors for different agent types
Step-by-step updates render after environment steps

🎁 Reward System Features

Flexible Reward Strategies

Default, simple distance, binary, and custom strategies
Configured via RewardConfig classes; parameters depend on the chosen strategy

⏹️ Termination System Features

Configurable Termination Conditions

All at destination, individual at destination, and custom policies
Set via TerminatedConfig classes; keep it simple or extend as needed

⏱️ Truncation System Features

Flexible Truncation Policies

Max steps and custom truncation policies
Controlled via TruncatedConfig classes

👁️ Observation System Features

Configurable Observation Functions

Agent positions each agent observes its own and other agents' positions
Tram door information door boundaries and division line
Environment geometry grid dimensions and tram parameters
Gym-style spaces observation spaces provided per-agent
Custom strategies can be implemented via new observation configs/functions

⚙️ Configuration Features

Type-Safe Configuration

🔒 Pydantic v2 integration runtime validation of configuration data
Automatic validation errors raised at model construction time
Immutable configurations frozen after creation
IDE support full autocomplete and type hints

Comprehensive Validation

Tram parameter validation ensures logical tram dimensions
Boundary checking validates all coordinates within grid
Agent count limits reasonable limits for performance
Render mode validation ensures valid rendering options
Parameter validation validates reward, termination, truncation, and observation parameters

Clear Error Messages

💬 Descriptive validation failures helpful error messages
Context-aware errors specific to the validation failure
Debugging support detailed error information
User-friendly messages easy to understand and fix

Flexible Configuration

Default values sensible defaults for common use cases
Optional parameters only specify what you need
Configuration inheritance extend existing configurations
Environment-specific configs different configs for different scenarios
Modular configuration separate configs for rewards, termination, truncation, observations

RLlib Integration

See the RLlib compatibility guide: RLlib MultiAgentEnv Compatibility

🏗️ Architecture Features

Modular Design

🧩 Separated concerns distinct modules for different functionality
Clean interfaces well-defined public APIs
Loose coupling minimal dependencies between modules
Extensible design easy to add new features

Private Encapsulation

🔐 Proper encapsulation private members where appropriate
Public properties clean external interfaces
Internal state management controlled access to internal data
API stability stable public interfaces

Environment Extensions

🎁 Extensible configuration system modify environment behavior
Custom functions implement custom reward, termination, truncation, and observation logic

Performance Considerations

Reasonable execution speed suitable for small to medium grid sizes
Straightforward Python clarity prioritized over micro-optimizations
No heavy vectorization simple loops over agents and grid

🎯 Key Capabilities

Training Support

Episode management proper episode termination and truncation
Step counting track episode progress
Seed management reproducible environments
Flexible systems multiple reward, termination, truncation, and observation strategies

System Architecture

All systems (Reward, Termination, Truncation, Observation) feature: - Type-safe configurations 🔒 Pydantic-based configs - Automatic validation ✅ parameter validation and bounds checking - Extensible design 🔧 easy to add new strategies - Performance optimized ⚡ efficient computation and checking