Skip to content

Features Overview

πŸš‡ Environment Features

Multi-Agent Simulation

  • Boarding agents πŸšΆβ€β™‚οΈ start in the platform area and navigate to the tram door
  • Exiting agents πŸšΆβ€β™€οΈ start inside the tram and navigate to the exit
  • Dynamic agent management with configurable agent counts
  • Individual agent tracking with unique identifiers

Basic Collision Handling

  • πŸ›‘οΈ Same-cell prevention only agents are not allowed to occupy the same grid cell
  • No yielding/coordination agents do not respect each other explicitly; they can bump and block

Configurable Geometry

  • πŸ—οΈ Customizable tram size adjustable width, length, and position
  • Flexible door positioning configurable door location and width
  • Environment scaling variable grid dimensions
  • Division line customization tram/waiting area boundary

Ray RLlib Compatibility

  • πŸš€ MultiAgentEnv API compatible with Ray RLlib interfaces used in examples/tests
  • Standard gym interface follows OpenAI Gym conventions
  • Action space support discrete action spaces for all agents
  • Observation space structured observations for each agent

Rendering Modes

  • 🎨 RGB visualization grid-based rendering for images
  • ASCII rendering text-based visualization for terminals
  • Simple coloring different colors for different agent types
  • Step-by-step updates render after environment steps

🎁 Reward System Features

Flexible Reward Strategies

  • Default, simple distance, binary, and custom strategies
  • Configured via RewardConfig classes; parameters depend on the chosen strategy

⏹️ Termination System Features

Configurable Termination Conditions

  • All at destination, individual at destination, and custom policies
  • Set via TerminatedConfig classes; keep it simple or extend as needed

⏱️ Truncation System Features

Flexible Truncation Policies

  • Max steps and custom truncation policies
  • Controlled via TruncatedConfig classes

πŸ‘οΈ Observation System Features

Configurable Observation Functions

  • Agent positions each agent observes its own and other agents' positions
  • Tram door information door boundaries and division line
  • Environment geometry grid dimensions and tram parameters
  • Gym-style spaces observation spaces provided per-agent
  • Custom strategies can be implemented via new observation configs/functions

βš™οΈ Configuration Features

Type-Safe Configuration

  • πŸ”’ Pydantic v2 integration runtime validation of configuration data
  • Automatic validation errors raised at model construction time
  • Immutable configurations frozen after creation
  • IDE support full autocomplete and type hints

Comprehensive Validation

  • Tram parameter validation ensures logical tram dimensions
  • Boundary checking validates all coordinates within grid
  • Agent count limits reasonable limits for performance
  • Render mode validation ensures valid rendering options
  • Parameter validation validates reward, termination, truncation, and observation parameters

Clear Error Messages

  • πŸ’¬ Descriptive validation failures helpful error messages
  • Context-aware errors specific to the validation failure
  • Debugging support detailed error information
  • User-friendly messages easy to understand and fix

Flexible Configuration

  • Default values sensible defaults for common use cases
  • Optional parameters only specify what you need
  • Configuration inheritance extend existing configurations
  • Environment-specific configs different configs for different scenarios
  • Modular configuration separate configs for rewards, termination, truncation, observations

RLlib Integration

πŸ—οΈ Architecture Features

Modular Design

  • 🧩 Separated concerns distinct modules for different functionality
  • Clean interfaces well-defined public APIs
  • Loose coupling minimal dependencies between modules
  • Extensible design easy to add new features

Private Encapsulation

  • πŸ” Proper encapsulation private members where appropriate
  • Public properties clean external interfaces
  • Internal state management controlled access to internal data
  • API stability stable public interfaces

Environment Extensions

  • 🎁 Extensible configuration system modify environment behavior
  • Custom functions implement custom reward, termination, truncation, and observation logic

Performance Considerations

  • Reasonable execution speed suitable for small to medium grid sizes
  • Straightforward Python clarity prioritized over micro-optimizations
  • No heavy vectorization simple loops over agents and grid

🎯 Key Capabilities

Training Support

  • Episode management proper episode termination and truncation
  • Step counting track episode progress
  • Seed management reproducible environments
  • Flexible systems multiple reward, termination, truncation, and observation strategies

System Architecture

All systems (Reward, Termination, Truncation, Observation) feature: - Type-safe configurations πŸ”’ Pydantic-based configs - Automatic validation βœ… parameter validation and bounds checking - Extensible design πŸ”§ easy to add new strategies - Performance optimized ⚑ efficient computation and checking