Summary

JobShopLab is an open‑source Python framework that lets you model, simulate, and optimize complex job‑shop scheduling problems using reinforcement learning (RL). It provides a Gym‑compliant environment with built‑in support for real‑world constraints—like transport delays, buffer capacities, machine breakdowns, and setup times—while enabling multi‑objective optimization (e.g., makespan, utilization, energy). A quick start is as easy as cloning the repo, installing in “editable” mode, loading a YAML config, and stepping through the environment with any RL library that speaks Gym.



Core Features

  • Modular Gym Environment
    JobShopLab wraps a discrete-event simulator in a Gym interface, so you can plug in any RL agent—PPO, DQN, SAC, you name it—and start training immediately.

  • Real‑World Constraints
    Includes transport units for material handling, finite buffers, stochastic machine breakdowns, and per‑operation setup times to mirror industrial shop‑floor complexity.

  • Multi‑Objective Optimization
    Beyond minimizing makespan, you can craft reward functions that balance energy efficiency, machine utilization, and lead‑time adherence (a sketch follows after this list).

  • Pip‑Installable & Editable
    Clone the GitHub repo and run pip install -e . for a development install, then iterate on configs, DSL definitions, or reward modules without reinstallation.
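
To make the multi‑objective idea above concrete, a reward can be assembled as a weighted sum of shop‑floor signals. The snippet below is only a sketch: the metric names and weights are made up for illustration and are not part of JobShopLab's reward interface.

# Hypothetical weighted multi-objective reward. The metric keys
# (makespan_delta, energy_used, utilization) are illustrative only.
def multi_objective_reward(metrics, w_makespan=1.0, w_energy=0.2, w_util=0.5):
    return (
        -w_makespan * metrics["makespan_delta"]  # penalize growth in makespan
        - w_energy * metrics["energy_used"]      # penalize energy consumption
        + w_util * metrics["utilization"]        # reward high machine utilization
    )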


Quick‑Start Guide

After installing, initialize and run a random policy in just a few lines:

from jobshoplab import JobShopLabEnv, load_config
from pathlib import Path

# 1. Load a YAML‑based configuration
config = load_config(config_path=Path("data/config/getting_started_config.yaml"))

# 2. Create the Gym environment
env = JobShopLabEnv(config=config)

# 3. Run until all jobs are processed
done = False
obs, info = env.reset()                          # Gymnasium-style reset returns (obs, info)
while not done:
    action = env.action_space.sample()              # random action
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated

# 4. Visualize final schedule
env.render()

All steps above follow the official “Getting Started” example.


Underlying Simulation Model

JobShopLab uses a state‑machine approach to simulate each entity:

  1. Machines: Track current job, remaining processing time, setup timers, and random breakdown events.
  2. Transport Units: Move jobs between machines with configurable delays and capacities.
  3. Buffers: Enforce finite queue lengths, forcing agents to plan ahead or risk deadlock.

This design choice ensures that scheduling decisions occur at discrete events, matching how real production lines operate.
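
As a rough illustration of that state‑machine idea (a simplification for this post, not JobShopLab's actual classes), a machine entity might cycle through discrete states as simulation events arrive:

from enum import Enum, auto

class MachineState(Enum):
    IDLE = auto()
    SETUP = auto()
    PROCESSING = auto()
    BROKEN = auto()

class ToyMachine:
    """Minimal state machine: transitions happen only when discrete events fire."""

    def __init__(self):
        self.state = MachineState.IDLE
        self.current_job = None

    def start_job(self, job):
        # A scheduling decision is only legal while idle; the simulator would
        # then queue a "setup_done" event at now + setup_time.
        assert self.state is MachineState.IDLE
        self.current_job = job
        self.state = MachineState.SETUP

    def on_event(self, event):
        if event == "setup_done":
            self.state = MachineState.PROCESSING
        elif event == "job_done":
            self.state = MachineState.IDLE
            self.current_job = None
        elif event == "breakdown":
            self.state = MachineState.BROKEN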


Benchmark Experiments

To validate JobShopLab, the authors trained a PPO agent (using Stable‑Baselines3) across a suite of standard academic instances—with one fixed hyperparameter set for all cases—and compared it against classic Priority Dispatch Rules (PDRs) such as Shortest Processing Time (SPT) and Most Work Remaining (MWKR). The RL agent consistently outperformed these heuristics on makespan reduction and utilization metrics.
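
The paper's exact training setup isn't reproduced here, but wiring the environment into Stable‑Baselines3 typically takes only a few lines, as sketched below. Treat it as a sketch with assumptions: the timestep budget is a placeholder, "MlpPolicy" presumes a flat observation space (a Dict space would need "MultiInputPolicy"), and depending on versions you may need Gymnasium‑compatibility wrappers.

from pathlib import Path
from stable_baselines3 import PPO
from jobshoplab import JobShopLabEnv, load_config

config = load_config(config_path=Path("data/config/getting_started_config.yaml"))
env = JobShopLabEnv(config=config)

# Train PPO with default hyperparameters; the timestep budget is a placeholder.
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=200_000)

# Roll out the trained policy greedily and render the resulting schedule.
obs, info = env.reset()
done = False
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
env.render()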



Extending & Customizing

  • Custom Instances: Drop your own JSSP YAML files into data/instances/ to benchmark new layouts.
  • Reward Functions: Plug in alternate multi‑objective rewards (e.g. weighted combinations of makespan and energy) via the DSL.
  • Observation Spaces: Define custom observation representations that expose exactly the features your agent needs (see the generic sketch below).

…and much more.
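
JobShopLab's own extension points live in its DSL and configuration files, so check the docs for the exact hooks. As a library‑agnostic illustration of the observation‑space idea, a standard Gymnasium ObservationWrapper can reshape what the agent sees; the sketch below assumes a flat Box observation and a purely hypothetical feature selection.

import numpy as np
import gymnasium as gym

class KeepFeaturesWrapper(gym.ObservationWrapper):
    """Hypothetical wrapper exposing only a subset of a Box observation."""

    def __init__(self, env, keep_indices):
        super().__init__(env)
        self.keep_indices = np.asarray(keep_indices)
        low = env.observation_space.low[self.keep_indices]
        high = env.observation_space.high[self.keep_indices]
        self.observation_space = gym.spaces.Box(low=low, high=high, dtype=np.float32)

    def observation(self, obs):
        return np.asarray(obs, dtype=np.float32)[self.keep_indices]

# Example (hypothetical indices): env = KeepFeaturesWrapper(env, keep_indices=[0, 1, 4])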


Roadmap & Contribution

JobShopLab is actively maintained and welcomes contributions.

Check out the Contributing Guide for how to get started.


Happy scheduling! — Felix