[arXiv] score: 0.44
MRBT Combines Behavior Trees and LLMs for Modular RL Reward Shaping
May 26, 2026
Masking Reward Behavior Trees (MRBT) use LLM-generated, SMT-solver-verified symbolic structures to automate reward shaping and action masking in RL, improving reactivity to subtask failure and generalization across varying task objects.
cs.LG
HOW THIS AFFECTS YOU
● builder: Worth watching as a pipeline for automating RL reward design in compositional robotics or agent tasks without hand-crafting per-object reward functions.
● researcher: MRBT offers a verifiable, modular alternative to purely LLM-based reward shaping with formal correctness guarantees via SMT solving.
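
To make the summary concrete, here is a minimal Python sketch of how a behavior-tree-like structure could drive both action masking and reward shaping. It is an illustration under stated assumptions, not the paper's implementation: the `Subtask` class, the pick-and-place task, the action names, and the progress-based bonus are all hypothetical, and the LLM generation and SMT verification steps are not shown.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

State = Dict[str, bool]  # hypothetical symbolic state: predicate name -> truth value


@dataclass
class Subtask:
    name: str
    done: Callable[[State], bool]  # postcondition over the symbolic state
    allowed_actions: Set[str]      # actions left unmasked while this subtask is active


def active_index(subtasks: List[Subtask], state: State) -> int:
    """Tick a sequence node: index of the first subtask whose postcondition fails."""
    for i, sub in enumerate(subtasks):
        if not sub.done(state):
            return i
    return len(subtasks)  # every subtask is satisfied


def action_mask(subtasks: List[Subtask], state: State, all_actions: List[str]) -> List[bool]:
    """Allow only the actions relevant to the currently active subtask."""
    i = active_index(subtasks, state)
    if i == len(subtasks):
        return [False] * len(all_actions)
    return [a in subtasks[i].allowed_actions for a in all_actions]


def shaped_reward(subtasks: List[Subtask], prev: State, nxt: State, bonus: float = 1.0) -> float:
    """Assumed shaping rule: reward progress through the tree, penalize regress."""
    return bonus * (active_index(subtasks, nxt) - active_index(subtasks, prev))


# Hypothetical pick-and-place task, used only for illustration.
ACTIONS = ["move", "grasp", "release"]
TREE = [
    Subtask("pick", lambda s: s["holding"] or s["placed"], {"move", "grasp"}),
    Subtask("place", lambda s: s["placed"], {"move", "release"}),
]

empty_hand = {"holding": False, "placed": False}
holding = {"holding": True, "placed": False}

print(action_mask(TREE, empty_hand, ACTIONS))  # "release" is masked while picking
print(shaped_reward(TREE, empty_hand, holding))
print(shaped_reward(TREE, holding, empty_hand))  # dropping the object counts as regress
```

Because the tree is re-ticked from the root on every step, a dropped object immediately reactivates the "pick" subtask, which is one simple way to get the reactivity to subtask failure that the summary describes. Swapping the object only changes the predicates, not the tree, which hints at why such structures may generalize across task objects.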