Purple Flower

Domain Randomization: How Much Is Too Much?

Domain randomization has become one of the standard tools for bridging the sim-to-real gap, and for good reason — when it works, it works well. But after applying it across multiple projects, I've learned that it's not a dial you turn to maximum and walk away. Getting the ranges right is a skill in itself, and getting them wrong in either direction costs you weeks.

Under-randomize, and your policy overfits to the simulator. It will look excellent in evaluation, transfer poorly to hardware, and leave you chasing ghosts during field testing. I've been there. The policy "works" right up until you change the lighting in the lab by 20% or swap to a different batch of objects with slightly different surface texture. Small real-world variations that never appeared in training immediately expose the brittleness.

Over-randomize, and the policy learns nothing useful. If your physics parameters span a range that's too wide — friction coefficients that vary by 10x, textures that look nothing like reality, object masses that no real object would have — the policy can't find a consistent strategy. Training loss plateaus, success rates stagnate, and you spend days tuning rewards before realizing the environment itself is the problem.
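The tension above can be made concrete with a per-episode sampler. This is a minimal sketch, not any particular simulator's API; all parameter names and numeric ranges are illustrative. The commented-out friction entry shows the kind of ~10x spread that leaves no consistent strategy for the policy to find.

```python
import random

# Hypothetical per-episode randomization ranges (names and values illustrative).
# Narrow ranges stay close to a measured nominal; the commented-out wide
# friction range is the ~10x spread that stalls training.
RANGES = {
    "friction":         (0.55, 0.75),    # nominal ~0.65, roughly +/-15%
    # "friction":       (0.10, 1.00),    # too wide: no single strategy works
    "object_mass_kg":   (0.18, 0.22),    # +/-10% around a measured 0.20 kg
    "camera_noise_std": (0.005, 0.015),  # pixel-intensity noise, normalized
}

def sample_episode_params(ranges):
    """Draw one set of physics/visual parameters for a training episode."""
    return {name: random.uniform(lo, hi) for name, (lo, hi) in ranges.items()}

params = sample_episode_params(RANGES)
```

Sampling once per episode (rather than per step) is the common pattern: the policy experiences each draw as a coherent "world" it must succeed in, rather than physics that shifts under its feet.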

The approach I've found most reliable is iterative and measurement-driven. Start with a narrow randomization range centered on your best estimate of real-world parameters. Run a small hardware test early — even 20-30 rollouts — to identify which parameters matter most. Then widen those specific ranges while keeping others tight. Domain randomization should be targeted, not uniform.
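The "widen only what matters" step can be sketched as a small helper that scales one parameter's range about its midpoint while leaving the rest untouched. The parameter names and the 2x factor are assumptions for illustration, not a prescription.

```python
def widen(ranges, param, factor):
    """Widen one parameter's range about its midpoint by `factor`,
    leaving every other range untouched (targeted, not uniform)."""
    lo, hi = ranges[param]
    mid, half = (lo + hi) / 2, (hi - lo) / 2
    out = dict(ranges)
    out[param] = (mid - half * factor, mid + half * factor)
    return out

# Start narrow, centered on best real-world estimates (illustrative values):
ranges = {"friction": (0.55, 0.75), "object_mass_kg": (0.18, 0.22)}

# Suppose the early 20-30 hardware rollouts show friction mismatch dominates
# failures -- widen that range alone:
ranges = widen(ranges, "friction", 2.0)  # friction becomes (0.45, 0.85)
```

In practice you would re-run the hardware test after each widening and stop as soon as transfer stabilizes, rather than widening on a schedule.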

The other thing nobody mentions: randomization ranges need to be validated against your actual hardware, not against intuition. Measure your real camera noise, your actual friction variance, your gripper compliance. The best randomization range is one grounded in real measurements, not engineering guesswork.