The following steps demonstrate that the surrogate objective, which uses importance sampling, is equivalent to the standard on-policy objective in reinforcement learning. Arrange these mathematical steps in the correct logical order to form the complete derivation.
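A minimal sketch of the derivation in its correct order, written for a discrete trajectory space (sums; replace the sums with integrals for continuous τ). The only assumption is that Pr_θ_ref(τ) > 0 wherever Pr_θ(τ) > 0, so the ratio is well defined:

```latex
\begin{aligned}
J_{\text{surrogate}}(\theta)
&= \mathbb{E}_{\tau \sim \pi_{\theta_{\text{ref}}}}\!\left[\frac{\Pr_{\theta}(\tau)}{\Pr_{\theta_{\text{ref}}}(\tau)}\, R(\tau)\right]
&& \text{(definition of the surrogate objective)} \\
&= \sum_{\tau} \Pr_{\theta_{\text{ref}}}(\tau)\,\frac{\Pr_{\theta}(\tau)}{\Pr_{\theta_{\text{ref}}}(\tau)}\, R(\tau)
&& \text{(expand the expectation as a sum)} \\
&= \sum_{\tau} \Pr_{\theta}(\tau)\, R(\tau)
&& \text{(the reference probabilities cancel)} \\
&= \mathbb{E}_{\tau \sim \pi_{\theta}}\![R(\tau)]
= J_{\text{on-policy}}(\theta)
&& \text{(definition of the on-policy objective)}
\end{aligned}
```

The key step is the cancellation: the sampling probability Pr_θ_ref(τ) multiplies the importance ratio Pr_θ(τ)/Pr_θ_ref(τ) and drops out exactly.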
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Comprehension in Revised Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
In reinforcement learning, the standard on-policy objective is defined as the expected reward under the current policy π_θ: J_on-policy(θ) = E_{τ ~ π_θ} [R(τ)]. An alternative 'surrogate' objective uses importance sampling to evaluate the policy using data from a reference policy π_θ_ref: J_surrogate(θ) = E_{τ ~ π_θ_ref} [ (Pr_θ(τ) / Pr_θ_ref(τ)) * R(τ) ]. What is the key mathematical step that demonstrates that J_surrogate(θ) is exactly equivalent to J_on-policy(θ)?
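The cancellation can be checked numerically on a toy discrete trajectory space. Everything below (the three trajectories, the probabilities, the rewards) is an illustrative assumption, not part of the original question; the point is only that the two objectives agree exactly when computed as full sums:

```python
# Toy setup: three "trajectories" with probabilities under the current
# policy (p_theta), the reference policy (p_ref), and rewards R(tau).
# p_ref must be > 0 wherever p_theta > 0 for the ratio to be defined.
p_theta = [0.5, 0.3, 0.2]
p_ref   = [0.2, 0.4, 0.4]
rewards = [1.0, -2.0, 3.0]

# On-policy objective: E_{tau ~ p_theta}[R(tau)]
j_on_policy = sum(pt * r for pt, r in zip(p_theta, rewards))

# Surrogate objective: E_{tau ~ p_ref}[(p_theta / p_ref) * R(tau)]
j_surrogate = sum(pr * (pt / pr) * r
                  for pt, pr, r in zip(p_theta, p_ref, rewards))

print(j_on_policy, j_surrogate)  # identical: p_ref cancels term by term
```

In practice the surrogate is estimated from finitely many samples drawn from π_θ_ref, so the two quantities agree in expectation but the Monte Carlo estimate carries variance that grows with the mismatch between the two policies.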
Critique of Surrogate Objective Approximation