Putting RL back in RLHF
