Ethics in AI | Transcendent AI

The Checklist Shortcut to Smarter, Safer AI

This article explores Reinforcement Learning from Checklist Feedback (RLCF), a new approach that replaces reward models with checklists to align large language models. By breaking instructions into clear, verifiable steps, checklists provide richer, more interpretable feedback and consistently improve performance across benchmarks. The piece examines how this shift could make AI more reliable, transparent, and user-aligned.

Juan Manuel Ortiz de Zarate

Sep 4, 2025 · 12 min read

Understanding what the ML models have learned

Models can spread bias and discrimination if you don't know what they have learned. Here we show a technique to prevent that.

Juan Manuel Ortiz de Zarate

Aug 2, 2024 · 10 min read