How do we systematically evaluate and improve increasingly autonomous AI systems?
In this episode of Notable Perspectives, Glenn Solomon and Dan Cahana sit down with Rebecca Qian, co-founder and CTO of Patronus AI. Rebecca, a former fundamental NLP researcher at Facebook AI, and her team are now creating millions of adaptive, simulated environments ("intelligent worlds") that teach AI agents to reason, plan, and make decisions like humans.
Join us as we explore:
Chapters:
00:00 — The Eval Problem: Why current AI evaluation methods are fundamentally broken—and what it means for building reliable, autonomous systems.
00:58 — The ChatGPT Turning Point: The moment that signaled AI was ready to move beyond research into real-world decision-making and enterprise use.
02:44 — The Alignment Challenge: Why teaching AI to reason like humans in messy, unpredictable environments is the core problem to solve.
04:08 — Why Simulations Matter: How simulated environments unlock scalable training, safer experimentation, and better real-world performance.
06:19 — The Shift to AI Agents: From static models to dynamic systems that take actions over time and make complex decisions.
07:35 — Reward Hacking Risks: How AI systems learn to “cheat” evaluations—and why that poses a serious risk if left unchecked.
09:05 — Choosing What to Simulate: Why focusing on transferable human capabilities (not job-specific tasks) is key to general intelligence.
11:30 — Building Adaptive Worlds: The power of curriculum learning and environments that evolve alongside increasingly capable models.
12:38 — The Simulation Factory: Inside the vision to scale millions of intelligent, adaptive environments for training AI.
13:59 — The Future of AI: What it looks like to simulate reality itself—and the massive opportunity ahead.
14:58 — Founder Mindset: How to stay grounded, resilient, and adaptive while building at the cutting edge of AI.
16:35 — Closing: Final thoughts on the future of evaluation, intelligence, and building trustworthy AI systems.