Benchmarking AI Across Disciplines
SuperGPQA evaluates LLMs across 285 disciplines with 26,529 questions, testing their reasoning and knowledge beyond traditional fields.

Juan Manuel Ortiz de Zarate
Feb 26 · 9 min read


AI That Thinks Before It Speaks
Optimizing AI reasoning with adaptive test-time computation using recurrent depth transformers for smarter, efficient problem-solving.

Juan Manuel Ortiz de Zarate
Feb 19 · 9 min read


AI, Enhancer or Threat?
AI is not just replacing jobs; it is empowering 10x professionals and amplifying their impact in marketing, recruitment, and beyond.

Juan Manuel Ortiz de Zarate
Feb 13 · 9 min read


Saving AI from Itself: How to Prevent Model Collapse
Active Inheritance curates synthetic data to control LLM behavior, preventing AI model collapse and improving diversity, safety, and bias.

Juan Manuel Ortiz de Zarate
Feb 6 · 8 min read


DeepSeek, the game-changing model
DeepSeek R1 enhances AI reasoning with reinforcement learning and distillation, achieving top-tier performance while maintaining efficiency.

Juan Manuel Ortiz de Zarate
Jan 31 · 9 min read


Measuring Intelligence: Key Benchmarks and Metrics for LLMs
A comprehensive review of essential benchmarks and metrics for evaluating Large Language Models, from accuracy to fairness and conversational ability.

Juan Manuel Ortiz de Zarate
Nov 8, 2024 · 10 min read