


When AI Slows You Down
This article analyzes a 2025 randomized controlled trial that challenges common assumptions about AI-enhanced software development. Contrary to expert and developer expectations, state-of-the-art AI tools slowed down experienced open-source contributors by 19%. Through detailed behavioral analysis and a review of contributing factors, the study reveals the hidden costs of AI assistance in complex, high-context coding environments.

Juan Manuel Ortiz de Zarate
Aug 2, 2025 · 11 min read


Misaligned Intelligence
This article explores the concept of agentic misalignment in large language models, based on Anthropic's 2025 study. Through the “Summit Bridge” simulation, it reveals how advanced AIs can adopt deceptive, coercive strategies when facing threats to their objectives. The piece analyzes experimental results, ethical implications, mitigation strategies, and the broader risks of deploying increasingly autonomous AI systems without robust safeguards.

Juan Manuel Ortiz de Zarate
Jul 17, 2025 · 10 min read


AI Against Racism
This article explores how an open-source AI system helped Santa Clara County identify and redact thousands of racially restrictive covenants buried in millions of historical property deeds. By fine-tuning a legal-specific language model, the project achieved near-perfect accuracy while cutting costs dramatically. The work demonstrates how AI can support legal reform, scale archival justice, and preserve public accountability.

Juan Manuel Ortiz de Zarate
Jul 4, 2025 · 10 min read


The Illusion of Thinking: Understanding Reasoning Models in AI
This article explores the limits of reasoning in large language models, revealing how their apparent intelligence breaks down under increasing complexity. Using controlled puzzle environments, it analyzes their “thinking traces” and uncovers patterns of overthinking, execution failures, and lack of adaptability. The findings raise critical questions for building AI systems capable of genuine reasoning.

Juan Manuel Ortiz de Zarate
Jun 26, 2025 · 10 min read


The Architecture That Redefined AI
This article offers a deep dive into the seminal paper Attention Is All You Need, which introduced the Transformer architecture. It explores the limitations of recurrent models, the mechanics of self-attention, training strategies, and the Transformer’s groundbreaking performance on machine translation tasks. The article also highlights the architecture’s enduring legacy as the foundation for modern NLP systems like BERT and GPT.

Juan Manuel Ortiz de Zarate
May 27, 2025 · 9 min read


Training Harmless AI at Scale
This article explores Constitutional AI, a framework developed by Anthropic to train AI systems that are helpful, harmless, and non-evasive, without relying on human labels for harmfulness. By guiding models through critique–revision loops and reinforcement learning from AI-generated feedback, this method offers a scalable, transparent alternative to RLHF and advances the field of AI alignment and self-supervised safety.

Juan Manuel Ortiz de Zarate
May 8, 2025 · 11 min read


Foundation Models
Foundation models like GPT-3 and CLIP are reshaping AI by enabling general-purpose systems trained on massive, unlabelled data. This article explores their key concepts—emergence and homogenization—their capabilities across language, vision, and more, and the risks they pose, from bias to environmental impact. Based on the Stanford report, it highlights why foundation models are powerful, unpredictable, and demand responsible development.

Juan Manuel Ortiz de Zarate
May 7, 2025 · 9 min read


How Bigger Models Get Better
This article explores the groundbreaking findings of Kaplan et al. on scaling laws for neural language models. It explains how model performance improves predictably with increased model size, dataset size, and compute budget, highlighting power-law relationships. The piece discusses implications for efficient AI training, optimal resource allocation, overfitting avoidance, and future research directions.

Juan Manuel Ortiz de Zarate
Apr 30, 2025 · 10 min read


How AI is Transforming Science and Medicine
This article explores how AI is transforming science and medicine in 2025. From breakthroughs in protein engineering and brain mapping to outperforming doctors in clinical diagnosis, AI is becoming an active research partner and clinical assistant. It highlights key findings from Stanford’s AI Index Report, including the rise of virtual labs, predictive healthcare models, AI scribes, and the importance of ethical, inclusive, and regulated deployment.

Juan Manuel Ortiz de Zarate
Apr 15, 2025 · 11 min read


Bringing Foundation Models to Small Data
This article explores TabPFN, a transformer-based foundation model designed for small tabular datasets. Trained on millions of synthetic datasets generated via structural causal models, TabPFN learns to predict labels through in-context learning. It outperforms traditional methods like CatBoost and XGBoost in both speed and accuracy, while offering robustness, interpretability, and fine-tuning capabilities. A breakthrough in tabular ML, it redefines what's possible on structu

Juan Manuel Ortiz de Zarate
Apr 11, 2025 · 11 min read


Can a Chatbot Make Us Feel Better (or Worse)?
Can AI chatbots comfort us—or make us dependent? A study explores ChatGPT's emotional impact and the ethics of affective design.

Juan Manuel Ortiz de Zarate
Apr 4, 2025 · 9 min read


Diffusion LLM: Closer to Human Thought
SEDD redefines generative AI with human-like reasoning, enabling faster, high-quality text and code through discrete diffusion models.

Juan Manuel Ortiz de Zarate
Mar 7, 2025 · 9 min read


Benchmarking AI Across Disciplines
SuperGPQA evaluates LLMs across 285 disciplines with 26,529 questions, testing their reasoning and knowledge beyond traditional fields.

Juan Manuel Ortiz de Zarate
Feb 26, 2025 · 9 min read


Saving AI from Itself: How to Prevent Model Collapse
Active Inheritance curates synthetic data to control LLM behavior, preventing AI model collapse and improving diversity, safety, and bias.

Juan Manuel Ortiz de Zarate
Feb 6, 2025 · 8 min read


DeepSeek, the game-changing model
DeepSeek R1 enhances AI reasoning with reinforcement learning and distillation, achieving top-tier performance while maintaining efficiency.

Juan Manuel Ortiz de Zarate
Jan 31, 2025 · 9 min read


Measuring Intelligence: Key Benchmarks and Metrics for LLMs
A comprehensive review of essential benchmarks and metrics for evaluating Large Language Models, from accuracy to fairness and conversationa

Juan Manuel Ortiz de Zarate
Nov 8, 2024 · 10 min read


Orca: The New LLM Teacher
Orca 2: A smaller AI model that rivals larger ones by mastering task-specific reasoning, achieving high performance with less computation.

Juan Manuel Ortiz de Zarate
Oct 9, 2024 · 9 min read


Data Balancing With K-Means
A clustering-based method balances web-scraped datasets, improving AI model performance by ensuring diverse and uniform data representation.

Juan Manuel Ortiz de Zarate
Oct 4, 2024 · 9 min read


AI Researchers
AI Scientist automates research, generating ideas, running experiments, and writing papers, challenging AI's role in novel scientific discov

Juan Manuel Ortiz de Zarate
Aug 27, 2024 · 9 min read


Understanding what the ML models have learned
Models can spread bias and discrimination when no one checks what they have learned. This article presents a technique to help prevent that.

Juan Manuel Ortiz de Zarate
Aug 2, 2024 · 10 min read