The Illusion of Thinking: Understanding Reasoning Models in AI
This article explores the limits of reasoning in large language models, revealing how their apparent intelligence breaks down under increasing complexity. Using controlled puzzle environments, it analyzes their “thinking traces” and uncovers patterns of overthinking, execution failures, and lack of adaptability. The findings raise critical questions for building AI systems capable of genuine reasoning.

Juan Manuel Ortiz de Zarate
Jun 26 · 10 min read


The Architecture That Redefined AI
This article offers a deep dive into the seminal paper Attention Is All You Need, which introduced the Transformer architecture. It explores the limitations of recurrent models, the mechanics of self-attention, training strategies, and the Transformer’s groundbreaking performance on machine translation tasks. The article also highlights the architecture’s enduring legacy as the foundation for modern NLP systems like BERT and GPT.

Juan Manuel Ortiz de Zarate
May 27 · 9 min read


Training Harmless AI at Scale
This article explores Constitutional AI, a framework developed by Anthropic to train AI systems that are helpful, harmless, and non-evasive—without relying on human labels for harmfulness. By guiding models through critique–revision loops and reinforcement learning from AI-generated feedback, this method offers a scalable, transparent alternative to RLHF and advances the field of AI alignment and self-supervised safety.

Juan Manuel Ortiz de Zarate
May 9 · 11 min read


The Power of Convolutional Neural Networks
Convolutional Neural Networks have revolutionized artificial intelligence by enabling machines to process visual data with remarkable accuracy. Inspired by the visual cortex, CNNs evolved from early models like LeNet-5 to powerful architectures such as AlexNet, VGG, ResNet, and DenseNet. This article explores the core concepts of CNNs, key innovations, real-world applications, and future trends, highlighting their enduring impact on AI.

Juan Manuel Ortiz de Zarate
Apr 26 · 10 min read


Bringing Foundation Models to Small Data
This article explores TabPFN, a transformer-based foundation model designed for small tabular datasets. Trained on millions of synthetic datasets generated via structural causal models, TabPFN learns to predict labels through in-context learning. It outperforms traditional methods like CatBoost and XGBoost in both speed and accuracy, while offering robustness, interpretability, and fine-tuning capabilities. A breakthrough in tabular ML, it redefines what's possible on structured data.

Juan Manuel Ortiz de Zarate
Apr 11 · 11 min read


Diffusion Models: From Noise to Masterpiece
Explore how Diffusion Models are revolutionizing generative AI, from their mathematical foundations to applications in image and audio.

Juan Manuel Ortiz de Zarate
Mar 20 · 8 min read


The Brains Behind AI’s Evolution
Discover how neural networks power modern AI, from deep learning to generative models, shaping the future of technology and innovation.

Juan Manuel Ortiz de Zarate
Mar 14 · 9 min read


Diffusion LLM: Closer to Human Thought
SEDD redefines generative AI with human-like reasoning, enabling faster, high-quality text and code through discrete diffusion models.

Juan Manuel Ortiz de Zarate
Mar 7 · 9 min read


AI That Thinks Before It Speaks
Optimizing AI reasoning with adaptive test-time computation using recurrent depth transformers for smarter, efficient problem-solving.

Juan Manuel Ortiz de Zarate
Feb 19 · 9 min read


Saving AI from Itself: How to Prevent Model Collapse
Active Inheritance curates synthetic data to steer LLM behavior, preventing model collapse and improving diversity and safety while reducing bias.

Juan Manuel Ortiz de Zarate
Feb 6 · 8 min read


The Fundamental Weapon Against Overfitting
A detailed guide on regularization techniques (L1, L2, Elastic Net, Dropout, Early Stopping) to prevent overfitting in machine learning models.

Juan Manuel Ortiz de Zarate
Oct 16, 2024 · 10 min read


Orca: The New LLM Teacher
Orca 2: A smaller AI model that rivals larger ones by mastering task-specific reasoning, achieving high performance with less computation.

Juan Manuel Ortiz de Zarate
Oct 9, 2024 · 9 min read


Harnessing the Power of Bagging in Ensemble Learning
Boost your model's accuracy with bagging! Learn how ensemble techniques can stabilize predictions and improve performance.

Juan Manuel Ortiz de Zarate
Aug 7, 2024 · 10 min read


Biases in LLMs
Explore the hidden biases in LLMs and their impact. Which sectors of society do their opinions really reflect?

Juan Manuel Ortiz de Zarate
Jul 17, 2024 · 10 min read


Retrieval-Augmented Generation: Increasing the Knowledge of Your LLM
Dive into the world of Retrieval-Augmented Generation! See how RAG transforms AI responses by blending retrieval with generation.

Juan Manuel Ortiz de Zarate
May 25, 2024 · 9 min read


The Mathematics of Language
Computers model text with vectors. Using Word2Vec, FastText, and Transformers, they understand and generate context-aware text. Learn how!

Juan Manuel Ortiz de Zarate
May 25, 2024 · 9 min read


MLflow + Hydra: A Framework for Experimentation with Python
In this article I share an experimentation framework I use in my daily work. It relies on MLflow and Hydra to facilitate hypothesis testing.
Cristian Cardellino
May 23, 2024 · 10 min read


The Age of Digital Deception: The Dangers of Deep Fakes
Recent years have given rise to a dangerous use of neural network technology: deep fakes. We'll explore them in this article.
Cristian Cardellino
Apr 27, 2024 · 13 min read


Generative Adversarial Networks (GANs): A Comprehensive Exploration
Let's explore Generative Adversarial Networks (GANs), one of the pioneering generative AI techniques capable of producing photorealistic images.
Cristian Cardellino
Apr 10, 2024 · 11 min read


A Brief Introduction to Mixtures-of-Experts
In this article, we will explore the Mixture-of-Experts (MoE) architecture and discuss the idea behind the gating mechanism used by Sparse MoE.
Cristian Cardellino
Mar 26, 2024 · 8 min read