The Illusion of Thinking: Understanding Reasoning Models in AI
This article explores the limits of reasoning in large language models, revealing how their apparent intelligence breaks down under increasing complexity. Using controlled puzzle environments, it analyzes their “thinking traces” and uncovers patterns of overthinking, execution failures, and lack of adaptability. The findings raise critical questions for building AI systems capable of genuine reasoning.

Juan Manuel Ortiz de Zarate
Jun 26 · 10 min read


The Architecture That Redefined AI
This article offers a deep dive into the seminal paper Attention Is All You Need, which introduced the Transformer architecture. It explores the limitations of recurrent models, the mechanics of self-attention, training strategies, and the Transformer’s groundbreaking performance on machine translation tasks. The article also highlights the architecture’s enduring legacy as the foundation for modern NLP systems like BERT and GPT.

Juan Manuel Ortiz de Zarate
May 27 · 9 min read


Training Harmless AI at Scale
This article explores Constitutional AI, a framework developed by Anthropic to train AI systems that are helpful, harmless, and non-evasive—without relying on human labels for harmfulness. By guiding models through critique–revision loops and reinforcement learning from AI-generated feedback, this method offers a scalable, transparent alternative to RLHF and advances the field of AI alignment and self-supervised safety.

Juan Manuel Ortiz de Zarate
May 9 · 11 min read


The Power of Convolutional Neural Networks
Convolutional Neural Networks have revolutionized artificial intelligence by enabling machines to process visual data with remarkable accuracy. Inspired by the visual cortex, CNNs evolved from early models like LeNet-5 to powerful architectures such as AlexNet, VGG, ResNet, and DenseNet. This article explores the core concepts of CNNs, key innovations, real-world applications, and future trends, highlighting their enduring impact on AI.

Juan Manuel Ortiz de Zarate
Apr 26 · 10 min read


Bringing Foundation Models to Small Data
This article explores TabPFN, a transformer-based foundation model designed for small tabular datasets. Trained on millions of synthetic datasets generated via structural causal models, TabPFN learns to predict labels through in-context learning. It outperforms traditional methods like CatBoost and XGBoost in both speed and accuracy, while offering robustness, interpretability, and fine-tuning capabilities. A breakthrough in tabular ML, it redefines what's possible on structured data.

Juan Manuel Ortiz de Zarate
Apr 11 · 11 min read


Diffusion Models: From Noise to Masterpiece
Explore how Diffusion Models are revolutionizing generative AI, from their mathematical foundations to applications in image and audio.

Juan Manuel Ortiz de Zarate
Mar 20 · 8 min read


The Brains Behind AI’s Evolution
Discover how neural networks power modern AI, from deep learning to generative models, shaping the future of technology and innovation.

Juan Manuel Ortiz de Zarate
Mar 14 · 9 min read


Diffusion LLM: Closer to Human Thought
SEDD redefines generative AI with human-like reasoning, enabling faster, high-quality text and code through discrete diffusion models.

Juan Manuel Ortiz de Zarate
Mar 7 · 9 min read


AI That Thinks Before It Speaks
Optimizing AI reasoning with adaptive test-time computation using recurrent depth transformers for smarter, efficient problem-solving.

Juan Manuel Ortiz de Zarate
Feb 19 · 9 min read


Saving AI from Itself: How to Prevent Model Collapse
Active Inheritance curates synthetic data to steer LLM behavior, preventing model collapse and improving diversity and safety while reducing bias.

Juan Manuel Ortiz de Zarate
Feb 6 · 8 min read


The Fundamental Weapon Against Overfitting
A detailed guide on regularization techniques (L1, L2, Elastic Net, Dropout, Early Stopping) to prevent overfitting in machine learning models.

Juan Manuel Ortiz de Zarate
Oct 16, 2024 · 10 min read


Orca: The New LLM Teacher
Orca 2: A smaller AI model that rivals larger ones by mastering task-specific reasoning, achieving high performance with less computation.

Juan Manuel Ortiz de Zarate
Oct 9, 2024 · 9 min read


Harnessing the Power of Bagging in Ensemble Learning
Boost your model's accuracy with bagging! Learn how ensemble techniques can stabilize predictions and improve performance.

Juan Manuel Ortiz de Zarate
Aug 7, 2024 · 10 min read


Biases in LLMs
Explore the hidden biases in LLMs and their impact. Which sectors of society do their opinions really reflect?

Juan Manuel Ortiz de Zarate
Jul 17, 2024 · 10 min read


Retrieval-Augmented Generation: Increasing the Knowledge of Your LLM
Dive into the world of Retrieval-Augmented Generation! See how RAG transforms AI responses by blending retrieval with generation.

Juan Manuel Ortiz de Zarate
May 25, 2024 · 9 min read


The Mathematics of Language
Computers model text with vectors. Using Word2Vec, FastText, and Transformers, they understand and generate context-aware text. Learn how!

Juan Manuel Ortiz de Zarate
May 25, 2024 · 9 min read


MLflow + Hydra: A Framework for Experimentation with Python
In this article I share an experimentation framework I use in my daily work. It relies on MLflow and Hydra to facilitate hypothesis testing.
Cristian Cardellino
May 23, 2024 · 10 min read


The Age of Digital Deception: The Dangers of Deep Fakes
Recent years have given rise to a dangerous use of neural network technology: deep fakes. We'll explore them in this article.
Cristian Cardellino
Apr 27, 2024 · 13 min read


Generative Adversarial Networks (GANs): A Comprehensive Exploration
Let's explore Generative Adversarial Networks (GANs), one of the pioneering generative AI techniques capable of producing photorealistic images.
Cristian Cardellino
Apr 10, 2024 · 11 min read


A Brief Introduction to Mixtures-of-Experts
In this article, we will explore the Mixture-of-Experts (MoE) architecture and discuss the idea behind the gating mechanism used by Sparse MoE.
Cristian Cardellino
Mar 26, 2024 · 8 min read