Compute Among the Stars
Google’s Project Suncatcher envisions moving AI computation into orbit, building constellations of solar-powered satellites equipped with TPUs and laser interlinks. By harnessing the Sun’s near-constant energy and anticipated low-cost launches, the project proposes a scalable, space-based infrastructure for machine learning. It is a blueprint for computing beyond Earth, where data centers orbit and run on sunlight rather than fossil-fuel grids.

Juan Manuel Ortiz de Zarate
1 hour ago · 9 min read


AI Can Code, But Can It Engineer?
SWE-Bench Pro marks a turning point in evaluating AI coding agents. Built from complex, real-world software repositories, it reveals that even frontier models like GPT-5 and Claude Opus solve less than 25% of tasks. The benchmark exposes the gap between coding fluency and true engineering ability, redefining how progress toward autonomous software development should be measured.

Juan Manuel Ortiz de Zarate
6 days ago · 10 min read


The AlphaGo Moment of Neural Architecture Design
ASI-ARCH marks a breakthrough in AI self-innovation: an autonomous system that designs, codes, and validates new neural network architectures without human input. Conducting 1,773 experiments, it discovered 106 state-of-the-art models, revealing a scaling law for scientific discovery. Like AlphaGo’s Move 37, ASI-ARCH exposes principles beyond human intuition, signaling a new era where AI invents AI.

Juan Manuel Ortiz de Zarate
Oct 29 · 10 min read


The Lightning Mind
DeepSeek-V3.2-Exp introduces a new sparse-attention system that lets large language models handle ultra-long contexts efficiently. Using a “lightning indexer” to select only the most relevant tokens, it cuts computation costs while preserving reasoning power. The result is a faster, cheaper, and more cognitively elegant AI that learns what to ignore, bringing machine focus closer to human intelligence.
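To give a feel for the idea, here is a minimal NumPy sketch of top-k sparse attention in the spirit of a lightning indexer: a cheap scorer ranks past tokens, and full attention runs only over the selected few. The scoring rule, shapes, and the value of k are illustrative assumptions, not DeepSeek's actual implementation.

```python
# Hedged sketch: a lightweight indexer picks the top-k most relevant past
# tokens, then standard attention runs only over that sparse subset.
import numpy as np

def sparse_attention(q, keys, values, index_keys, k=64):
    """q: (d,), keys/values: (T, d), index_keys: (T, d_small) cheap per-token features."""
    # 1. Lightweight indexer: score every past token with a small dot product.
    index_q = q[: index_keys.shape[1]]                # toy projection, just for the sketch
    scores = index_keys @ index_q                     # (T,)
    top = np.argsort(scores)[-k:]                     # keep only the k highest-scoring tokens
    # 2. Full attention restricted to the selected tokens.
    att = keys[top] @ q / np.sqrt(keys.shape[1])      # (k,)
    w = np.exp(att - att.max()); w /= w.sum()         # softmax over the sparse set
    return w @ values[top]                            # (d,) context vector

T, d = 4096, 128
rng = np.random.default_rng(0)
out = sparse_attention(rng.normal(size=d), rng.normal(size=(T, d)),
                       rng.normal(size=(T, d)), rng.normal(size=(T, 16)))
```

The point of the sketch is the cost split: the indexer touches all T tokens but is cheap, while the expensive attention path only ever sees k of them.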

Juan Manuel Ortiz de Zarate
Oct 22 · 9 min read


The Tiny Trick That Tamed Giant Language Models
LoRA transformed how we fine-tune massive AI models like GPT-3. By adding tiny low-rank matrices instead of retraining billions of parameters, it made adaptation faster, cheaper, and accessible to everyone. This article explores the origins, mechanics, and lasting impact of the “small trick” that tamed the giants of artificial intelligence.
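As a rough illustration of the mechanism, the sketch below freezes a pretrained weight matrix and learns only a low-rank update B @ A. The dimensions, rank, and scaling factor are assumptions chosen for readability, not the paper's exact setup.

```python
# Minimal NumPy sketch of the LoRA idea: W stays frozen, only A and B train.
import numpy as np

d_out, d_in, r, alpha = 768, 768, 8, 16
rng = np.random.default_rng(0)

W = rng.normal(size=(d_out, d_in))          # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01       # trainable, rank r
B = np.zeros((d_out, r))                    # trainable, starts at zero so W is unchanged

def lora_forward(x):
    # Effective weight is W + (alpha / r) * B @ A, but it is never materialized;
    # the low-rank path adds only O(r * (d_in + d_out)) work per token.
    return W @ x + (alpha / r) * (B @ (A @ x))

y = lora_forward(rng.normal(size=d_in))
# Trainable parameters: r * (d_in + d_out) = 12,288 versus ~590k in W itself.
```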

Juan Manuel Ortiz de Zarate
Oct 15 · 11 min read


Inference at Scale
This article explores how to optimize large language model inference at scale, detailing techniques such as quantization, pruning, distillation, attention and cache optimization, speculative decoding, and dynamic batching. It explains the architectural bottlenecks, trade-offs, and engineering practices that enable faster, cheaper, and more efficient deployment of LLMs in real-world systems.
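As a taste of one technique from that list, here is a hedged sketch of post-training int8 weight quantization with a per-row scale. The symmetric scheme and shapes are illustrative assumptions, not a specific library's recipe.

```python
# Hedged sketch: symmetric int8 weight quantization, one scale per output row.
import numpy as np

def quantize_rows(W):
    """Quantize each row of W to int8 with its own scale."""
    scale = np.abs(W).max(axis=1, keepdims=True) / 127.0      # (rows, 1)
    q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_matvec(q, scale, x):
    # Weights stored 4x smaller; rescale after the matvec to recover the output range.
    return (q.astype(np.float32) @ x) * scale.squeeze(-1)

rng = np.random.default_rng(0)
W, x = rng.normal(size=(1024, 1024)), rng.normal(size=1024)
q, s = quantize_rows(W)
err = np.abs(W @ x - int8_matvec(q, s, x)).max()   # small reconstruction error
```

The trade-off the article discusses shows up directly here: memory and bandwidth drop by roughly 4x, at the cost of a bounded approximation error per row.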

Juan Manuel Ortiz de Zarate
Oct 8 · 9 min read