Benchmarking AI Across Disciplines
SuperGPQA evaluates LLMs across 285 disciplines with 26,529 questions, testing their reasoning and knowledge beyond traditional fields.

Juan Manuel Ortiz de Zarate
Feb 26 · 9 min read


Measuring Intelligence: Key Benchmarks and Metrics for LLMs
A comprehensive review of essential benchmarks and metrics for evaluating Large Language Models, from accuracy to fairness and conversationa

Juan Manuel Ortiz de Zarate
Nov 8, 2024 · 10 min read


Optimizing Machine Learning Models
Optimize ML models with Grid Search, Random Search, and Bayesian Optimization. Boost performance, reduce overfitting, and enhance metrics.

Juan Manuel Ortiz de Zarate
Oct 29, 2024 · 9 min read