
LLM Stats
Active · Independent AI evaluations lab
This company has previously operated under the names “ZeroEval” and “CallingBox”. A rename frequently marks a pivot in positioning or product, which makes it useful raw material for variant ideas.
About
We build independent, contamination-proof benchmarks that measure real-world performance. LLM Stats is the most complete LLM leaderboard: we maintain the most complete archive of LLM benchmark results and run our own independent evaluations, which go beyond the classical benchmarks already present in the training data of most models. Our mission is to become the biggest community dedicated to AI transparency.
Founders · 2
Co-Founder at CallingBox. Previously, I founded LLM-Stats.com (500k MAU) and was an early employee on the LLM Observability team at Datadog. I did undergraduate research on Vision Transformers for particle physics and RL for robotics.
Co-Founder @ LLM Stats. Previously a founding engineer at Micro (backed by a16z), building the future of email, and a founding engineer at Atrato Pago (W21). I also built and scaled Minecraft servers in my spare time during high school.
Related startups

The LLM Eval and Observability Platform for AI Quality

Frontier models for critical domains



