MetriLLM is an open-source MCP server for benchmarking local Large Language Models (LLMs) running on your machine. It exposes tools for testing speed, quality (reasoning, coding, math, instruction-following, and more), and hardware fit (RAM/memory efficiency, tokens per second, time to first token). Results can be shared to a public leaderboard and accessed programmatically or through AI assistants (such as Claude, Cursor, Windsurf, or any MCP-compatible client). MetriLLM mainly integrates with local LLM runtimes such as Ollama and LM Studio, and is aimed at LLM developers, AI power users, and anyone optimizing local AI model deployments.
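To make the two speed metrics named above concrete, here is a minimal sketch of how time to first token and tokens per second could be computed from any streaming token source. It is not MetriLLM's implementation or API: the function `measure_stream`, its `clock` parameter, and the `fake_stream` generator are hypothetical names introduced only for illustration, and the throughput here is assumed to be end-to-end (it includes the wait for the first token).

```python
import time
from typing import Callable, Iterable


def measure_stream(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Time a token stream (hypothetical helper, not a MetriLLM API).

    Returns the token count, time to first token in seconds, and
    end-to-end tokens per second (first-token wait included).
    """
    start = clock()
    first_token_at = None
    last_token_at = start
    count = 0
    for _ in tokens:
        now = clock()
        if first_token_at is None:
            first_token_at = now  # arrival time of the first token
        last_token_at = now
        count += 1
    if count == 0:
        return {"tokens": 0, "ttft_s": None, "tokens_per_s": 0.0}
    total = last_token_at - start
    return {
        "tokens": count,
        "ttft_s": first_token_at - start,
        "tokens_per_s": count / total if total > 0 else float("inf"),
    }


def fake_stream(n: int, delay_s: float):
    """Stand-in for a local model's streaming output (assumption)."""
    for i in range(n):
        time.sleep(delay_s)
        yield f"tok{i}"


if __name__ == "__main__":
    print(measure_stream(fake_stream(5, 0.01)))
```

In practice the `tokens` argument would be whatever iterator a local runtime's streaming response provides. Some benchmarks report decode throughput separately by excluding the time to first token, so it is worth checking which definition a given tool uses before comparing numbers.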
Visit MetriLLM's official website for product details and to get started.