Gate News message, April 27 — Logan Kilpatrick, senior product manager at Google DeepMind and product lead for Google AI Studio, stated on X that every company building AI-based products should establish its own custom benchmarks to measure AI model performance. He described this as a method to make model improvements “disproportionately benefit your company” and urged founders and business leaders to “start tomorrow.”
Most companies currently rely on public leaderboards to select AI models, but leaderboards measure general capabilities that often do not match specific business scenarios. Kilpatrick cited the example of a contract review company whose main concern is clause extraction accuracy, a capability that public benchmarks do not cover, so leaderboard rankings cannot show how a model performs on that task. Custom benchmarks offer two key advantages. First, they let a company evaluate each model update against its own business tasks and choose the model that performs best in its actual use case rather than the highest-ranked model overall. Second, they let a company share its test sets with model providers, driving continuous optimization in the areas that matter to its business.
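To make the idea concrete, the following is a minimal sketch of what a custom benchmark for the contract review example could look like. It is an illustration only, not anything Kilpatrick published: the `Case` type, the clause labels, the sample contract sentences, and the two keyword-based "models" are all hypothetical stand-ins (in practice each model would be a call to a provider's API), and set-based F1 is an assumed scoring choice.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Set

# One benchmark case: a contract snippet plus the clause labels a reviewer
# expects to be extracted. All labels and texts here are made-up examples.
@dataclass(frozen=True)
class Case:
    contract_text: str
    expected_clauses: FrozenSet[str]

# A "model" is any callable that maps contract text to a set of clause labels.
Model = Callable[[str], Set[str]]

def f1(predicted: Set[str], expected: Set[str]) -> float:
    """Set-based F1 between predicted and expected clause labels."""
    if not predicted and not expected:
        return 1.0
    true_positives = len(predicted & expected)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(expected)
    return 2 * precision * recall / (precision + recall)

def run_benchmark(models: Dict[str, Model], cases: List[Case]) -> Dict[str, float]:
    """Return the mean F1 per model over all benchmark cases."""
    return {
        name: sum(f1(model(c.contract_text), set(c.expected_clauses)) for c in cases)
        / len(cases)
        for name, model in models.items()
    }

def pick_best(scores: Dict[str, float]) -> str:
    """Name of the model with the highest mean score."""
    return max(scores, key=scores.get)

def keyword_model(keywords: Dict[str, str]) -> Model:
    """Hypothetical stand-in for a real model: label a clause if its keyword appears."""
    def extract(text: str) -> Set[str]:
        lowered = text.lower()
        return {label for label, word in keywords.items() if word in lowered}
    return extract

CASES = [
    Case(
        "Either party may terminate this agreement with 30 days notice. "
        "Liability is capped at fees paid.",
        frozenset({"termination", "liability"}),
    ),
    Case(
        "The supplier shall indemnify the buyer against third-party claims.",
        frozenset({"indemnification"}),
    ),
]

MODELS = {
    "model_a": keyword_model({"termination": "terminate", "liability": "liability"}),
    "model_b": keyword_model({
        "termination": "terminate",
        "liability": "liability",
        "indemnification": "indemnify",
    }),
}

scores = run_benchmark(MODELS, CASES)
best = pick_best(scores)
print(scores, "best:", best)
```

Rerunning the same `CASES` against each new model release gives a task-specific score that can be compared over time, and the case list itself is the artifact a company could share with a model provider.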
Kilpatrick noted that companies like Zapier and Sierra are already implementing this approach, stating that “there is a lot of alpha that can be created here.”