Shoothill AI Trust Index
Live · 10 models tracked
AI models change quietly. The one you trusted last month may already be making things up, or quietly ignoring your instructions. Shoothill AI Trust Index keeps watch, every hour, and tells you when something slips. Free for life.
The companies behind AI models push updates without warning. A model that was rock-solid last week can start hallucinating, ignoring instructions, or quietly getting worse. You'll only find out when a customer notices. Trust Index is the early-warning system.
We measure how often each model invents a fact, especially in medical, legal, and finance questions. So you know the actual rate, not just the vibe.
Multi-step maths, logic, and planning: the kind of thinking your team actually relies on a model for. We update the test set as the bar moves.
Catches the silent slips: ignoring formatting rules, breaking persona, drifting off-brief. The kind of regression that quietly breaks AI features in production.
Compares each new score to the model's recent history. An email lands the moment something shifts past a threshold you set.
Tasks pulled from real enterprise jobs: drafting emails, extracting data, classifying documents, summarising. Demos look easy; we test the messy ones.
Every score is timestamped and exportable. Pass risk reviews and audits with a paper trail, not just an opinion.
No black-box scoring. Sample tests and the full grading methodology are published; the rest of the test set is kept private so model providers can't train against the exact prompts. Same questions every run, so scores stay comparable as the world moves on.
Every hour, we put each tracked model through the same fixed library of test cases: bespoke enterprise scenarios, kept private so providers can't train against the exact prompts.
Each answer is checked against the right answer, by rules that don't change between runs. So scores today and last week are directly comparable.
Five categories combine into one Trust Index per model: truthfulness, reasoning, instruction adherence, stability, and business readiness.
Set the limits you care about. We email you the moment a model you watch crosses one.
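As a rough illustration of the last two steps, here is a minimal Python sketch. It is not Trust Index's actual scoring code: the equal weighting of the five categories, the 0-100 scale, the use of a simple mean of recent scores as the baseline, and the names trust_index, has_slipped, and max_drop are all assumptions made for the example.

```python
from statistics import mean

# The five categories named above. Equal weighting is an assumption.
CATEGORIES = (
    "truthfulness",
    "reasoning",
    "instruction_adherence",
    "stability",
    "business_readiness",
)


def trust_index(scores: dict[str, float]) -> float:
    """Combine five category scores (assumed 0-100) into one number."""
    missing = [c for c in CATEGORIES if c not in scores]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    return mean(scores[c] for c in CATEGORIES)


def has_slipped(history: list[float], latest: float, max_drop: float) -> bool:
    """Return True when the latest score falls more than max_drop
    below the mean of the recent history (hypothetical alert rule)."""
    if not history:
        return False  # nothing to compare against yet
    return mean(history) - latest > max_drop
```

With a recent history of 90, 90, 90 and a limit of 5 points, a new score of 80 would trigger an alert, while a new score of 88 would not.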
Trust Index isn't built for ML researchers. It's for the people responsible for whether AI in their organisation can be trusted with real work: compliance leads, IT, product owners, and operations teams who'd rather know about a regression before a customer does.
You picked a model for client-facing work. Six months later, your auditor asks how you know it still meets policy. Trust Index gives you a dated, exportable record of every score since the day you started watching.
Your team has GPT-5.5 in a production feature. The provider quietly updates the model and it starts ignoring your formatting rules. You see it on your dashboard the next morning, not in a customer support ticket.
Choosing between Claude Opus and Gemini 2.5 Pro for a contract summariser? See months of side-by-side performance on the categories that actually matter for the task.
A drafted reply that's 95% right and 5% invented is the worst kind of mistake. Trust Index tracks hallucination rate per model so you know when to retrain or switch.
The C-suite asks if the firm's AI is working. Trust Index lets you answer with months of independent, dated evidence instead of a vendor's marketing slide.
Bringing AI into the business for the first time? Trust Index gives you a no-vendor, no-spin view of how every major model has performed, week after week.
Shoothill helps businesses work smarter and become more efficient. Since 2006 we've delivered 400+ projects: bespoke software, IT infrastructure, creative and marketing services, managed cyber security. Trust Index is the free benchmark we built along the way.
Copilot, modern workplace, digital transformation. Invest in the right places first.
Sharp creative, smart SEO, print and digital campaigns that actually move the needle.
Custom web apps, mobile apps, and AI tailored to your team's real problems.
Managed IT, cyber security, connectivity. The hard part of keeping things live, handled.
Trust Index is free, forever. Pick the models your team uses, set the alerts you want, and go back to your day. We'll let you know when something changes.
Shoothill helps businesses pick, build, and operate AI that's safe, useful, and commercially viable. Fill this in and we'll get back to you within one working day.