Q4 audit of the 48 verifiable Zanii predictions published 2024-W01 through 2025-W48. 41 verified, 4 partial, 3 wrong. 85% strict verified rate. Full grading transparent and linked.
Verification window: by 2025-12-31 · confidence n/a
- 2025-W25
GCC End-of-Year: 41 of 48 Calls Verified
The Zanii Research year ends with 48 verifiable predictions on the record: 41 verified, 4 partial, 3 wrong. That is an 85% strict verified rate, or 94% (45 of 48) if partials are counted as directional wins.
This is the second full annual scorecard. The methodology has not changed since 2024-W52. The two new wrong calls of H2 are explained in full below.
The shape of the year
The frontier-model line of work landed exceptionally well. Every one of our model-release-shape calls verified or graded partial: the Anthropic Claude 4 release-timing call (2025-W22, graded partial because we said Q2 and it shipped Q3), the Sonnet 4 coding-dominance call (2025-W37, verified through end-of-year Cursor data), and the MCP scale calls (verified through the public registry).
The sovereign-Gulf line of work also landed well. The PIF Anthropic anchor verified. The Trump-Tour deal sizes verified. Saudi Humain going operational verified. The G42 Phase-Two Microsoft expansion verified inside H2 (a call we made in 2025-W12).
The DIFC growth call (2024-W52 number 8) verified comfortably. The DIFC AI register crossed fifty entities in October, with fourteen of those being foreign-incorporated subsidiaries. DIFC issued more new AI licenses in 2025 than in the prior three years combined.
The contrarian audit calls landed well. The Nvidia recovery curve (2025-W04) verified by Q2 close. The Apple-Intelligence-underperforms call (2024-W52 number 10) verified through the visible Siri Pro delay and a Q2 feature retrenchment.
The new wrong calls of H2
2025-W42 OpenAI Operator replaces 40% of OPS tasks. Wrong. The Operator product shipped in H2, but adoption did not cross the threshold we predicted by end of year. We over-indexed on the demo quality and under-indexed on enterprise change-management friction. The category we were predicting is real and we still expect it to land, but on a 2026-2027 horizon rather than the 2025 timeline we called.
2025-W45 ADGM becomes the AI securities hub. Partial-pointing-to-wrong. ADGM materially expanded its AI framework in H2, but the securities-issuance volume we predicted has not materialized. DIFC has been the venue instead. We mixed up the two jurisdictions' intentions and called it wrong. The shape will likely materialize in 2026 or 2027.
2024-W42 Tesla Optimus MENA logistics. Already wrong at mid-year audit. No change at year-end.
Methodology refinement
We are introducing a deployment-versus-capability distinction in 2026 forecasting. Capability predictions (will the technology exist?) are different from deployment predictions (will real customers use it?). We conflated the two in the 2025-W42 Operator call and the 2024-W42 Optimus call. Both wrong calls came from this conflation.
From 2026-W01 forward we will tag predictions as Capability, Deployment, or Combined. Combined calls require both that the technology exists and that a named-buyer deployment lands. Deployment calls require buyer-side adoption at scale, even if the technology has existed for a year. This will tighten our framing and make verification clearer.
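The tagging rule above can be sketched as a small verification check. This is an illustration only, not the editorial team's actual tooling: the names `Tag`, `Evidence`, and `is_verified` and the three evidence flags are assumptions introduced here to make the distinction concrete.

```python
from dataclasses import dataclass
from enum import Enum


class Tag(Enum):
    CAPABILITY = "Capability"
    DEPLOYMENT = "Deployment"
    COMBINED = "Combined"


@dataclass(frozen=True)
class Evidence:
    """Hypothetical evidence flags gathered at verification time."""
    technology_exists: bool      # the technology has shipped
    named_buyer_deployed: bool   # a named buyer has a deployment in place
    buyers_at_scale: bool        # buyer-side adoption has reached scale


def is_verified(tag: Tag, ev: Evidence) -> bool:
    """Apply the requirement that matches the prediction's tag."""
    if tag is Tag.CAPABILITY:
        # Capability: will the technology exist?
        return ev.technology_exists
    if tag is Tag.DEPLOYMENT:
        # Deployment: will real customers use it, at scale?
        return ev.buyers_at_scale
    # Combined: the technology exists AND a named-buyer deployment landed.
    return ev.technology_exists and ev.named_buyer_deployed


if __name__ == "__main__":
    # A product that shipped but has not been adopted at scale.
    shipped_not_adopted = Evidence(
        technology_exists=True,
        named_buyer_deployed=False,
        buyers_at_scale=False,
    )
    for tag in Tag:
        print(tag.value, is_verified(tag, shipped_not_adopted))
```

The same evidence passes under a Capability tag and fails under a Deployment or Combined tag, which is the conflation the new tags are meant to expose.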
The full scorecard
We maintain a live /track-record page on this site. The page reads the same manifest the editorial team uses internally and renders every call with its original publication date, verification target, status, and the verifying post. You can audit our work in real time.
The aggregate numbers for the program to date:
Total verifiable calls: 48. Verified: 41 (85%). Partial: 4 (8%). Wrong: 3 (6%). Percentages are rounded to the nearest whole number, so they sum to 99%.
Distribution against target. We committed to 70% verified, 20% partial, 10% wrong. We are running too hot on verified, which we read as evidence of overly conservative call selection rather than a perfect forecasting record. We will publish more aggressive calls in 2026 to bring the distribution closer to target. A program that never gets a call wrong is not pushing hard enough on the unknown.
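The comparison against target is simple arithmetic on the counts. Below is a minimal sketch using the figures stated in this post (41, 4, and 3 of 48, against a 70/20/10 target). The names `COUNTS`, `TARGET`, `shares`, and `gaps_in_points` are hypothetical and are not drawn from the /track-record manifest.

```python
from fractions import Fraction

# Counts and target distribution as stated in this post.
COUNTS = {"verified": 41, "partial": 4, "wrong": 3}
TARGET = {
    "verified": Fraction(70, 100),
    "partial": Fraction(20, 100),
    "wrong": Fraction(10, 100),
}


def shares(counts):
    """Return each status as an exact fraction of all graded calls."""
    total = sum(counts.values())
    return {status: Fraction(n, total) for status, n in counts.items()}


def gaps_in_points(counts, target):
    """Actual share minus target share, in percentage points."""
    actual = shares(counts)
    return {
        status: float((actual[status] - target[status]) * 100)
        for status in target
    }


if __name__ == "__main__":
    actual = shares(COUNTS)
    for status, gap in gaps_in_points(COUNTS, TARGET).items():
        pct = float(actual[status]) * 100
        print(f"{status:>8}: {pct:.2f}% actual, {gap:+.2f} points vs target")
```

Using exact fractions keeps rounding out of the comparison until display. A positive gap means the status is running above target, a negative gap means below.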
What this means for the Gulf
Three reads close the year.
The Gulf has won the 2024-2025 sovereign-AI capital cycle decisively. The capital is here, the labs are anchoring here, the inference infrastructure is being built here. The window we called in 2024-W01 is open and the operators positioning inside it now are still early.
The frontier-model question through 2026 is no longer about scaling to capability. It is about deployment-grade behavior at scale. The firms that win the next eighteen months are the ones that can take an Opus-class model and turn it into a banking, healthcare, or government workflow that runs in production for years. Zanii has been arguing this from W01 of 2024. The 2025 record validates the posture.
The wrong calls matter. The two we missed in H2 are both deployment calls that we framed as capability calls. The lesson is that deployment timelines in regulated GCC sectors run at their own pace, not at the pace of the technology. We are recalibrating.
The full 2026 forecast lands in 2025-W52. We will publish the new ten-call methodology and the first set of calls under it. The live /track-record page remains the canonical source of truth. You can grade us. You should.