07 May 2026

AI Radar — 07 May 2026

10 items · 5 verified · 5 secondary · 0 rumor · 16 sources · 41% exploration

Anthropic’s Code with Claude week: managed agents gain self-improvement capabilities; OpenAI refreshes the default ChatGPT model; Sierra closes a $950M enterprise AI round; and regulators on two continents weigh in on Mythos.

Run: 04–07 May 2026 · 28 items reviewed → 10 published · 5 verified · 5 secondary · 0 rumor · 41% exploration · Run timestamp: 2026-05-07

TL;DR

Claude Managed Agents — three new capabilities: Outcomes (goal-defined iteration), Dreaming (session-based self-improvement), and multi-agent orchestration, in public beta or research preview as of May 6. (→ Claude Managed Agents)
TinyFish free Search/Fetch — web Search and Fetch APIs now free for all AI agents, including MCP-native access with no credit card required. (→ TinyFish)
OpenAI GPT-5.5 Instant — replaces GPT-5.3 Instant as the default ChatGPT model; AIME score rises from 65.4 to 81.2 per vendor benchmarks; rolling out across all plans. (→ GPT-5.5 Instant)
Anthropic + SpaceX Colossus — 220,000+ NVIDIA GPUs added via SpaceX Colossus 1 deal; Claude Code five-hour limits doubled immediately for paid plans. (→ SpaceX Compute Deal)
Mythos dual scrutiny — EU officials seek access to test European banks using Anthropic’s unreleased Mythos model; the White House is weighing a pre-release AI model review working group. (→ Mythos Policy)

Items

Claude Managed Agents adds Dreaming, Outcomes, and multi-agent orchestration

Source: https://claude.com/blog/new-in-claude-managed-agents · Anthropic · 2026-05-06
Verification: T2 verified · announcement · workflow-automation

Anthropic published three additions to Claude Managed Agents on 6 May 2026 at its Code with Claude developer conference. Outcomes (public beta) lets developers write a success rubric; a separate grader model evaluates the agent’s output in its own context window and routes failed outputs back for another attempt — internal tests show up to 10 percentage points of task-success improvement, with per-format gains of 8.4 pp on .docx and 10.1 pp on .pptx (vendor-measured, no independent replication found). Multi-agent orchestration (public beta) enables a lead agent to spawn specialist subagents with distinct models, prompts, and tools running in parallel on a shared filesystem, with full trace visibility in Claude Console. Dreaming (research preview) schedules overnight reviews of past agent sessions to surface recurring mistakes and shared workflow patterns, updating memory automatically or queuing changes for developer approval.
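
The Outcomes pattern described above (rubric, separate grader, retry on failure) can be sketched in a few lines. This is a minimal illustration of the control flow only, not Anthropic's API: `run_with_rubric`, `Verdict`, `toy_agent`, and `toy_grader` are hypothetical names, and the agent and grader are stand-in functions rather than model calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Verdict:
    passed: bool
    feedback: str


def run_with_rubric(
    agent: Callable[[str, Optional[str]], str],
    grader: Callable[[str, str], Verdict],
    task: str,
    rubric: str,
    max_attempts: int = 3,
) -> Tuple[str, int, bool]:
    """Retry the agent until the grader passes the output or attempts run out."""
    feedback: Optional[str] = None
    output = ""
    for attempt in range(1, max_attempts + 1):
        output = agent(task, feedback)
        # The grader receives only the output and the rubric, mirroring
        # the idea of grading in a separate context.
        verdict = grader(output, rubric)
        if verdict.passed:
            return output, attempt, True
        feedback = verdict.feedback  # routed back for the next attempt
    return output, max_attempts, False


# Toy stand-ins (assumptions for illustration; real ones would call models).
def toy_agent(task: str, feedback: Optional[str]) -> str:
    draft = f"Report on {task}"
    if feedback:
        draft += "\n\nSummary: key points listed."
    return draft


def toy_grader(output: str, rubric: str) -> Verdict:
    if "Summary:" in output:
        return Verdict(True, "")
    return Verdict(False, "Add a 'Summary:' section.")


result, attempts, ok = run_with_rubric(
    toy_agent, toy_grader, "Q1 sales", "Must include a Summary section"
)
print(attempts, ok)
```

In this toy run the first draft fails the rubric and the second passes, so the loop stops after two attempts. A cap such as `max_attempts` is one way to bound cost when the grader never passes an output.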

Why it matters for automation/productivity: Outcomes removes manual prompt-iteration cycles from agentic pipelines; orchestration enables parallelising specialist tasks that previously required sequential tool calls; Dreaming provides a mechanism for deployed agents to improve on a team’s specific workload over time without retraining.
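
As a rough picture of the orchestration point, the sketch below fans subtasks out to parallel workers that write into one shared directory, then has the lead collect the results. It uses only the Python standard library; `run_subagent` and `lead_agent` are hypothetical names, and the subagents are plain functions rather than Claude Managed Agents calls.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Dict


def run_subagent(name: str, prompt: str, workspace: Path) -> Path:
    """Hypothetical specialist: writes its result into the shared workspace."""
    out = workspace / f"{name}.txt"
    out.write_text(f"[{name}] handled: {prompt}\n")
    return out


def lead_agent(subtasks: Dict[str, str]) -> Dict[str, str]:
    """Run all subtasks in parallel, then read results from the shared folder."""
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp)
        with ThreadPoolExecutor(max_workers=max(1, len(subtasks))) as pool:
            futures = [
                pool.submit(run_subagent, name, prompt, workspace)
                for name, prompt in subtasks.items()
            ]
            for future in futures:
                future.result()  # re-raises any subagent error
        return {p.stem: p.read_text() for p in sorted(workspace.glob("*.txt"))}


results = lead_agent({"research": "find sources", "writer": "draft summary"})
for name, text in results.items():
    print(name, "->", text.strip())
```

The shared directory stands in for the shared filesystem mentioned in the announcement; each worker writes to its own file so parallel writes do not collide.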

Key claims:

Up to 10 pp task-success improvement with Outcomes → vendor-internal benchmark, no independent replication found
Harvey: completion rate increased approximately 6× → vendor-cited customer result, no independent verification
Wisedocs: 50% faster document reviews → vendor-cited customer result, no independent verification

Cross-references:

https://sdtimes.com/ai/new-in-claude-managed-agents-dreaming-outcomes-and-multiagent-orchestration/ (T3, corroborating)
https://thenewstack.io/anthropic-managed-agents-dreaming-outcomes/ (T3, corroborating)

TinyFish opens web Search and Fetch to all AI agents at no cost

Source: https://www.tinyfish.ai/blog/search-and-fetch-are-now-free-for-every-agent-everywhere · TinyFish · 2026-05-04
Verification: T2 verified · announcement · mcp-ecosystem

TinyFish removed the paywall from its web Search and Fetch APIs on 4 May 2026. The free tier allows 5 search queries per minute and 25 URL fetches per minute across REST API, MCP server, Python and TypeScript SDKs, and CLI, with no credit card required. The underlying infrastructure is a custom Chromium fleet with JavaScript rendering, parallel execution, and bot-detection handling built in, making it suitable for pages that resist simpler HTTP scrapers. The MCP server and SDKs make the endpoints directly reachable from Claude, Cursor, Claude Code, Codex, and any MCP-compatible agent without additional integration work.
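
An agent that wants to stay inside the stated free-tier limits (5 searches and 25 fetches per minute) could throttle itself on the client side. The sketch below is one way to do that with a sliding-window counter; it makes no network calls and assumes nothing about TinyFish's SDK. `SlidingWindowLimiter` is a hypothetical helper, and how the service itself counts requests is not documented here.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any rolling period_s seconds."""

    def __init__(self, max_calls, period_s=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period_s = period_s
        self.clock = clock
        self._stamps = deque()  # timestamps of recent accepted calls

    def wait_time(self):
        """Seconds until the next call would be allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.period_s:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            return 0.0
        return self.period_s - (now - self._stamps[0])

    def try_acquire(self):
        """Record a call if the budget allows it; return whether it was allowed."""
        if self.wait_time() > 0:
            return False
        self._stamps.append(self.clock())
        return True


# Budgets taken from the free-tier figures quoted above.
search_limiter = SlidingWindowLimiter(max_calls=5)
fetch_limiter = SlidingWindowLimiter(max_calls=25)

if search_limiter.try_acquire():
    print("ok to send a search request")
else:
    print(f"wait {search_limiter.wait_time():.1f}s before searching")
```

The injectable `clock` keeps the limiter testable without sleeping; a caller that prefers to block can sleep for `wait_time()` and retry.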

Why it matters for automation/productivity: Agent workflows that previously required a paid search or web-scraping subscription can now access search and fetch without spend, removing a cost barrier for both prototyping and low-volume production deployments.

Key claims:

5 queries/min search, 25 URLs/min fetch on free tier → TinyFish vendor blog (T2)
Compatible with Claude, Cursor, Claude Code, Codex, and all MCP-compatible clients → TinyFish vendor blog (T2)

OpenAI replaces ChatGPT’s default model with GPT-5.5 Instant

Source: https://techcrunch.com/2026/05/05/openai-releases-gpt-5-5-instant-a-new-default-model-for-chatgpt/ · TechCrunch citing OpenAI · 2026-05-05
Verification: T3 secondary · announcement · model-release
Tier note: Primary URL openai.com/index/gpt-5-5-instant/ returned HTTP 403 during this run. TechCrunch article is based on OpenAI-provided data. Benchmark scores are vendor-reported.

OpenAI swapped GPT-5.3 Instant for GPT-5.5 Instant as the default ChatGPT model on 5 May 2026. In vendor evaluations, the new model scores 81.2 on AIME 2025 (up from 65.4 for its predecessor) and 76 on MMMU-Pro multimodal reasoning (up from 69.2). OpenAI also claims fewer hallucinations in law, medicine, and finance, but disclosed no test methodology. The rollout began for Plus and Pro users on web, with memory and context-management features included; Free, Go, Business, and Enterprise users follow in the coming weeks. In the API the model is available as chat-latest; GPT-5.3 remains as a paid API option for three months.
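
To put the vendor-reported scores side by side, the short calculation below derives the absolute and relative gains from the four figures quoted above. It is plain arithmetic on those numbers and adds no new measurements; the `gains` helper is introduced here for illustration.

```python
# Vendor-reported scores quoted above: (GPT-5.3 Instant, GPT-5.5 Instant)
scores = {
    "AIME 2025": (65.4, 81.2),
    "MMMU-Pro": (69.2, 76.0),
}


def gains(old, new):
    """Return (absolute gain in points, relative gain in percent), rounded to 0.1."""
    absolute = round(new - old, 1)
    relative_pct = round(100 * (new - old) / old, 1)
    return absolute, relative_pct


for name, (old, new) in scores.items():
    absolute, relative_pct = gains(old, new)
    print(f"{name}: +{absolute} points ({relative_pct}% relative)")
# AIME 2025: +15.8 points (24.2% relative)
# MMMU-Pro: +6.8 points (9.8% relative)
```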

Why it matters for automation/productivity: Teams using chat-latest in production pipelines receive the upgrade automatically; the AIME and MMMU-Pro score increases suggest improved performance on structured reasoning tasks, though independent replication was not available at the time of this run.

Key claims:

AIME 2025: 81.2 vs 65.4 for predecessor → vendor-reported, no independent replication found
MMMU-Pro: 76 vs 69.2 → vendor-reported, no independent replication found
Replaces GPT-5.3 Instant as default → vendor announcement (confirmed via secondary)

Caveats: Benchmark scores are from OpenAI’s own evaluation; no third-party replication published at launch. Hallucination-reduction claim appeared in press materials without a disclosed test methodology.

Anthropic launches ten financial services agent templates with Claude Opus 4.7

Source: https://www.anthropic.com/news/finance-agents · Anthropic · 2026-05-05
Verification: T2 verified · announcement · productivity-ai / ai-for-business

Anthropic released ten ready-to-run agent templates for financial services on 5 May 2026, covering tasks including pitchbook construction, KYC file screening, earnings review, financial model building, general ledger reconciliation, month-end closing, statement auditing, and valuation review. All templates run on Claude Opus 4.7, which scored 64.37% on the Vals AI Finance Agent benchmark (Vals AI is a third-party benchmark operator). Eight new data connectors became available at launch: Dun & Bradstreet, Fiscal AI, Financial Modeling Prep, Guidepoint, IBISWorld, SS&C IntraLinks, Third Bridge, and Verisk. Microsoft 365 add-ins for Excel, PowerPoint, and Word reached general availability; a Claude for Outlook add-in entered beta. Moody’s released an MCP server that makes credit ratings and company financial data accessible directly from Claude.

Why it matters for automation/productivity: The templates lower integration cost for finance teams automating research-heavy or compliance-adjacent workflows; the M365 add-ins allow Excel and PowerPoint users to invoke Claude without leaving familiar tooling; the Moody’s MCP connection adds a verified data source to agentic financial workflows.

Key claims:

64.37% on Vals AI Finance Agent benchmark → Vals AI (third-party benchmark operator), cited in Anthropic primary
M365 add-ins for Excel, PowerPoint, Word GA; Outlook in beta → Anthropic primary (T2)
8 data connectors at launch → Anthropic primary (T2)

Anthropic secures SpaceX Colossus compute, doubles Claude Code rate limits

Source: https://www.anthropic.com/news/higher-limits-spacex · Anthropic · 2026-05-06
Verification: T2 verified · announcement · model-release / ai-for-business

Anthropic agreed on 6 May 2026 to use the full capacity of SpaceX’s Colossus 1 data center in Memphis, Tennessee. The deal provides access to over 220,000 NVIDIA GPUs and over 300 megawatts of capacity, available within the month. Immediate effects for subscribers: Claude Code’s five-hour rate limits are doubled for Pro, Max, Team, and seat-based Enterprise plans; peak-hour throttling is removed for Pro and Max accounts; and API rate limits for Claude Opus models are increased (no specific percentage disclosed in the announcement). Anthropic also indicated interest in jointly developing orbital computing capacity with SpaceX as a future extension.

Why it matters for automation/productivity: Teams hitting Claude Code limits during long agentic runs get double the five-hour allowance without a plan upgrade; Opus API users should see less queuing during high-demand periods, which reduces production latency for agentic applications.

Key claims:

220,000+ NVIDIA GPUs at Colossus 1 → Anthropic primary; independently confirmed by NVIDIA’s official post
300+ megawatts capacity → Anthropic primary
Claude Code 5-hour limits doubled for Pro/Max/Team/Enterprise → Anthropic primary; effective immediately at announcement

Cross-references:

https://x.com/nvidia/status/2052091408643756296 (T2 — NVIDIA official account confirming GPU count)
https://www.bloomberg.com/news/articles/2026-05-06/anthropic-inks-computing-deal-with-spacex-to-meet-ai-demand (T2, corroborating — paywalled)

Unity launches Unity AI into open beta with an MCP Server for external coding agents

Source: https://80.lv/articles/unity-launches-in-editor-ai-tools-suite-in-beta · 80.lv · 2026-05-04
Verification: T3 secondary · announcement · dev-tools / mcp-ecosystem
Tier note: Primary unity.com blog post was not accessible. 80.lv coverage and Unity community forum discussion corroborate the same launch. Date cited by 80.lv is May 4; Unity forum activity begins approximately May 2.

Unity released Unity AI into open beta for Unity 6 and above around 4 May 2026. The suite includes an AI Assistant aware of the project’s live scene hierarchy and component state; a Generators tool for asset creation from text prompts; an AI Gateway that routes requests to third-party frontier models from within the editor; and an MCP Server that exposes the Unity scene graph to external coding agents including Claude Code, Cursor, Windsurf, and Antigravity. Pricing is $10 per month for 1,000 AI credits after a 14-day, 1,000-credit free trial.

Why it matters for automation/productivity: The MCP Server enables an agent running in Claude Code or Cursor to inspect and modify a Unity scene without being embedded inside the Unity Editor, expanding the reach of external AI coding tools into game-development workflows.

Key claims:

$10/month for 1,000 AI credits → vendor pricing (T2 via secondary)
MCP Server exposes scene graph to Claude Code, Cursor, Windsurf, Antigravity → vendor documentation (T2 via secondary)
Available for Unity 6 and above → vendor announcement (T2 via secondary)

Caveats: Primary unity.com announcement page was not accessible; date uncertainty of approximately 2 days (May 2 or May 4). Integration depth of the MCP Server beyond scene-graph reads has not been independently assessed.

Cross-references:

https://discussions.unity.com/t/unity-ai-s-open-beta-now-live-for-unity-6/1718560 (T3, corroborating — community forum)
https://gamesbeat.com/unity-launches-unity-ai-into-open-beta/ (T3, corroborating)

Saperly launches as dedicated phone carrier for AI agents

Source: https://saperly.com/ · Saperly · 2026-05-05
Verification: T3 secondary · announcement · mcp-ecosystem
Tier note: Primary source is the vendor product page. No independent tech press coverage was found as of this run. Discovery via practitioner diffusion on X (multiple accounts sharing within 24h).

Saperly launched on or around 5 May 2026 as a carrier infrastructure layer for AI agents that need a persistent, compliant phone identity. An agent provisioned through Saperly receives a stable phone number, voice and SMS routing, and a consistent caller ID across outbound calls, so agent orchestration code does not have to handle telecom API complexity. The company ships an MCP server as its primary integration interface, making phone capabilities available to any MCP-compatible agent without additional telecom SDK work.

Why it matters for automation/productivity: Agents that need to make or receive calls — scheduling, verification, customer-service automation — can now acquire a stable, compliant phone identity through an MCP integration rather than building carrier-layer handling into the agent’s own codebase.

Caveats: Product is newly launched; reliability, pricing, and SLA details are not publicly documented. No enterprise adoption evidence exists at this stage. Independent evaluation of the MCP server is not yet available.

Sierra closes $950M round at $15B valuation

Source: https://siliconangle.com/2026/05/04/ai-agent-startup-sierra-valued-15b-new-950m-funding-round/ · SiliconAngle · 2026-05-04
Verification: T3 secondary · funding · ai-for-business / agent-framework

Sierra, the enterprise AI agent platform founded in 2023 by Bret Taylor and Clay Bavor, raised $950 million at a $15 billion valuation on 4 May 2026. The round was led by Alphabet’s GV and Tiger Global, with Benchmark, Sequoia, and Greenoaks participating. Sierra reports $150 million in annual recurring revenue and claims adoption by “nearly half the Fortune 50” (both figures vendor-stated, no independent confirmation). The platform includes an Agent SDK, Agent Studio for code-free agent development, and pre-packaged connectors to third-party data sources; it runs across more than 15 open-source and proprietary models.

Why it matters for automation/productivity: The valuation and ARR figures, if accurate, reflect enterprise willingness to spend on managed agent infrastructure; independent reviews note high setup costs ($50k–$200k) and 3–6 month deployment timelines as counterweights to the adoption claims.

Key claims:

$950M raised, $15B valuation → SiliconAngle citing company and investor statements; corroborated by TechCrunch and CNBC
$150M ARR → vendor-stated, no independent confirmation
“nearly half the Fortune 50” → vendor-stated, no independent confirmation

Caveats: ARR and penetration figures are vendor-reported without audited verification. Independent review aggregators document typical customer costs around $150k per year, 3–6 month deployments, and opaque outcome-based billing that can make ROI modeling difficult before signing.

Cross-references:

https://techcrunch.com/2026/05/04/sierra-raises-950m-as-the-race-to-own-enterprise-ai-gets-serious/ (T3, corroborating)
https://www.cnbc.com/2026/05/04/bret-taylor-sierra-fundraise-openai.html (T3, corroborating)

Anthropic, Blackstone, and Goldman Sachs form a new enterprise AI services company

Source: https://www.anthropic.com/news/enterprise-ai-services-company · Anthropic · 2026-05-04
Verification: T2 verified · announcement · ai-for-business

Anthropic announced the formation of a new AI services company on 4 May 2026, co-founded with Blackstone, Hellman & Friedman, and Goldman Sachs. General Atlantic, Leonard Green, Apollo Global Management, GIC, and Sequoia Capital are backing investors. The firm targets mid-sized organizations — community banks, regional manufacturers, regional health systems — that need Claude deployment expertise but lack the internal AI engineering capacity for implementation. Anthropic’s Applied AI engineers will work alongside the new company’s delivery teams to identify high-impact use cases and build Claude-powered solutions. No launch date, pricing, or geographic scope was disclosed.

Why it matters for automation/productivity: For mid-market enterprises without dedicated AI teams, a vendor-backed implementation partner reduces the upfront engineering barrier to deploying Claude; the lack of pricing transparency at launch makes cost comparison with general consulting partnerships impossible at this stage.

Mythos AI draws scrutiny from EU officials and the White House

Source: https://www.resultsense.com/news/2026-05-05-white-house-pre-release-ai-vetting/ · ResultSense · 2026-05-05; https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos-previews-cyber-capabilities · UK AISI (capabilities assessment)
Verification: T3 secondary · policy / investigation · policy-regulation
Tier note: Bloomberg EU-angle article (May 5) is behind a paywall; ResultSense and a White House spokesperson non-denial constitute the accessible US record. EU Economy Commissioner statements sourced from secondary coverage of the Bloomberg article.

Anthropic’s Mythos model, still in restricted testing and not publicly released, drew regulatory attention from two directions in the May 4–7 window. The European Commission and EU cybersecurity agency ENISA were scheduled to examine Mythos risks in the European Parliament on 6 May; Anthropic withdrew at short notice. EU Economy Commissioner Valdis Dombrovskis confirmed ongoing talks with Anthropic about giving European banks access to Mythos for cybersecurity resilience testing, citing concern that US entities gain earlier visibility into AI-discovered vulnerabilities. Separately, the White House is considering an executive order establishing a pre-release AI model review working group, with Mythos’s reported cyber capabilities cited as a catalyst; a White House spokesperson did not deny the substance but described specific executive order discussions as “speculation.” The UK AI Security Institute (AISI) published an evaluation finding that Mythos Preview can autonomously execute multi-stage attacks on vulnerable networks and discover zero-day vulnerabilities in open-source codebases within hours, tasks that expert penetration testers said would take days.

Why it matters for automation/productivity: If a pre-release review process is established, organizations planning frontier-model integrations may face additional lead time before new models reach API availability. The EU’s interest in testing banks with Mythos suggests a near-term policy use case for offensive AI capabilities in regulated industries.

Key claims:

Mythos Preview executes multi-stage attacks on vulnerable networks autonomously → UK AISI (T1)
Mythos discovers zero-day vulnerabilities in open-source codebases within hours → UK AISI (T1)
EU Economy Commissioner confirmed talks with Anthropic about Mythos testing access → secondary coverage (T3)
White House considering pre-release review working group → ResultSense (T3); White House spokesperson non-denial

Caveats: Mythos has not been publicly released; no API, no pricing, no release date announced. AISI evaluation is of a restricted preview version; production capabilities may differ. The White House executive order remains deliberative, not signed.

Dropped

Items considered but not published, with reason:

Title considered	Source	Reason
Claude Security public beta	anthropic.com/news (Apr 30)	Already covered in 2026-05-04 radar
Microsoft Agent 365 GA	microsoft.com/security/blog (May 1)	Already covered in 2026-05-04 radar
Mistral Medium 3.5 + Work Mode	mistral.ai/news (Apr 29)	Already covered in 2026-05-03 and 2026-05-04 radars
Manus Cloud Computer	manus.im/blog (Apr 30)	Already covered in 2026-05-04 radar
xAI Grok 4.3 + Voice Cloning API	x.ai/news (Apr 30–May 2)	Already covered in 2026-05-04 radar
Claude Code v2.1.126 gateway support	github.com/anthropics/claude-code/releases (May 1)	Already covered in 2026-05-04 radar
Lukilabs Craft Agents OSS	github.com/lukilabs/craft-agents-oss (May 2)	Already covered in 2026-05-04 radar
Code Review for Claude Code	claude.com/blog/code-review (Mar 9)	Outside 72h window
CI auto-fix / Preview-review-merge	claude.com/blog/preview-review-and-merge (Feb 20)	Outside 72h window
Claude Code Routines	claude.com/blog/introducing-routines-in-claude-code (Apr 14)	Outside 72h window; showcased at conference but not newly launched
Gmail Gemini AI features	blog.google/gmail-is-entering-the-gemini-era (Jan 8)	Outside 72h window
GLM-5.1 open-weight model by Z.ai	huggingface.co/zai-org/GLM-5.1 (Apr 7)	Outside 72h window
ARIS autonomous research paper	huggingface.co/papers/2605.03042 (May 6)	No shipping product; low BD actionability
OpenSeeker-v2 search agent paper	huggingface.co/papers/2605.04036 (May 6)	No shipping product; low BD actionability
Anthropic Orbit proactive assistant	testingcatalog.com (unconfirmed leak)	Unannounced feature; T5 — no official source
DeepSeek V4 models	Various aggregators	No primary source with confirmed May 4–7 date found
L Suite Lloyd legal AI tool	finance.yahoo.com (May 6)	T4 only at time of research; no primary source accessible
White House National AI Policy Framework	whitehouse.gov (Dec 2025)	Outside 72h window

Limitations

Sources unreachable: openai.com/index/gpt-5-5-instant/ returned HTTP 403; Bloomberg EU-Mythos article returned HTTP 403 (paywalled). Both items are covered via secondary sources with tier downgrade noted inline.
Login-walled coverage: X timelines as a logged-in user, Instagram, LinkedIn private feeds, and Discord were not accessed. Public X posts indexed by search engines were captured; NVIDIA’s official X post was used as a corroborating source for the SpaceX compute deal GPU count.
Unity AI date uncertainty: Primary unity.com announcement page was not accessible. Two secondary sources cite May 4, 2026; Unity community forum activity begins approximately May 2. Date listed as approximately May 4 with a ±2-day caveat noted inline.
Claude Code conference features: Multiple Claude Code capabilities demonstrated at Code with Claude (May 6) — Code Review, CI auto-fix, Remote Agents — had primary source dates of February–March 2026, prior to this window; they are listed in Dropped as outside the 72h window. The conference itself does not constitute a new launch date for previously-published features.
Saperly — no independent coverage: The Saperly launch was identified via practitioner diffusion on X only; no independent tech press article was found as of this run. Item listed at T3 secondary with caveats noted.
Mythos model status and access: Mythos has not been publicly released; no API access, pricing, or release date has been announced. AISI evaluation is of a restricted preview version; production capabilities may differ from the evaluated model.
Vendor benchmark claims unverified: Claude Managed Agents Outcomes improvement (up to 10 pp), Claude Opus 4.7 Vals AI Finance score (64.37%), and OpenAI GPT-5.5 AIME/MMMU-Pro scores are all vendor-reported or vendor-cited; independent benchmark replication was not available at the time of this run.
Geographic bias: US/EU coverage dominates this window. Indonesian-language search and SEA-region source searches yielded no in-window items from local AI vendors or regional deployments. The gap persists as a structural limitation.
No research-papers items this window: ARIS and OpenSeeker-v2 papers (both May 6 on HuggingFace) were assessed but had low immediate BD actionability with no shipping products behind them; dropped.
Window notes: The strict 72h window (May 4–7) yielded 10 verified or secondary items; no window expansion was required.
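
The window rule used for the Dropped list can be expressed as a simple date filter. The sketch below assumes the window is checked on inclusive calendar dates (May 4 through May 7, 2026); the radar's actual cutoff logic is not documented here, and `in_window` and the item label "Claude Managed Agents update" are names chosen for illustration. The dates are the ones listed in this report.

```python
from datetime import date

# Assumption: inclusive calendar-date bounds for the run window.
WINDOW_START = date(2026, 5, 4)
WINDOW_END = date(2026, 5, 7)


def in_window(published, start=WINDOW_START, end=WINDOW_END):
    """True if the primary-source date falls inside the run window."""
    return start <= published <= end


# Primary-source dates as listed in this report.
candidates = {
    "Claude Managed Agents update": date(2026, 5, 6),
    "Code Review for Claude Code": date(2026, 3, 9),
    "Claude Code Routines": date(2026, 4, 14),
}

kept = [title for title, published in candidates.items() if in_window(published)]
dropped = [title for title in candidates if title not in kept]
print("kept:", kept)
print("dropped (outside window):", dropped)
```

Filtering on the primary-source date rather than the conference date is what keeps features that were merely showcased at Code with Claude out of the published items.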

Search log (compact)

Query	Yield	Type
Anthropic Claude announcement May 2026	10 results, 8 high-rel	registry
OpenAI announcement release May 2026	10 results, 6 high-rel	registry
Google DeepMind Gemini release May 2026	10 results, 4 high-rel	registry
anthropic.com/news/finance-agents (fetch)	primary confirmed May 5	registry
anthropic.com/news/enterprise-ai-services-company (fetch)	primary confirmed May 4	registry
OpenAI GPT-5.5 release May 5 2026 site:openai.com	8 results, 5 high-rel	registry
openai.com/index/gpt-5-5-instant/ (fetch)	HTTP 403 — inaccessible	registry
techcrunch.com GPT-5.5 Instant (fetch)	T3 secondary confirmed May 5	registry
Code with Claude conference announcements May 6 2026	10 results, 7 high-rel	registry
simonwillison.net/2026/May/6/code-w-claude-2026/ (fetch)	T3 liveblog confirmed May 6	registry
claude.com/code-with-claude (fetch)	conference details confirmed	registry
Claude Managed Agents Outcomes Dreaming Orchestration May 6 2026	8 results, 7 high-rel	registry
claude.com/blog/new-in-claude-managed-agents (fetch)	primary confirmed May 6	registry
Anthropic SpaceX Colossus computing deal May 2026	10 results, 8 high-rel	registry
anthropic.com/news/higher-limits-spacex (fetch)	primary confirmed May 6	registry
github.com/anthropics/claude-code/releases (fetch)	v2.1.128–v2.1.132 May 4–6	registry
AI agent framework launch release May 2026	10 results, 5 high-rel	exploratory
new AI startup launch funding May 2026 NOT Anthropic NOT OpenAI	10 results, 4 high-rel	exploratory
siliconangle.com Sierra funding (fetch)	T3 confirmed May 4	exploratory
huggingface.co/papers (fetch)	top papers May 6 identified	exploratory
AI Indonesia startup peluncuran Mei 2026	10 results, 0 in-window	exploratory/cross-lang
MCP Model Context Protocol new release May 2026	10 results, 3 high-rel	exploratory
Anthropic Orbit proactive assistant Claude Cowork May 2026	10 results, 2 high-rel	exploratory/adversarial
AI policy regulation executive order May 2026	10 results, 3 high-rel	exploratory
Anthropic Mythos cybersecurity EU testing May 2026	10 results, 8 high-rel	exploratory
resultsense.com White House AI vetting (fetch)	T3 confirmed May 5	exploratory
aisi.gov.uk Mythos evaluation (search)	T1 confirmed	exploratory
site:x.com AnthropicAI Code with Claude SpaceX May 2026	10 results, 6 high-rel	exploratory
new AI tool launch May 4-7 2026 NOT Anthropic NOT OpenAI	10 results, 3 high-rel	exploratory
Saperly phone carrier AI agents MCP May 2026	8 results, 3 high-rel	exploratory
saperly.com (fetch)	vendor product page May 5	exploratory
TinyFish web search fetch free AI agents May 2026	8 results, 5 high-rel	exploratory
tinyfish.ai/blog/search-fetch-free (fetch)	primary confirmed May 4	exploratory
Unity AI open beta May 2026 site:unity.com	8 results, 4 high-rel	exploratory
claude.com/blog (fetch)	2 posts in May 4–7 window	registry
site:claude.com CI auto-fix code review remote agents May 2026	8 results, 5 high-rel	registry
claude.com/blog/code-review (fetch)	March 9 — outside window	adversarial
claude.com/blog/preview-review-and-merge (fetch)	February 20 — outside window	adversarial
Sierra AI agent startup review criticism May 2026	10 results, 4 high-rel	adversarial
AI benchmark criticism independent evaluation May 2026	10 results, 3 high-rel	adversarial
Mistral xAI Grok new model release May 2026	10 results, 3 high-rel	registry

Total searches: 41, of which 17 exploratory or adversarial (41%).

Suggested next runs

Claude Managed Agents Outcomes — independent benchmark — No third-party replication of the 10 pp task-success improvement claim at this run; worth a lateral search once independent eval labs publish.
Mythos regulatory track — Both EU and US processes are moving; the next two weeks will clarify whether the White House executive order progresses to a draft and whether Anthropic grants EU access for bank testing.
Sierra production adoption evidence — $150M ARR and near-half Fortune 50 claims are vendor-only at this stage; independent customer case studies or ARR verification from a named investor would upgrade this item.
Unity AI MCP Server depth — Scene-graph read/write depth of the MCP integration has not been independently assessed; worth a hands-on evaluation for teams building Claude Code workflows that touch game assets.