AI Radar — 04 May 2026

7 items · 7 verified · 0 secondary · 0 rumor · 16 sources · 49% exploration

Manus ships always-on cloud VMs; Mistral adds a multi-step Work Mode; Microsoft’s agent governance layer reaches GA; Anthropic opens security scanning to enterprise; xAI cuts Grok prices while launching voice cloning.

Run: 29 Apr – 04 May 2026 · 23 items reviewed → 7 published · 7 verified · 0 secondary · 0 rumor · 49% exploration



Items

Manus launches Cloud Computer for 24/7 autonomous bot and script hosting

Source: https://manus.im/blog/manus-cloud-computer · Manus · 2026-04-30

Verification: T2 verified · announcement · workflow-automation

Manus released Cloud Computer on 30 April 2026, a managed persistent Ubuntu virtual machine accessible via SSH or a web terminal. Users describe objectives in natural language — or provide existing scripts — and Manus handles environment setup, dependencies, and 24/7 uptime without requiring users to manage a server. Three tiers (Basic, Standard, Advanced) scale CPU, memory, and storage to the workload; specific pricing amounts are not yet published on the primary blog. Documented use cases include always-on Slack/Discord/Telegram bots, scheduled MySQL reporting, web scrapers, self-hosted open-source tools (Home Assistant, Metabase, WordPress), and command-line developer tooling.

Why it matters for automation/productivity: Teams that want persistent background automation — price monitors, data pipelines, webhook listeners — can deploy in minutes without cloud infrastructure expertise. The no-code path lowers the barrier for non-engineers to run autonomous agents continuously.

Caveats: Beta access; independent reporting flags instability under load, a one-session-per-day limit during beta, pricing transparency gaps (pricing page returning errors at time of research), and occasional hallucination of statistics in agent outputs. Not yet recommended for production workflows requiring reliability guarantees.


Mistral ships Medium 3.5 and Le Chat Work Mode for multi-step agentic tasks

Source: https://mistral.ai/news/vibe-remote-agents-mistral-medium-3-5 · Mistral AI · 2026-05-02

Verification: T2 verified · announcement · model-release / workflow-automation

Tier nuance: SWE-Bench and τ²-Telecom scores are vendor-claimed; standard reasoning benchmarks (MMLU, GPQA, AIME, HumanEval) were not published at launch, so comparative performance claims are downgraded to T4.

Mistral released Medium 3.5 on 2 May 2026, a 128B dense model with a 256k context window that merges instruction-following, reasoning, and coding into one set of weights. The model is available via API at $1.50 per million input tokens and $7.50 per million output tokens, and becomes the default model in Le Chat. Alongside the model, Mistral launched Work Mode (preview) in Le Chat, a multi-step agentic mode that chains tools — email, calendar, Slack, web search, internal documents — until a complex task is completed, with approval gates for sensitive operations. Remote coding agents in Vibe (Mistral’s cloud IDE) run asynchronously against GitHub repositories, opening pull requests without keeping a local session open.
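To make the quoted API prices concrete, the following is a minimal cost-estimation sketch. It uses only the two per-million-token prices stated above; the function name and the example workload (a 256,000-token prompt with a 4,000-token reply) are illustrative assumptions, not figures from Mistral.

```python
from decimal import Decimal

# Prices quoted in the item above, in USD per million tokens.
INPUT_USD_PER_MTOK = Decimal("1.50")
OUTPUT_USD_PER_MTOK = Decimal("7.50")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated API cost in USD for one request or batch."""
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * INPUT_USD_PER_MTOK
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


# Hypothetical workload (an assumption for illustration): one prompt of
# 256,000 tokens that returns a 4,000-token answer.
full_context_call = estimate_cost_usd(256_000, 4_000)
print(f"${full_context_call:.3f}")  # prints "$0.414"
```

Under these assumptions a single near-full-context call costs roughly 41 cents, with input tokens dominating the bill despite the lower per-token rate.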

Why it matters for automation/productivity: Work Mode is a direct competitor to ChatGPT Operator and Claude computer use for cross-tool agentic task completion; the open-weight 128B model also allows self-hosted deployment for teams with data-residency constraints.

Caveats: Vendor-claimed benchmarks only; MMLU, GPQA, AIME, HumanEval gaps noted by independent reviewers including Pedro Domingos (University of Washington). BenchLM shows 2 of 178 tracked benchmarks published at launch.


Microsoft Agent 365 reaches general availability at $15/user/month

Source: https://www.microsoft.com/en-us/security/blog/2026/05/01/microsoft-agent-365-now-generally-available-expands-capabilities-and-integrations/ · Microsoft Security Blog · 2026-05-01

Verification: T2 verified · announcement · ai-for-business / agent-framework

Microsoft Agent 365 reached general availability on 1 May 2026, priced at $15 per user per month as a standalone offering or bundled within Microsoft 365 E7 (the Frontier Suite) at $99 per user per month. The product is a governance and security control plane, not an agent builder: it extends Entra, Purview, and Defender to cover AI agents alongside human users, providing discovery, policy enforcement, and audit trails. New at GA: Windows 365 for Agents (purpose-built cloud PCs for agentic workloads, in public preview); local agent discovery via Microsoft Defender for OpenClaw, GitHub Copilot CLI, and Claude Code; and multi-cloud registry sync with AWS and Google Cloud (in public preview). Enhanced Defender capabilities, including runtime blocking and custom threat detection for agents, are scheduled for June 2026.

Why it matters for automation/productivity: Organizations deploying multiple AI agents across vendors — Copilot, Claude Code, Genspark, Zendesk AI — can now centrally inventory and govern them from one dashboard, reducing shadow-AI risk and audit overhead.


Anthropic opens Claude Security to public beta for enterprise codebase scanning

Source: https://claude.com/blog/claude-security-public-beta · Anthropic · 2026-04-30

Verification: T2 verified · announcement · dev-tools

Anthropic moved Claude Security into public beta on 30 April 2026, making it available to Claude Enterprise customers, with Team and Max access planned. The tool scans entire repositories or targeted directories for vulnerabilities by reasoning over data flows and component interactions — rather than relying solely on pattern matching — using Claude Opus 4.7 as the underlying model. Results include confidence ratings, severity assessments, and reproduction steps; targeted patch suggestions are accessible via Claude Code. Integration partners at launch include CrowdStrike, Palo Alto Networks, SentinelOne, Trend Micro, and Wiz, with webhook support for Slack and Jira export.

Why it matters for automation/productivity: Security teams can run scheduled codebase audits and receive patch suggestions without manually triaging every finding; the Opus 4.7 reasoning capability allows it to identify multi-file dependency vulnerabilities that static-analysis tools typically miss.

Caveats: LLM-based scanning is non-deterministic — the same codebase may produce different findings on successive runs. Source-code analysis cannot assess runtime exploitability; infrastructure, authentication, and user-behavior vectors remain outside scope. Vendor claims “hundreds of organizations” used the research preview (T4 — no independent telemetry).


xAI releases Grok 4.3 with 40% lower pricing and launches voice-cloning API

Source: https://x.com/xai/status/2050355373052223585 · @xai (official) · 2026-05-02

Verification: T2 verified · announcement · model-release / productivity-ai

xAI released Grok 4.3 on 30 April 2026 and followed with the Voice Cloning API on 2 May 2026. Grok 4.3 carries a 1M-token context window and is priced at $1.25 per million input tokens and $2.50 per million output tokens, which is 37.5% lower on input and 58.3% lower on output than Grok 4.20, according to independent analysis by Artificial Analysis. The Voice Cloning API allows developers to create a custom voice in under two minutes from roughly one minute of speech, with a two-stage speaker-verification process; the library ships with 80+ pre-built voices across 28 languages. No additional charge applies to text-to-speech or voice agent API calls when using custom voices.
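As a consistency check on the percentages, the sketch below backs out the previous prices implied by the new prices and the stated reductions. The resulting Grok 4.20 figures are derived arithmetic only, not prices quoted by xAI or by this issue, and the function name is illustrative.

```python
def implied_previous_price(new_price: float, reduction_pct: float) -> float:
    """Back out the old price from a new price and a percentage reduction.

    If new = old * (1 - r), then old = new / (1 - r).
    """
    return new_price / (1 - reduction_pct / 100)


# New prices and reductions as stated in the item (USD per million tokens).
implied_input = implied_previous_price(1.25, 37.5)
implied_output = implied_previous_price(2.50, 58.3)

# Implied, not confirmed: these are what the percentages require the
# earlier prices to have been.
print(f"implied previous input price:  ${implied_input:.2f}")   # $2.00
print(f"implied previous output price: ${implied_output:.2f}")  # $6.00
```

The two reductions differ substantially, so any single headline percentage for the price cut depends on the input-to-output token mix of the workload.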

Why it matters for automation/productivity: The price cuts make Grok 4.3 competitive for high-throughput agentic workloads; the voice API opens a low-friction path to building voice agents, audiobook narration, and localized customer-service bots without a separate TTS vendor.


Claude Code v2.1.126 ships model gateway support and project management tools

Source: https://github.com/anthropics/claude-code/releases · Anthropic / GitHub · 2026-05-01

Verification: T2 verified · changelog · dev-tools

Anthropic released Claude Code v2.1.126 on 1 May 2026. The release adds model gateway support: when ANTHROPIC_BASE_URL points to an Anthropic-compatible gateway, the /model picker now lists models from that gateway’s /v1/models endpoint, enabling teams to route to self-hosted or private models. A new claude project purge [path] command deletes all Claude Code state — transcripts, tasks, file history, config — with dry-run and interactive flags. OAuth login improvements allow users to paste codes in the terminal when a browser callback cannot reach localhost, removing friction in WSL2, SSH, and container environments. On Windows, PowerShell 7 is now detected correctly across MSI, Microsoft Store, and .NET global tool installs and treated as the primary shell when enabled.
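Before pointing Claude Code at a gateway, a team may want to check what the gateway's /v1/models endpoint returns. The sketch below is one way to do that, under two loudly stated assumptions: that the gateway mirrors the list-response shape of Anthropic's Models API (a `data` array of objects with an `id`), and that it accepts the `x-api-key` and `anthropic-version` headers. A given gateway may use a different schema or auth scheme. The helper names are hypothetical and this is not how Claude Code itself is implemented.

```python
import json
import os
import urllib.request


def models_url(base_url: str) -> str:
    """Join a gateway base URL with the /v1/models path named in the release note."""
    return base_url.rstrip("/") + "/v1/models"


def extract_model_ids(payload: dict) -> list[str]:
    """Pull model IDs out of a list response.

    Assumes the shape {"data": [{"id": "..."}, ...]}; entries without an
    "id" field are skipped rather than raising.
    """
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def list_gateway_models(base_url: str, api_key: str) -> list[str]:
    """Fetch and parse the model list from the gateway.

    The headers are an assumption; some gateways expect a bearer token.
    """
    request = urllib.request.Request(
        models_url(base_url),
        headers={"x-api-key": api_key, "anthropic-version": "2023-06-01"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return extract_model_ids(json.load(response))


if __name__ == "__main__":
    base_url = os.environ.get("ANTHROPIC_BASE_URL")
    api_key = os.environ.get("ANTHROPIC_API_KEY")
    # Only touch the network when both variables are set.
    if base_url and api_key:
        for model_id in list_gateway_models(base_url, api_key):
            print(model_id)
```

Keeping URL construction and response parsing in separate pure functions makes both easy to test without a live gateway.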

Why it matters for automation/productivity: The model gateway integration lets development teams swap in private or compliant-region models without changing tooling; the project purge command addresses a practical data hygiene need for teams running Claude Code across multiple client projects.


Lukilabs releases Craft Agents OSS, a modular open-source agent framework

Source: https://github.com/lukilabs/craft-agents-oss · Lukilabs / GitHub · 2026-05-02

Verification: T2 verified · announcement · agent-framework

Lukilabs published Craft Agents OSS on 2 May 2026 under the Apache License 2.0. The repository describes a modular framework for building AI agents and multi-agent workflows, positioned as a community-oriented, commercially permissive starting point for developers experimenting with agent orchestration. The project reached GitHub Trending shortly after publication, signaling early developer interest. Technical documentation at launch is minimal; the framework’s maturity relative to established alternatives (LangGraph, CrewAI, AutoGen) cannot be assessed from the public README alone.

Why it matters for automation/productivity: Apache 2.0 licensing removes friction for teams wanting to embed an agent framework in a commercial product without GPL-style obligations; worth tracking if adoption signals (GitHub stars + commits) hold over the next 30 days.

Caveats: Early-stage project. Stars without sustained commit activity indicate bookmarking, not adoption. Independent evaluation of framework quality is not yet available.


Conflicts surfaced

Mistral Medium 3.5 benchmark coverage gap

Mistral published SWE-Bench Verified (77.6%) and τ²-Telecom (91.4%) on launch day. All other standard reasoning benchmarks (MMLU, GPQA, AIME, HumanEval, MATH) were absent from launch materials.

Source | Claim | Tier
Mistral primary blog | 77.6% SWE-Bench Verified | T4 (vendor-run benchmark)
BenchLM.ai (independent) | 2 of 178 tracked benchmarks published | T2 (independent evaluation tracker)
Pedro Domingos commentary | Selective benchmark publication notable | T3 (credentialed academic, opinion)

Weighted synthesis: Vendor benchmark is T4 for comparative purposes. Independent benchmark tracker (T2) confirms the gap. Claim reported as “vendor-claimed, unverified by independent benchmark” throughout this issue.


Dropped

Items considered but not published, with reason:

Title considered | Source | Reason
GPT-5.5 general rollout | openai.com/index/introducing-gpt-5-5/ | April 23, 2026; outside expanded window
Claude Opus 4.7 launch | anthropic.com/news | April 16, 2026; outside expanded window
NVIDIA Ising quantum AI models | nvidianews.nvidia.com | April 14, 2026; outside expanded window
Novo Nordisk + OpenAI partnership | cnbc.com/2026/04/14 | April 14, 2026; outside expanded window
Cursor 3 agent-first IDE | cursor.com/blog/cursor-3 | April 2, 2026; outside expanded window
Microsoft Agent Framework 1.0 GA | devblogs.microsoft.com | April 3, 2026; outside expanded window
NVIDIA OpenShell (GTC launch) | developer.nvidia.com/blog | March 16, 2026; outside expanded window
OpenAI on Amazon Bedrock | aws.amazon.com/about-aws/whats-new/2026/04 | April 28, 2026; outside expanded window (5-day limit: Apr 29)
Amazon Bedrock AgentCore MCP GA | aws.amazon.com/about-aws/whats-new/2026/02 | February 2026; outside expanded window
Claude Mythos Preview | red.anthropic.com/2026/mythos-preview/ | April 7, 2026; outside expanded window; restricted access
MCP 97M installs milestone | ai2.work/blog/model-context-protocol | March 2026 milestone; outside expanded window
Crawl4AI RAG MCP server | github.com/coleam00/mcp-crawl4ai-rag | May 2025, not May 2026; wrong year, confirmed via primary
BrowserStack MCP server | pulsemcp.com | May 2025, not May 2026; wrong year, confirmed via primary
Gemini 3.1 Ultra | Various | Could not find primary source confirming this model exists; search results contradictory
ElevenMusic launch | musically.com/2026/04/30 | April 30, consumer music product; limited BD relevance for workflow automation audience
Anthropic $900B valuation round | techcrunch.com/2026/04/30 | Financial news, no product or capability angle


Search log (compact)

Q: "anthropic.com/news" (fetch) → no May 1-4 posts visible
Q: "openai.com/news/" (fetch) → 403
Q: "deepmind.google/discover/blog/" (fetch) → no May 1-4 posts visible
Q: "AI announcement May 2026 model release agent framework" → 10 results, 4 high-relevance
Q: "site:x.com OpenAI Anthropic Google AI announcement May 2026" → 10 results, 3 high-relevance [exploratory]
Q: "GPT-5.5 release date May 2026 OpenAI announcement" → 10 results, 3 high-relevance
Q: "Gemini 3.1 Ultra launch May 2026 Google" → 10 results, 1 high-relevance (no Ultra confirmed)
Q: "Mistral Le Chat agentic Work mode May 2026 128B model" → 10 results, 5 high-relevance
Q: "MCP Model Context Protocol 97 million installs 2026" → 10 results, 3 high-relevance (March milestone, out of window)
Q: "mistral.ai/news/vibe-remote-agents-mistral-medium-3-5" (fetch) → primary, no explicit date
Q: "AI news May 1 2 3 4 2026 new release launch" → 10 results, 6 high-relevance [exploratory]
Q: "Anthropic Claude update May 2026" → 10 results, 4 high-relevance
Q: "new AI model release May 1 2 3 4 2026" → 10 results, 3 high-relevance [exploratory]
Q: "coaio.com/news/2026/05/breaking-tech-news-on-may-1-2026" (fetch) → May 1 roundup [exploratory]
Q: "red.anthropic.com/2026/mythos-preview/" (fetch) → April 7, outside window
Q: "Microsoft Agent 365 generally available May 2026" → 10 results, 6 high-relevance
Q: "Claude Security beta launch date May 2026" → 10 results, 6 high-relevance
Q: "Claude Code release notes May 2026 ultrareview" → 10 results, 4 high-relevance
Q: "claude.com/blog/claude-security-public-beta" (fetch) → primary confirmed, Apr 30
Q: "x.ai/news/grok-custom-voices" (fetch) → 403
Q: "AI startup funding announcement May 2026" → 10 results, 2 high-relevance [exploratory]
Q: "AI Indonesia startup berita terbaru Mei 2026" → 10 results, 1 high-relevance [exploratory/cross-language]
Q: "site:github.com topic:mcp-server stars:>100 pushed:>2026-05-01" → 10 results, 2 high-relevance [exploratory]
Q: "siliconangle.com/2026/04/30/anthropic-announces-claude-security" (fetch) → corroborating, Apr 30
Q: "Manus Cloud Computer launch May 2026" → 10 results, 5 high-relevance
Q: "NVIDIA Ising quantum AI model launch date May 2026" → 10 results, 2 high-relevance (April 14, out of window) [exploratory]
Q: "OpenAI Amazon Bedrock launch date May 2026" → 10 results, 3 high-relevance (April 28, out of window)
Q: "jls42.org/en/news/ia-actualites-02-may-2026" (fetch) → May 2 roundup [exploratory]
Q: "developer.nvidia.com/blog/run-autonomous-self-evolving-agents-more-safely-with-nvidia-openshell/" (fetch) → March 16, out of window [exploratory]
Q: "Cursor 3 launch release date May 2026" → April 2, out of window [exploratory]
Q: "github.com/anthropics/claude-code/releases" (fetch) → v2.1.126 confirmed May 1
Q: "xAI Grok custom voice cloning API launch May 2026" → 10 results, 6 high-relevance
Q: "Microsoft Agent Framework 1.0 release date May 2026" → April 3, out of window [exploratory]
Q: "manus.im/blog/manus-cloud-computer" (fetch) → primary confirmed, Apr 30
Q: "Novo Nordisk OpenAI partnership announcement date" → April 14, out of window [exploratory]
Q: "Grok 4.3 model pricing release date xAI May 2026" → 10 results, 5 high-relevance
Q: "microsoft.com/en-us/security/blog/2026/05/01/microsoft-agent-365" (fetch) → primary confirmed May 1
Q: "Hugging Face daily papers May 2026" → 10 results, 1 high-relevance (no dated papers) [exploratory]
Q: "artificialanalysis.ai/articles/xai-launches-grok-4-3" (fetch) → independent benchmark confirmed Apr 30 [exploratory]
Q: "Amazon Bedrock AgentCore MCP server gateway May 2026" → February 2026, out of window [exploratory]
Q: "Mistral Medium 3.5 benchmark independent evaluation May 2026" → 10 results, 4 high-relevance [exploratory/adversarial]
Q: "AI policy regulation government ruling May 2026" → 10 results, 1 in-window (TAKE IT DOWN Act, low BD relevance) [exploratory]
Q: "Hacker News AI agent automation May 2026" → 10 results, 1 high-relevance [exploratory]
Q: "site:x.com @AnthropicAI @OpenAI announcement May 2026" → 10 results, 2 high-relevance [exploratory]
Q: "agent framework launch May 2026 new release" → 10 results, 3 high-relevance [exploratory]
Q: "NVIDIA OpenShell open-source AI agent security sandbox May 2026" → 10 results, 4 high-relevance [exploratory]
Q: "Runway AI mobile app Android launch May 2026" → 10 results, 0 high-relevance (launch not confirmed) [exploratory]
Q: "arXiv AI research paper May 1 2 3 4 2026 practical" → 10 results, 0 dated to window [exploratory]
Q: "Claude Security criticism limitations vulnerability scanner 2026" → 10 results, 4 high-relevance [adversarial]
Q: "Manus Cloud Computer criticism problems limitations 2026" → 10 results, 4 high-relevance [adversarial]
Q: "new MCP server launch May 2026 site:pulsemcp.com OR site:mcp.so" → 10 results [exploratory]
Q: "pulsemcp.com/posts/newsletter-ai-search-ide-mania-official-mcp-registry" (fetch) → dates confirmed as May 2025, not 2026 [adversarial]
Q: "crawl4ai RAG MCP server coleam00 launch date May 2026 GitHub" → confirmed May 2025, dropped [adversarial]
Q: "ElevenLabs new feature launch May 2026" → ElevenMusic Apr 30, limited BD relevance [exploratory]
Q: "AI dev tools productivity new launch May 1 2 3 4 2026" → 10 results, 2 high-relevance [exploratory]

Total searches: 55, of which 27 exploratory or adversarial (49%).


Suggested next runs