AI Radar

AI Radar — 09 May 2026

5 items 4 verified 1 secondary 0 rumor 14 sources 42% exploration

OpenAI adds voice reasoning and live translation to its Realtime API; Google ships Gemini 3.1 Flash-Lite to general availability; Coder launches self-hosted agent infrastructure for enterprise teams; Snyk embeds Claude for AI-native security; Harvey open-sources the first agent-native legal benchmark.

Run: 06–09 May 2026 · 28 items reviewed → 5 published · 4 verified · 1 secondary · 0 rumor · 42% exploration · Run timestamp: 2026-05-09


TL;DR


Items

OpenAI ships three voice API models with GPT-5-class reasoning, live translation, and streaming transcription

Source: https://openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api/ · OpenAI · 2026-05-07 Verification: T2 secondary · announcement · dev-tools / agent-framework Tier nuance: Primary URL returned HTTP 403 during this run. TechCrunch (T2) and The Next Web (T2) independently confirmed announcement date and full pricing.

OpenAI released three new models in its Realtime API on 7 May 2026. GPT-Realtime-2 brings GPT-5-class reasoning to live voice interactions, with a 128K context window (up from 32K), parallel tool calls, and adjustable reasoning-effort levels from low through xhigh. GPT-Realtime-Translate supports real-time speech translation across 70+ input languages into 13 output languages while keeping pace with the speaker. GPT-Realtime-Whisper provides streaming speech-to-text transcription as the speaker talks, priced at $0.017/minute.

Why it matters for automation/productivity: Customer service, IVR, and real-time interpretation workflows can now access a voice model that reasons mid-conversation, calls tools in parallel, and recovers from task failures — replacing the need to chain separate transcription, reasoning, and response models.

Key claims:

Cross-references:

Caveats: Realtime API is not HIPAA-eligible under the OpenAI Business Associate Agreement as of this run; Azure OpenAI text endpoints are eligible but the audio modality is not. Default reasoning effort is low; the Big Bench Audio benchmark was run at xhigh — production performance at default settings will differ. The Zillow call-success figure is a vendor-reported case study. Session drift occurs in long conversations; context trimming is required.


Coder launches self-hosted, model-agnostic enterprise agent platform in beta

Source: https://coder.com/blog/introducing-coder-agents · Coder · 2026-05-06 Verification: T2 verified · announcement · dev-tools / agent-framework

Coder released Coder Agents in beta on 6 May 2026, a platform for running AI coding workflows on self-hosted infrastructure. The platform supports any major AI model provider — Anthropic, OpenAI, Google, and AWS Bedrock, plus OpenAI-compatible self-hosted models — without sending source code or prompts to external services. It exposes a conversational interface and API, centralized model and prompt-policy management, MCP-based tool integrations, sub-agent orchestration, and network-isolated workspace provisioning. Full premium features are available at no cost through September 2026; post-beta pricing has not been announced.

Why it matters for automation/productivity: Organizations with strict code-governance or data-residency requirements can run coding agents entirely within their network perimeter. The MCP and sub-agent architecture allows existing internal tooling to connect to the agent workflow without custom integration code.

Key claims:

Cross-references:

Caveats: Beta — API may have breaking changes before GA. The 70% statistic on agent infrastructure mismatch is from Coder’s own unpublished research. Pricing after September 2026 not announced.


Google ships Gemini 3.1 Flash-Lite to general availability

Source: https://cloud.google.com/blog/products/ai-machine-learning/gemini-3-1-flash-lite-is-now-generally-available · Google Cloud · 2026-05-08 Verification: T2 verified · announcement · model-release

Google moved Gemini 3.1 Flash-Lite to general availability on 8 May 2026 via the Gemini API and Vertex AI, two months after its March 2026 preview. The model targets high-volume, latency-sensitive tasks: tool calling, content moderation, and orchestration steps in agentic pipelines. According to its preview announcement, it is priced at $0.25/1M input tokens and $1.50/1M output tokens, with 2.5× faster time-to-first-answer and 45% higher output speed compared to Gemini 2.5 Flash. Gladly, a customer service platform, reported approximately 60% lower costs versus thinking models and a p95 classifier-call latency under one second in production.

Why it matters for automation/productivity: At $0.25/1M input tokens, Flash-Lite is among the lower-cost options for high-volume steps in agentic pipelines — classification, routing, and lightweight orchestration where a full reasoning model would be cost-prohibitive.

Key claims:

Cross-references:

Caveats: Pricing cited from the March 2026 preview announcement; the GA blog post did not republish pricing — verify against current Google Cloud pricing page before committing to production. Independent FACTS benchmark score: 40.6% for Flash-Lite vs 50.4% for Gemini 3.0 Flash Dynamic — the model trades factual grounding for speed. Independent TTFT measured at 5.46 seconds (Artificial Analysis), above the median for its price tier at 2.02 seconds. Gladly case-study figures are vendor-reported.


Snyk embeds Claude into AI Security Platform for vulnerability discovery across code and AI artifacts

Source: https://www.helpnetsecurity.com/2026/05/08/snyk-ai-security-platform/ · Help Net Security (reporting on Snyk press release) · 2026-05-08 Verification: T2 secondary · announcement · dev-tools / ai-for-business Tier nuance: Snyk press release issued via Globenewswire; Help Net Security and Yahoo Finance both published May 8 coverage. Direct snyk.io primary not retrieved in this run.

Snyk announced on 8 May 2026 that it has embedded Anthropic’s Claude models across its AI Security Platform to power automated vulnerability discovery, prioritization, and developer-ready fix generation. Coverage extends across code, open-source dependencies, containers, and AI-generated artifacts. Evo by Snyk uses Claude to continuously discover AI assets — models, agents, MCP servers, datasets, and third-party tools — across an organization’s environment, enabling AI governance workflows alongside traditional SAST and SCA scanning. The integration is available to current joint Snyk-Anthropic customers; broader rollout continues through 2026.

Why it matters for automation/productivity: Development teams using Snyk alongside Claude Code or other AI coding assistants can now surface and remediate security issues within the same AI-assisted development loop. The MCP server discovery capability within Evo is particularly relevant for teams building or adopting MCP-connected tooling, where AI-generated artifact security is a known gap.

Key claims:

Cross-references:

Caveats: No independent benchmark comparing Snyk+Claude detection rates against other AI-powered security tools was available at launch. The current-customers-only launch means new customers may face enterprise sales cycles before access. Direct snyk.io primary not fetched — verification relies on press release distribution and trade press corroboration.


Source: https://www.harvey.ai/blog/introducing-harveys-legal-agent-benchmark · Harvey · 2026-05-06 Verification: T2 verified · announcement · research-papers / ai-for-business COI: Harvey is both the benchmark creator and a commercial legal AI vendor — potential bias in task selection and rubric design.

Harvey released the Legal Agent Benchmark (LAB) on 6 May 2026, in partnership with Artificial Analysis. LAB is an open-source benchmark on GitHub containing 1,200+ agent tasks spanning 24 legal practice areas — transactional, advisory, regulatory, and litigation — with 75,000+ expert-written rubric criteria. Each task mirrors law firm associate work: a short instruction averaging 50 words, a file environment mixing relevant and peripheral documents, and a required legal deliverable. Evaluation uses all-pass grading, where partial task completion scores zero. The initial release does not include a leaderboard; baseline results and submission standards are forthcoming.

Why it matters for automation/productivity: Legal teams evaluating AI agents for document drafting, contract review, or regulatory analysis now have a domain-specific benchmark for comparing models on tasks that reflect actual law firm conditions rather than generic reasoning tests. For AI vendors pursuing legal sector sales, LAB provides a credible evaluation surface beyond general benchmarks.

Key claims:

Cross-references:

Caveats: No leaderboard published at launch; baseline results and submission standards described as forthcoming. Harvey is a commercial legal AI vendor, representing a conflict of interest in benchmark design and rubric selection. Benchmark methodology was not independently peer-reviewed at launch.


Dropped

Items considered but not published, with reason:

Title consideredSourceReason
Meta Muse Spark LLM launchai.meta.com · 2026-04-08Outside window — April 8
Anthropic Project Glasswing / Claude Mythosanthropic.com · 2026-04-07Outside window — April 7
Microsoft Agent Framework 1.0 GAdevblogs.microsoft.com · 2026-04-03Outside window — April 3
Microsoft AI Diffusion 2026 Surveyblogs.microsoft.com · 2026-05-07Already covered in 2026-05-08 radar
AWS MCP Server GAaws.amazon.com · 2026-05-06Already covered in 2026-05-08 radar
NIST CAISI classified AI evaluationnextgov.com · 2026-05-05Already covered in 2026-05-08 radar
OpenAI ChatGPT Ads Manager self-serveopenai.com · 2026-05-05Already covered in 2026-05-08 radar
Claude Managed Agents (Dreaming, Outcomes, Orchestration)anthropic.com · 2026-05-06Already covered in 2026-05-07 radar
Anthropic / SpaceX Colossus compute dealanthropic.com · 2026-05-06Already covered in 2026-05-07 radar
Anthropic Claude usage limit expansionanthropic.com · 2026-05-06Same blog as SpaceX deal; covered in 2026-05-07 radar
MCP 2026 Roadmapblog.modelcontextprotocol.io · 2026-03-09Outside window — March 9
OpenAI GPT-5.5 Instant new ChatGPT defaultopenai.com · 2026-05-05Already covered in 2026-05-07 radar
xAI Grok 4.3 API GAx.ai · API rollout 2026-04-30API rollout complete April 30 (before window); May 6 user email was notification of already-shipped change
Google Gemini 3.1 Ultradate unconfirmedNo primary source with confirmed May 6–9 publication date found; excluded pending confirmation
Jama Connect MCP Serverglobenewswire.com · 2026-05-04Outside window — May 4
Opsera-Cursor DevSecOps Agents partnershipsdtimes.com · 2026-05-08Low standalone BD signal; no primary from opsera.io; adjacent to Coder Agents item
Precisely Data Integrity Suite MCP APIssdtimes.com · 2026-05-08Enterprise data tool; low BD signal for general audience; no primary from precisely.com
Hugging Face WAVe 1B multimodal modelhuggingface.co · 2026-05-05Outside window — May 5
Microsoft Azure App Service + MAF integration blogtechcommunity.microsoft.com · 2026-05-08Original MAF shipped April 3; blog is secondary integration guidance, not a new product
Scout AI $100M Series A raisecrunchbase.com · May 2026Funding news; no primary from scoutai.com; low actionability
Rhoda AI FutureVision launchcrescendo.ai roundup · undatedIn-window date unconfirmed; no primary source found
Harvey $11B valuation funding roundharvey.ai · 2026-05-05Outside window — May 5; covered separately from LAB; funding news, low actionability
Snyk Studio (Claude Code security integration)snyk.io · 2026-02-23Outside window — February 2026; distinct from May 8 partnership announcement

Limitations


Search log (compact)

QueryYieldType
Anthropic Claude announcement May 202610 results, 5 high-relregistry
OpenAI GPT release announcement May 202610 results, 6 high-relregistry
Google DeepMind Gemini AI announcement May 202610 results, 4 high-relregistry
OpenAI GPT-Realtime voice models API release May 7 202610 results, 8 high-relregistry
MCP Model Context Protocol new server release May 202610 results, 3 high-relregistry
fetch: openai.com advancing-voice-intelligenceHTTP 403registry
fetch: techcommunity.microsoft.com MAF 1.0partial content onlyregistry
AI agent framework launch May 8 9 202610 results, 3 high-relexploratory
AI dev tools coding assistant update Cursor Claude Code May 202610 results, 4 high-relexploratory
fetch: sdtimes.com May 8 AI updatesconfirmed Coder Agents, Snykexploratory
fetch: blog.modelcontextprotocol.io 2026-mcp-roadmapMarch 9 — outside windowregistry
new AI model release May 8 9 202610 results, 3 high-relexploratory
Meta Muse Spark LLM launch date May 2026confirmed April 8 — outside windowexploratory
Anthropic Project Glasswing Claude Mythos launch date May 2026confirmed April 7 — outside windowregistry
fetch: llm-stats.com/llm-updatesGrok 4.3 API rollout confirmed April 30exploratory
fetch: thenextweb.com openai-gpt-realtime-2-voice-modelsT2 confirmed May 8 dateregistry
Gemini 3.1 Flash-Lite release date Google May 2026confirmed GA May 7–8registry
fetch: cloud.google.com/blog gemini-3-1-flash-lite GAT2 confirmed May 8registry
fetch: blog.google/innovation-and-ai gemini-3-1-flash-liteMarch 3 preview — pricing confirmedregistry
Coder Agents beta launch enterprise self-hosted May 8 2026confirmed May 6 via globenewswireexploratory
AI announcement OR AI launch May 9 202610 results, 2 high-relexploratory
fetch: anthropic.com/newsSpaceX / rate limits May 6 (covered)registry
AI Indonesia startup model rilis Mei 202610 results, 0 in-window local launchesexploratory — cross-language
OpenAI realtime voice GPT limitations criticism May 2026HIPAA gap, default reasoning confirmedadversarial
Gemini 3.1 Flash-Lite independent benchmark review criticism May 2026FACTS 40.6%, TTFT 5.46sadversarial
Coder Agents self-hosted enterprise limitation May 2026beta API instability confirmedadversarial
fetch: techcrunch.com OpenAI voice May 7 2026T2 confirmed May 7 date and pricingregistry
Snyk Claude Anthropic integration partnership launch date May 2026confirmed May 8exploratory
Harvey Legal Agent Benchmark Artificial Analysis launch date 2026confirmed May 6exploratory
fetch: harvey.ai/blog introducing-harveys-legal-agent-benchmarkT2 confirmed May 6exploratory
fetch: snyk.io/articles anthropic-launches-claude-code-securityFebruary 2026 — wrong articleexploratory
xAI Grok 4.3 release date exact May 2026API rollout April 30; email May 6registry
AI productivity tools launch release May 8 9 202610 results, 2 high-relexploratory
fetch: crescendo.ai/news latest-ai-news-and-updatesMay 4 cutoff — no May 6–9 itemsexploratory
GitHub trending AI repositories week May 2026discovery only — no primary itemsexploratory
AI startup funding launch product May 8 9 202610 results, 2 high-relexploratory
Product Hunt AI launch week May 8 202610 results, 1 high-relexploratory
xAI Grok 4.3 API pricing capabilities agentic May 2026rollout April 30 confirmedregistry

Total searches: 38, of which 16 exploratory or adversarial (42%).


Suggested next runs