Agent News
Editorial coverage of launches, infrastructure shifts, interface upgrades, and agent tooling worth tracking. Follow the story, then jump straight into the software directory and agent profiles behind each headline.
OpenAI’s GPT-5.6 Sol is real, and its first rollout is going through Washington
OpenAI has officially previewed GPT-5.6 Sol, Terra, and Luna. The launch starts with trusted partners at the U.S. government’s request, putting frontier-model release policy inside the product story.
GPT-5.6 shows the new frontier-model problem: release speed is becoming policy
Reports say OpenAI may stage GPT-5.6 access after U.S. government security concerns. The bigger issue is whether frontier-model safety review becomes a release bottleneck for U.S. labs while open competitors keep shipping.
GPT-5.5 Instant is getting better at the messy questions people actually ask
OpenAI says a new GPT-5.5 Instant version is rolling out to paid users on June 24 and free users on June 25, with improvements to intent recognition, constraint following, conversational adaptation, shopping, and local recommendations.
OpenClaw 2026.6.10 makes the assistant feel faster without loosening the guardrails
OpenClaw 2026.6.10 is a runtime-quality release: fast mode for short conversational turns, tighter Zai and GLM routing, safer session and channel state, preserved trusted policies, and a provider onboarding fix.
OpenAI’s first chip is about inference economics, not just independence from Nvidia
OpenAI and Broadcom unveiled Jalapeño, OpenAI’s first custom AI inference chip. The key question is not only whether it reduces GPU dependence, but whether it can lower the cost and latency of serving ChatGPT, Codex, the API, and future agent products.
Vercel Launches eve: The "Next.js for Agents" Is Here, and It's Open Source
Vercel's new filesystem-first framework treats every AI agent as a directory of files, bundling durable execution, sandboxed compute, and multi-channel deployment into a single open-source package.
OpenClaw 2026.6.9: 422 PRs of Telegram Delivery, Agent Recovery, and Codex Integration
OpenClaw's latest stable release improves Telegram HTML delivery, agent session recovery, Codex plugin approvals, and makes provider plugins standalone npm packages.
Hermes Agent v0.17 pushes agent work beyond the terminal
Hermes Agent v0.17.0 expands the open-source agent runtime with iMessage via Photon, Raft, background subagents, image editing, dashboard profile building, automation templates, managed scope, and a broad security pass.
AgentRiot Adds Loop Directory: Share Your Agent Loops with the Community
A new directory for agent loops is live on AgentRiot. Operators can now post loops they want to share, browse community submissions, and discover reusable patterns.
Loop Engineering: The Complete Guide to Building Self-Improving AI Agents
Stop prompting your coding agents one shot at a time. Here is how to design loops that prompt them for you—and when the extra complexity is worth it.
GLM-5.2: The First Open-Weights Model to Close the Gap on Long-Horizon Coding
Z.AI releases a 744B-parameter MoE with 1M context, an MIT license, and benchmark scores that put it within single-digit points of Claude Opus 4.8 on long-horizon coding tasks.
Cursor Origin Moves the AI Coding Fight From the Editor to the Git Forge
Cursor announced Origin, a Git forge and code-hosting product for teams and AI agents, with a fall 2026 waitlist. The official launch page is sparse, but the surrounding keynote and docs show the strategy: Cursor wants to control review, conflicts, merge readiness, and repository workflow for agent-generated code.
OpenClaw v2026.6.8 Released: 373 Commits of Richer Channels, Safer Routing, and Reliable Agents
OpenClaw ships v2026.6.8 with 185 merged PRs and 373 commits. Highlights include richer Telegram and WhatsApp delivery, safer model routing with GLM-5.2 and Claude Haiku 4.5, more reliable agent execution, native usage footers, and improved memory resilience.
SpaceX Is Buying Cursor. The Target Is Codex.
SpaceX’s $60B Cursor deal is not just an AI coding acquisition. It gives xAI a real developer surface, a feedback loop, and a direct path to challenge OpenAI Codex.
Anthropic's Fable 5: The Safest Model Nobody Can Use
Anthropic launched Claude Fable 5, its most capable model ever, on June 9. Four days later, the US government forced a complete shutdown over a potential jailbreak. Here's what happened, why developers were already angry, and what it means for frontier AI deployment.
Kimi-K2.7-Code: Moonshot Open-Sources 1T Coding Model with Strong Agentic Gains
Moonshot AI released Kimi-K2.7-Code today, an open-weight 1T-parameter MoE model showing +21.8% on Kimi Code Bench v2, +11.0% on Program Bench, and +31.5% on MLS Bench Lite over K2.6, with 30% lower reasoning token usage.
OpenClaw v2026.6.6 Tightens Security Boundaries Across MCP, Codex, and Channel Delivery
OpenClaw released v2026.6.6 today with 48 commits focused on hardening security boundaries and improving channel reliability across Telegram, iMessage, browser automation, and MCP.
xAI Opens the Grok Build Plugin Marketplace with MongoDB, Vercel, and Chrome DevTools at Launch
xAI turns its terminal-based coding agent into an extensible platform, shipping a built-in marketplace with plugins from six major vendors: MongoDB, Vercel, Sentry, Chrome DevTools, Cloudflare, and Superpowers.
Claude Fable 5: The Backlash
Claude Fable 5 launched today. Within hours, the community was split between praising its capabilities and condemning its silent safeguards, subscription cliff, and burn rate. This is what the backlash looks like.
OpenClaw v2026.6.5 Ships with Channel Hardening, Provider Fixes, and New CalVer Numbering
OpenClaw's first monthly patch under the new YYYY.M.PATCH scheme fixes QQBot reasoning leaks, Matrix voice and thread handling, Anthropic extended-thinking recovery, MCP tool-result coercion, and moves auth and state into SQLite.
Claude Fable 5 and Claude Mythos 5: Anthropic Splits Its Frontier Tier
Anthropic shipped its first public Mythos-class model. Fable 5 is generally available with safety fallbacks; Mythos 5 is the same model with safeguards lifted, gated to Project Glasswing. Includes benchmarks, pricing, access windows, and independent skeptical testing.
Hermes Agent v0.16.0: The Surface Release Puts a Native Desktop App in Your Hands
Nous Research shipped 874 commits, 542 merged PRs, and a brand-new Electron desktop app in one week. Hermes v0.16 adds a native macOS/Linux/Windows GUI, a full web admin panel, leaner default skills, a fuzzy model picker, /undo, and Simplified Chinese support.
xAI Ships Grok Build 0.1: A Purpose-Built Coding Model Enters the Agentic Race
xAI's new grok-build-0.1 model is now available via API. 256K context, 100+ tok/s, and a 70.8% SWE-bench score. Here is what the benchmarks, reviews, and pricing actually say.
OpenClaw 2026.6.1 Ships Skill Workshop Governance, Workboard Orchestration, and SQLite State
OpenClaw 2026.6.1 introduces Skill Workshop for governed skill lifecycles, Workboard orchestration for multi-agent workflows, SQLite-backed state, bounded provider requests, and external Copilot/Tokenjuice plugin packaging. We also look at what's cooking in the 2026.6.2 beta.
Nous Research Drops Hermes Desktop: A Native App for the Self-Improving AI Agent
The open-source Hermes Agent, already running in terminals, Discord servers, and Telegram chats, now has a polished native desktop client. Public preview is live for macOS, Windows, and Linux with streaming chat, side-by-side previews, voice, and full config portability.
MiniMax M3 Drops: Open-Weight Model With 1M Context, Frontier Coding Scores, and a Price Tag That Undercuts Closed Rivals
MiniMax shipped M3 on June 1, 2026 — the first open-weight model to combine frontier coding, a 1-million-token context window, and native multimodality. Here's the full benchmark picture, pricing breakdown, and where to access it.
OpenAI Codex Evolves Into a Full Developer Workstation: Computer Use, Goals, and 90+ Plugins
OpenAI's Codex has shifted from a coding assistant to a persistent autonomous workstation. Between April and May 2026, the desktop app gained background computer use, an in-app browser, image generation, persistent memory, and 90+ plugins. The CLI added Goal Mode, Vim editing, MCP improvements, and a Python SDK with first-class auth. Here's what changed and what it means for developers.
Hermes Agent’s Velocity Release Turns the CLI Into a Multi-Agent Workbench
Hermes Agent v0.15.0, tagged v2026.5.28, is less about one flashy feature than a broader shift: a smaller agent core, stronger Kanban orchestration, faster local recall, promptware defenses, Bitwarden secrets, and a larger plugin surface.
Claude Opus 4.8 Is Anthropic’s New Agent Benchmark, With One Clear Caveat
Anthropic’s Claude Opus 4.8 release is less about a new chat personality and more about long-running agent work: stronger SWE-bench Pro results, better tool use, 1M-token context, mid-conversation system messages, cheaper fast mode, and Claude Code dynamic workflows.
OpenClaw 2026.5.26 Makes the Agent Gateway Faster, Safer, and Easier to Inspect
OpenClaw’s v2026.5.26 release is a production-focused May rollup: faster Gateway and reply paths, first-class transcript handling, better voice/Talk runtime state, safer content boundaries, steadier Codex/provider behavior, stronger channel reliability, and clearer observability for operators.
OpenClaw v2026.5.22: Performance Gains, Meeting Notes, and 100+ Fixes
OpenClaw's May 2026 release delivers major gateway performance improvements, a new Meeting Notes plugin with Discord voice support, expanded platform coverage, and over 100 bug fixes across agents, channels, and tooling.
Two AI Agent Security Incidents in One Week Show the Field's Growing Pains
TrapDoor hijacks AI coding assistants through supply chain malware. Composio gets breached via an internal AI agent. Here's what happened and what to do.
OpenClaw 2026.5.20 Ships Discord Voice Follow-Mode, Headless xAI OAuth, and a Security-First Policy Engine
OpenClaw dropped version 2026.5.20 on May 21, 2026. The release spans 208 commits and adds Discord voice session mobility, device-code xAI OAuth for headless setups, a bundled Policy plugin that catches plaintext secrets, and a full Android v2 overhaul.
xAI Brings Grok OAuth to Coding Agents and Personal Assistants
xAI is adding OAuth support to open-source agents. Your X Premium or SuperGrok subscription now works inside Hermes, OpenClaw, and OpenCode with a single login.
How OpenAI Actually Uses Codex Internally: 7 Workflows and the Rules That Make Them Work
OpenAI published a rare look at how its own engineers use Codex day-to-day. The PDF reveals seven specific workflows, direct quotes from engineers across six teams, and six prescriptive best practices that govern how the company treats its own AI coding tool.
Cursor Composer 2.5 Hits 63.2% on CursorBench for Just $0.55 Per Task
Cursor shipped Composer 2.5 with major intelligence and behavior improvements. It scores 63.2% on CursorBench 3.1 at an average cost of $0.55 per task, undercutting frontier models by 5x to 20x while delivering comparable performance.
Google Ships Gemini 3.5 Flash, Kills Gemini CLI, and Triples the Price
Google launched Gemini 3.5 Flash at I/O 2026 with strong benchmark numbers and a new Antigravity platform, but developers are angry about a 3x price hike, sky-high token usage, and the sudden sunset of Gemini CLI on June 18.
OpenClaw v2026.5.18: Real-Time Android Voice, Typed Tool Plugins, and a Faster Gateway
OpenClaw v2026.5.18 ships real-time voice sessions on Android, a new typed tool plugin SDK, faster gateway restarts, and a redesigned Mac settings experience. Here is what changed and why it matters.
Hermes Agent v0.14.0 Turns Grok Subscriptions Into Agent Infrastructure
Hermes Agent v0.14.0 is a wide-ranging installation and performance release, while xAI now lets Grok subscribers connect Grok 4.3, text-to-speech, and Grok Imagine directly inside Hermes.
Grok Build Is Here, But It Costs $300 a Month
xAI launched Grok Build, a terminal-based AI coding agent with plugins, subagents, and Claude Code compatibility. But at $300 per month behind the SuperGrok Heavy paywall, the pricing may kill its chances with everyday developers.
OpenClaw v2026.5.12: Leaner Installs, Resilient Telegram, and Smoother Codex
OpenClaw v2026.5.12 externalizes major dependencies, hardens Telegram polling, smooths Codex auth and MCP handling, and tightens plugin install reliability.
AI-Assisted Zero-Day Exploit Discovered in the Wild: What You Need to Know
Google's Threat Intelligence Group confirms cybercriminals used AI to discover and weaponize a real zero-day vulnerability, marking the first confirmed case of AI-assisted exploitation in the wild.
Open Design turns Claude Design’s artifact loop into an open-source local workflow
Open Design is a local-first, Apache-2.0 design studio that uses your existing coding-agent CLI, file-based skills, portable design systems, and a sandboxed artifact preview loop to generate pages, decks, apps, documents, and media.
Google’s rumored “Omni” model is a leak, not a launch
A leaked Gemini UI string points to a possible Google “Omni” video-generation model or feature, but Google has not officially announced it. Here is what is sourced, what is speculation, and what to watch at I/O 2026.
Hermes Agent v0.13.0 (2026.5.7): The Tenacity Release
Hermes Agent v0.13.0, published May 7, 2026, is the Tenacity Release: durable Kanban boards, the new /goal command, restart-resilient sessions, script-only cron watchdogs, Checkpoints v2, stronger default security, video analysis, voice cloning, Google Chat, provider plugins, and seven localized gateway and command-line message sets.
OpenClaw 2026.5.7 tightens channels, cron, voice, and supervised agent runs
OpenClaw 2026.5.7 is a maintenance-heavy release, but the details matter: clearer channel commands, more observable cron state, safer memory and command authorization, better Discord voice diagnostics, and fixes for Telegram, WhatsApp, coding-provider approvals, plugins, sessions, and model providers.
AgentRiot is live: a public index for agents, tools, prompts, and updates
AgentRiot is live as a public index for working agents, the tools they run on, reusable prompts, and the updates that show whether a project is moving.

