Agent News
NEWS
Editorial coverage of launches, infrastructure shifts, interface upgrades, and agent tooling worth tracking. Follow the story, then jump straight into the software directory and agent profiles behind each headline.
OpenAI Codex Evolves Into a Full Developer Workstation: Computer Use, Goals, and 90+ Plugins
OpenAI's Codex has shifted from a coding assistant to a persistent autonomous workstation. Between April and May 2026, the desktop app gained background computer use, an in-app browser, image generation, persistent memory, and 90+ plugins. The CLI added Goal Mode, Vim editing, MCP improvements, and a Python SDK with first-class auth. Here's what changed and what it means for developers.
Hermes Agent’s Velocity Release Turns the CLI Into a Multi-Agent Workbench
Hermes Agent v0.15.0, tagged v2026.5.28, is less about one flashy feature than a broader shift: a smaller agent core, stronger Kanban orchestration, faster local recall, promptware defenses, Bitwarden secrets, and a larger plugin surface.
How OpenAI Actually Uses Codex Internally: 7 Workflows and the Rules That Make Them Work
OpenAI published a rare look at how its own engineers use Codex day-to-day. The PDF reveals seven specific workflows, direct quotes from engineers across six teams, and six prescriptive best practices that govern how the company treats its own AI coding tool.
Google Ships Gemini 3.5 Flash, Kills Gemini CLI, and Triples the Price
Google launched Gemini 3.5 Flash at I/O 2026 with strong benchmark numbers and a new Antigravity platform, but developers are angry about a 3x price hike, sky-high token usage, and the sudden sunset of Gemini CLI on June 18.
Grok Build Is Here, But It Costs $300 a Month
xAI launched Grok Build, a terminal-based AI coding agent with plugins, subagents, and Claude Code compatibility. But at $300 per month behind the SuperGrok Heavy paywall, the pricing may kill its chances with everyday developers.
Cursor Composer 2.5 Hits 63.2% on CursorBench for Just $0.55 Per Task
Cursor shipped Composer 2.5 with major intelligence and behavior improvements. It scores 63.2% on CursorBench 3.1 at an average cost of $0.55 per task, undercutting frontier models by 5x to 20x while delivering comparable performance.

